It was by all accounts a small problem, a little overheating last Monday in the electronic jungle that is the Technology Command Center for Delta Air Lines at its Atlanta headquarters. This minor overheating event (okay, “fire” if you insist) caused a nearby voltage-control module to spasm and allow a surge to hit a transformer, which immediately shut down the power supply. No worries, there’s an app for that. It’s called a switchgear, and its job is to sense a power failure and immediately switch the circuit to a backup power source.

The switchgear didn’t work.

Instantly, the world’s largest airline lost much of the computer network with which it tracks and controls its planes, employees, and ticketed passengers worldwide. Airplanes on the ground were stopped in place. Aircraft in the air landed at their destinations and parked. A thousand flights had to be cancelled, and tens of thousands of passengers were stranded in parked airplanes and airports. Another 500 flights were cancelled on Tuesday, and five days later the airline was still struggling toward normalcy.

This is hardly an unprecedented event:

In July, Southwest Airlines lost its network for 12 hours and had to cancel 2,300 flights over four days. The failure of a single router brought the system down, and it took 12 hours to reboot it.

In September of 2015, an American Airlines system glitch stopped its flights to and from its hubs at Chicago, Dallas and Miami.

In April of 2013, a national computer outage at American Airlines wiped out a third of its scheduled flights.

In August of 2012, United Airlines experienced a two-hour crash of its computer systems that affected 10 per cent of its flights.

Note that the frequency of these events has gone from fewer than one a year to two so far this year. (Remember that each event costs the airline tens of millions of dollars, and who knows what it costs the passengers?) Does anybody know why? Of course. Everybody involved knows why:

Each airline’s system was built in the 1990s. One of the basic assumptions was that it could be shut down at night for repairs and maintenance. Now the systems are global and it’s always daytime somewhere. The system is like an airplane that can’t land, and has to be fixed and maintained in flight.

Since then, the numbers of passengers and aircraft have grown exponentially. Patches and add-ons have been required so that the software can contain, search, sort and otherwise manipulate ever larger masses of data.

There have been numerous mergers, requiring that individual, proprietary systems be modified so that they communicate and work with each other.

Outside systems and networks, such as Internet travel agents and ticket sellers, have demanded access to the airlines’ systems. Adding them to an oversized network while maintaining security has not been easy.

After 20 years of slapdash growth, Frankenstein grafts, temporary fixes, plug-ins, add-ons and extension cords, each airline has a system that is too big to fix, and increasingly prone to fail. You can’t fix it because it has to keep going, and it can’t keep going unless you fix it. You can’t afford to fix it, and you can’t afford not to.

This is Stage Four Technology Cancer, and it’s not affecting just the computer reservation and scheduling systems. A recent survey of maintenance personnel for the South American airline LATAM revealed increasing worry about the effects of money-saving cuts in the numbers and qualifications of maintenance personnel. Giving credence to those worries, another study shows that since 2010, fewer than two commercial airliners per year have crashed worldwide. That was a worse record than that of the previous several years. So far in 2016, three airliners have crashed. The numbers are small so far, but the trend is in the wrong direction.

This is what technology cancer does: it grows without restraint until it threatens the survival of its host. If its host is the sort who refuses to have surgery because it’s too inconvenient, death ensues.