Control Data Corp’s (CDC) first supercomputer, the CDC 6600, operated at a speed of three megaflops (10^6 floating-point operations per second). A half century on, our most powerful supercomputers are a billion times faster. But even that impressive mark will inevitably fall. Engineers are eyeing an exaflop (10^18 flops), and some think they’ll get there by 2018.

What’s so special about an exaflop? Other than the fact it’s a quintillion operations a second? Simple. We can always use more computing power.

Supercomputers enable scientists to model nature—protein folding, the Big Bang, Earth’s climate—as never before. China’s Tianhe-1A (2.57 petaflops) recently ran a 110 billion atom model through 500,000 steps. The model was a mere 0.116 nanoseconds in real time, and it took the machine three hours to complete.
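To put those figures in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above, plus one loudly labeled assumption: that the machine sustained its full 2.57 petaflops for the entire three hours. That makes the last figure an upper bound, not a measurement. The function names are illustrative.

```python
# Figures quoted in the article for the Tianhe-1A run.
ATOMS = 110e9                  # atoms in the model
STEPS = 500_000                # simulation steps
SIMULATED_SECONDS = 0.116e-9   # 0.116 nanoseconds of simulated time
WALL_CLOCK_SECONDS = 3 * 3600  # three hours of machine time
PEAK_FLOPS = 2.57e15           # 2.57 petaflops


def timestep_seconds(simulated=SIMULATED_SECONDS, steps=STEPS):
    """Simulated time covered by a single step."""
    return simulated / steps


def slowdown(wall=WALL_CLOCK_SECONDS, simulated=SIMULATED_SECONDS):
    """How many times slower than nature the simulation ran."""
    return wall / simulated


def flops_per_atom_step_upper_bound(
    peak=PEAK_FLOPS, wall=WALL_CLOCK_SECONDS, atoms=ATOMS, steps=STEPS
):
    """Operations available per atom per step.

    ASSUMPTION: the machine ran at its quoted speed for the whole run,
    so this is an upper bound rather than a measured value.
    """
    return peak * wall / (atoms * steps)


if __name__ == "__main__":
    print(f"time per step: {timestep_seconds():.3e} s")
    print(f"slowdown vs. real time: {slowdown():.3e}x")
    print(f"flops per atom per step (upper bound): "
          f"{flops_per_atom_step_upper_bound():.0f}")
```

Each step covers a fraction of a femtosecond, and the machine runs tens of trillions of times slower than the physics it models, which is why more speed is always welcome.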

Yet even the simplest natural systems have vastly more particles playing out over vastly greater timescales. There are roughly as many molecules in ten drops of water as there are stars in the universe.

So, while a petaflop is good, an exaflop is better.

Further, Henry Markram’s Blue Brain Project estimates a full simulation of the human brain would require about an exaflop. Might insights gleaned from such a simulation lead to breakthroughs in AI? Maybe.

Whether it leads to a breakthrough in AI, or a deeper understanding of the human brain, or is just a killer scientific model-maker, the first exaflop machine will be a data-processing beast. And world powers are gunning for it.

International competition for the top spot is as tight as it’s ever been. China knocked Cray’s Jaguar off the top of the pile with their Tianhe-1A in 2010 (2.57 petaflops). Then it was Japan’s turn to lead the pack with their K computer in 2011 (10.5 petaflops). And the US retook the lead with IBM’s Sequoia in 2012 (16.3 petaflops).

The pace is blistering. Today’s top speed (16.3 petaflops) is 16 times faster than its counterpart four years ago (1.04 petaflops). And Oak Ridge National Laboratory is converting its ex-champion Cray Jaguar into the 20-petaflop Titan (operational later this year). It’s believed Titan’s capacity will be upwards of 35 petaflops.

But even at 35 petaflops, an exaflop (1,000 petaflops) seems distant. Is 2018 a realistic expectation? Sure, it’s plausible. It took 21 years to go from megaflops in 1964 (CDC 6600) to gigaflops in 1985 (Cray 2). But only 11 years to break the teraflop barrier in 1996 (ASCI Red). And just 12 years to enter petaflop territory in 2008 (Roadrunner).

Clocking an exaflop by 2018 would be a decade’s development—a record pace, but not too far outside the realm of reason. The below chart maps supercomputers as long as they’ve been officially ranked by Top500. Today’s pace puts processing power within range of an exaflop by 2018.
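The 2018 estimate can be checked with a simple extrapolation. The sketch below assumes that the growth seen between the 1.04-petaflop leader of four years ago and today's 16.3-petaflop leader continues at a constant yearly rate. That is a strong assumption, not a forecast, and the function names are illustrative.

```python
import math


def annual_growth(start_flops, end_flops, years):
    """Constant yearly growth factor that turns start into end."""
    return (end_flops / start_flops) ** (1 / years)


def years_to_reach(target_flops, current_flops, growth):
    """Years needed to reach target at a constant yearly growth factor."""
    return math.log(target_flops / current_flops) / math.log(growth)


# Top speeds quoted in the article: 1.04 petaflops in 2008, 16.3 in 2012.
growth = annual_growth(1.04e15, 16.3e15, 4)

# ASSUMPTION: the same growth factor holds every year after 2012.
years = years_to_reach(1e18, 16.3e15, growth)
exaflop_year = 2012 + years

if __name__ == "__main__":
    print(f"yearly growth factor: {growth:.2f}")
    print(f"years from 2012 to an exaflop: {years:.1f}")
    print(f"projected year: {exaflop_year:.1f}")
```

Top speed has been roughly doubling every year, and at that rate the remaining factor of about 60 takes about six years.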

But whether supercomputer speed can keep increasing at the current pace is debatable.

“The laws of physics are hunting us down,” says Mike McCoy of Lawrence Livermore National Laboratory. “One of the things that make processors work faster is increasing the frequency of the processors. We found that we can’t increase the frequency like we used to simply because the amount of heat generated would melt the computer.”

So, if you can’t make the parts faster, use more of them, right? You bet. The fastest computer in the world, IBM’s Sequoia, packs 1.6 million processors.

The problem is that, without corresponding efficiency gains, energy consumption increases in lockstep with size. Sequoia draws an average of six to seven megawatts; each of its 96 racks radiates enough heat to power 50 single-family homes; and the system requires 3,000 gallons of water a minute to carry all that heat away.
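A quick calculation shows why this matters. The sketch below takes the "lockstep" claim literally and scales Sequoia's quoted six to seven megawatts up to an exaflop with no efficiency gains at all. Pairing the 16.3-petaflop figure with that average power draw is an assumption made for illustration, and the function name is hypothetical.

```python
def scaled_power_mw(current_flops, current_power_mw, target_flops):
    """Power needed at target speed if power grows linearly with speed.

    ASSUMPTION: no efficiency gains, so flops per watt stays constant.
    """
    return current_power_mw * target_flops / current_flops


SEQUOIA_FLOPS = 16.3e15  # 16.3 petaflops, as quoted in the article
EXAFLOP = 1e18

low_mw = scaled_power_mw(SEQUOIA_FLOPS, 6.0, EXAFLOP)
high_mw = scaled_power_mw(SEQUOIA_FLOPS, 7.0, EXAFLOP)

if __name__ == "__main__":
    print(f"exaflop at Sequoia's efficiency: {low_mw:.0f} to {high_mw:.0f} MW")
```

Under those assumptions an exaflop machine would need several hundred megawatts, which is why efficiency gains, not just more hardware, are the real target.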

Apart from massive energy requirements, engineers can’t just keep adding processors indefinitely. The more cores they add, the more difficult it is to synchronize them all. At some point, scaling up further will yield diminishing returns.
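One common way to illustrate diminishing returns is Amdahl's law, which is brought in here as a textbook model rather than anything the engineers quoted in this article used. It says that if some fraction of a job cannot be split across processors, that fraction caps the total speedup. The 0.1 percent serial fraction below is a made-up value chosen only for illustration.

```python
def amdahl_speedup(processors, serial_fraction):
    """Speedup over one processor under Amdahl's law."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# HYPOTHETICAL: 0.1% of the work cannot be parallelized.
SERIAL_FRACTION = 0.001

# Sequoia's processor count, as quoted in the article, and double that.
at_sequoia_scale = amdahl_speedup(1_600_000, SERIAL_FRACTION)
at_double_scale = amdahl_speedup(3_200_000, SERIAL_FRACTION)

if __name__ == "__main__":
    print(f"speedup with 1.6 million processors: {at_sequoia_scale:.1f}x")
    print(f"speedup with 3.2 million processors: {at_double_scale:.1f}x")
```

In this toy model, doubling the processor count barely moves the speedup, because the small serial part already dominates the runtime.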

That’s why the US Defense Advanced Research Projects Agency (DARPA) is funding a project titled the Power Efficiency Revolution for Embedded Computer Technologies (PERFECT). PERFECT will explore alternative technologies to increase processor efficiency. Two such technologies are already in development.

The first approach increases overall efficiency by parceling out special duties (e.g., graphics) to specialized processors (GPUs or graphics processing units). It’s called “massive heterogeneous processing concurrency.” And in fact, Titan will make use of this approach. The second idea addresses power. Near threshold voltage (NTV) tech significantly lowers operating voltage to make a more energy efficient chip.

Both approaches are yet young and have their obstacles. It’s difficult to evenly spread work across large numbers of specialized chips. And chips operating at lower voltages flirt with the transistors’ on/off point, making it paramount to precisely control current leakage—a difficult thing to do.

Nevertheless, Oak Ridge National Laboratory, home to the soon-operational 20-petaflop Titan, is optimistic. Oak Ridge engineers foresee “two systems beyond Titan to achieve exascale performance by about 2018.” The first will be a 200-petaflop prototype, using exascale technologies. The second will be the real deal: an exaflop behemoth.

And keep in mind, even as the top end is pushed higher, speeds that once set records become commonplace. Roadrunner was the world’s only petaflop machine in 2008. As of June 2012, there are 20 computers operating at a petaflop or more. You can still do a lot with all that power. IBM’s Jeopardy! champ, Watson, operates at a mere 80 teraflops, yet with some ingenious software it defeated humans at their own game.

Despite the challenges, exascale computing seems attainable at some point in the next ten or fifteen years. How much further we go will depend on fundamentally new innovations. But when hasn’t that been the case? Human ingenuity is forever making the impossible possible.