The US’s Defense Advanced Research Projects Agency (DARPA) announced a new initiative this week aimed at finding solutions to the major problems limiting computer scalability. The program, dubbed the Power Efficiency Revolution for Embedded Computing Technologies (PERFECT), will kick off with a workshop in Arlington, Virginia on February 15.

The endeavor has been a long time coming. Back in 2007, DARPA commissioned a study on whether it would be possible to build an exascale computer by 2015, and on the challenges such a system would face. The results were not encouraging. The 297-page report explored the question from every imaginable angle and included extensive discussion of non-standard approaches to the problem. The research group concluded that “while an Exaflop per second system is possible (at around 67MW [megawatts]), one that is under 20MW is not. Projections from today’s supercomputers… are off by up to three orders of magnitude.”
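Those megawatt figures translate directly into an efficiency target. A back-of-the-envelope calculation, using only the report's own numbers, shows the gap:

```python
# Back-of-the-envelope look at the efficiency gap the DARPA report describes.
# The 67 MW and 20 MW figures come from the report itself.

EXAFLOP = 1e18  # floating-point operations per second

def gflops_per_watt(flops, watts):
    """Delivered performance per watt, in GFLOPS/W."""
    return flops / watts / 1e9

feasible = gflops_per_watt(EXAFLOP, 67e6)   # the report's "possible" system
target   = gflops_per_watt(EXAFLOP, 20e6)   # the 20 MW goal

print(f"Exaflop at 67 MW: {feasible:.1f} GFLOPS/W")  # ~14.9 GFLOPS/W
print(f"Exaflop at 20 MW: {target:.1f} GFLOPS/W")    # 50.0 GFLOPS/W
```

In other words, the 20MW goal demands more than three times the efficiency of the system the report judged merely "possible" — and far more than any machine of the era delivered.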



Exascale scaling issues — The gap between DARPA’s best-case projections for a number of approaches to the problem and the performance target is significant

The problems facing exascale computing can be grouped into two (very) general categories — power consumption and efficiency scaling. For decades, Dennard scaling allowed manufacturers like Intel to significantly increase clock speed generation after generation while holding power consumption relatively steady. That came screeching to a halt around 2005. “That expected increase in processing performance is at an end,” said DARPA Director Regina E. Dugan. “Clock speeds are being limited by power constraints. Power efficiency has become the Achilles heel of increased computational capability.”
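The breakdown Dugan describes can be sketched with the classic dynamic-power relation P ≈ C·V²·f. Under Dennard's idealized rules, every term scales so that power density stays flat; once supply voltage stopped shrinking, it didn't. A minimal numeric sketch (idealized, ignoring leakage entirely):

```python
# Idealized Dennard scaling: shrink feature size by a factor k, and
# capacitance and voltage both drop by k while frequency rises by k.
# Dynamic power per transistor is P = C * V**2 * f, and the area per
# transistor falls by k**2, so power *density* stays constant.
# Once voltage stops scaling (roughly post-2005), density rises as k**2.

def power_density(C, V, f, area):
    return C * V**2 * f / area

k = 1.4  # one process generation (~0.7x linear shrink)

base     = power_density(C=1.0, V=1.0, f=1.0, area=1.0)
dennard  = power_density(C=1/k, V=1/k, f=k, area=1/k**2)  # voltage scales
post2005 = power_density(C=1/k, V=1.0, f=k, area=1/k**2)  # voltage stuck

print(dennard / base)    # 1.0: constant power density
print(post2005 / base)   # k**2: density climbs every generation
```

With voltage pinned, each shrink nearly doubles power density — which is why clock speeds hit a wall.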

Adding more cores might have solved the problem in the short term for the consumer market, but it doesn’t work when trying to build out exascale systems. The more cores in a supercomputer, the greater the chance that cross-node communication latencies and systemic inefficiencies will sabotage the performance of the entire system. Meanwhile, the additional cores and their associated DRAM banks, routers, HDDs, and motherboards all consume more power.
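The communication penalty can be illustrated with a toy strong-scaling model — entirely illustrative, with made-up constants, not figures from DARPA's report: a fixed problem is split across n nodes, and each timestep pays a synchronization cost that grows with machine size.

```python
import math

# Toy strong-scaling model: a fixed amount of work divided across n nodes,
# plus a per-step synchronization cost that grows with node count
# (tree-style reduction, hence log2). Constants are illustrative only.

def parallel_efficiency(n, sync_cost=1e-6):
    compute_time = 1.0 / n                   # work divides evenly
    comm_time = sync_cost * math.log2(n)     # grows with machine size
    return compute_time / (compute_time + comm_time)

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9} nodes: {parallel_efficiency(n):6.1%} efficiency")
```

Even with a generously small synchronization cost, efficiency collapses as the node count climbs into the hundreds of thousands — the per-node compute share shrinks faster than the coordination overhead does.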

PERFECT’s goal is to revolutionize processor efficiency by exploring concepts like near-threshold voltage (NTV) operation and “massive heterogeneous processing concurrency.” The second concept refers to the fact that a number of supercomputers now incorporate specialized additional processors. To date, this has mostly meant Nvidia GPUs, but the debut of Intel’s Knights Corner later in 2012 means we’ll soon see a greater range of co-processors. While the potential speed increase from using specialized hardware is enormous, ensuring that workloads are distributed efficiently across thousands of processors is a significant challenge.
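One common way to keep mixed hardware busy is to hand each device a slice of the work proportional to its throughput. The sketch below is a minimal static-partitioning illustration — the device names and relative throughputs are hypothetical, not drawn from any real system:

```python
# Static load balancing across heterogeneous processors: give each device a
# share of the work proportional to its relative throughput.
# Device names and throughput numbers below are hypothetical.

def partition(total_items, devices):
    """devices: dict of name -> relative throughput. Returns items per device."""
    total_rate = sum(devices.values())
    shares = {name: int(total_items * rate / total_rate)
              for name, rate in devices.items()}
    # Hand any rounding remainder to the fastest device.
    fastest = max(devices, key=devices.get)
    shares[fastest] += total_items - sum(shares.values())
    return shares

work = partition(1_000_000, {"cpu": 1.0, "gpu": 8.0, "coprocessor": 4.0})
print(work)
```

Real schedulers are far more dynamic — work-stealing, over-decomposition, and runtime profiling all exist precisely because static splits like this one go stale the moment workloads become irregular.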

Near threshold voltage research got a major visibility boost at IDF this past fall when Justin Rattner, the director of Intel Labs, demonstrated a 32-bit CPU capable of running on solar power.

The goal of NTV is to operate a chip at between 400 and 500mV, much lower than current operating voltages. This is difficult because the supply approaches the threshold voltage at which silicon transistors switch on at all; at voltages this low, controlling leakage current is a major concern.
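The payoff comes from the quadratic voltage term in dynamic power, P ≈ C·V²·f. A rough comparison, assuming a nominal supply around 1.0V (a ballpark figure for illustration, not a spec from Intel):

```python
# Dynamic switching power scales with the square of supply voltage:
# P = C * V**2 * f. Dropping from an assumed nominal ~1.0 V to a
# near-threshold ~0.45 V cuts energy per switch by roughly 5x, before
# accounting for the lower clock speeds NTV operation typically requires.

def dynamic_power(V, C=1.0, f=1.0):
    return C * V**2 * f

nominal = dynamic_power(V=1.0)
ntv = dynamic_power(V=0.45)
print(f"Energy-per-switch reduction: {nominal / ntv:.1f}x")  # 4.9x
```

The catch, as noted above, is that the leakage current which was negligible at nominal voltage becomes a dominant share of total power near threshold.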

The chip Intel demoed at IDF, codenamed Claremont, is based on the company’s original Pentium and consumes as little as 10mW. Intel hasn’t revealed much in the way of specifics on the chip’s construction, but the company’s blog post states that NTV chips are extremely sensitive to power supply and voltage fluctuations. Building Claremont required the company to redesign the on-chip caches and logic and to incorporate new circuit designs.

NTV could potentially solve the scaling problems that plague the semiconductor industry today, but the fact that it took Intel several years to build an NTV variant of a well-known, very simple CPU is evidence that we won’t see the technology galloping over the hill any time soon. It may be telling that Intel chose to debut an NTV variant of a simple, low-power processor that has more in common with Atom than with any modern Xeon. It’s entirely possible that the technologies Intel uses to reduce current leakage and minimize voltage variance are at odds with high-performance microprocessors.

DARPA’s new program is a good start, but the challenges facing exascale computing aren’t unique to any particular architecture or system design; they’re fundamental problems related to manufacturing scaling. As such, they aren’t going to be resolved any time soon. That we will hit exascale computing is a given — but whether we’ll be able to move much beyond it is a legitimate question.