Climate researchers at the Lawrence Berkeley National Laboratory recently set themselves the task of building a supercomputer that can model the Earth's entire atmosphere at an unprecedented level of resolution. What they came up with initially was a set of specs that, at 10 petaflops and 100 terabytes of memory, wasn't just beyond their monetary budget—it was beyond their power budget as well. So they turned to embedded systems maker Tensilica to help them design a novel supercomputer that could get them to their goal on the cheap, and without breaking the power grid in the process.

In a new study, the researchers propose using embedded hardware that can be tailored to their specific application, i.e., weather simulation. Working closely with Tensilica, an embedded systems design firm, they calculated the cost and power requirements for such an application-specific supercomputer, and found that it could be built for an order of magnitude less money and would draw an order of magnitude less power than a system based on general-purpose processors from AMD, Intel, or IBM.

The researchers estimate that a cluster based on AMD Opterons would require 1.78 million cores drawing a total of 179 megawatts and could be built for around $1.8 billion (not including interconnects); an IBM BlueGene/L type system would require 3.56 million cores drawing only 27 megawatts at the low, low price of $2.6 billion. In contrast, a Tensilica system with simple cores that are optimized for the task at hand would require 3.84 million cores drawing a mere 2.5 megawatts of power and could be had for a relative pittance ($75 million). Incidentally, 179 megawatts is enough to power a small city.

Name        CPU      Frequency (GHz)   Cores per socket   Sockets (millions)   Power (MW)   Cost ($ millions)
AMD         Opteron  2.8               2                  0.89                 179          1,800
BlueGene/L  PPC440   0.7               2                  1.78                 27           2,600
Tensilica   Custom   0.65              32                 0.12                 2.5          75
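As a rough back-of-the-envelope check, the table's figures can be combined to derive per-core power and cost for each system. The short Python sketch below uses only the numbers reported above; the dictionary layout and metric names are my own, not from the study:

```python
# Back-of-the-envelope comparison using the figures from the table above.
# Sockets are in millions, power in megawatts, cost in millions of dollars.
systems = {
    "AMD Opteron": {"sockets": 0.89, "cores_per_socket": 2,  "power_mw": 179, "cost_m": 1800},
    "BlueGene/L":  {"sockets": 1.78, "cores_per_socket": 2,  "power_mw": 27,  "cost_m": 2600},
    "Tensilica":   {"sockets": 0.12, "cores_per_socket": 32, "power_mw": 2.5, "cost_m": 75},
}

for name, s in systems.items():
    cores_m = s["sockets"] * s["cores_per_socket"]            # total cores, in millions
    watts_per_core = s["power_mw"] * 1e6 / (cores_m * 1e6)    # watts drawn per core
    dollars_per_core = s["cost_m"] * 1e6 / (cores_m * 1e6)    # dollars spent per core
    print(f"{name}: {cores_m:.2f}M cores, "
          f"{watts_per_core:.2f} W/core, ${dollars_per_core:.0f}/core")
```

Run this way, the arithmetic reproduces the core counts quoted in the article (1.78 million, 3.56 million, and 3.84 million, respectively) and makes the efficiency gap concrete: the Tensilica design comes in at well under a watt per core, versus roughly a hundred watts per core for the Opteron cluster.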

Can a system full of low-power embedded processors eat the big boys' lunch on this one? It certainly looks like it, based on the above calculations. Of course, the simplified comparison above misses a few points, some of which the authors point out and some of which they don't. In particular, the Tensilica system would require a fair bit of software customization to achieve its goals, and the authors acknowledge that any system on the scale they're describing would be a huge challenge to program.

And of course the big boys aren't standing still, so while such a cost comparison might make sense today, what will it look like in the near future? The authors attempt to address this by looking at predicted future processor cost and power requirements. They suggest that by 2011, you could build a BlueGene/L or a traditional cluster with cost and power requirements comparable to those of the Tensilica system they could build today. What the authors miss is something we have covered extensively at Ars, namely Larrabee and the general-purpose GPU push. Indeed, Intel and Cray have already announced that they are collaborating, and NVIDIA and AMD are moving full steam ahead with supercomputer-bound, flexibly programmable GPUs of their own.

Further reading