In general, when ExtremeTech or another technology website discusses supercomputers, it's always in terms of speed: teraflops, petaflops, the rush to exascale, and so on. There's a reason for this, of course: After decades of indoctrination by Intel, processing speed is something that nearly all of us can relate to.

What's rarely discussed, however, is the purpose of supercomputers. Sure, the world's fastest supercomputer recently hit a peak of 10 petaflops — 10 quadrillion calculations per second, or around 200,000 times faster than your Core i7 Sandy Bridge computer — but what does it do with all of that power?

To answer that question, we first need to look at the architecture of supercomputers.

Architecture

Supercomputers are, for the most part, nothing like your PC. In some cases supercomputers are built with x86 CPUs from AMD and Intel, or GPUs from Nvidia or AMD, but that’s where the similarity ends.

Supercomputers normally make use of customized compute units (called blades), which usually house multiple nodes (CPUs, GPUs). In the case of the Cray XK6, the most powerful blade in the world, each blade contains four nodes, and each node houses a 16-core AMD Opteron CPU, an Nvidia Tesla GPU, and 16 or 32GB of RAM. These nodes are connected together with a proprietary interconnect (usually optical). Multiple blades are then stacked together in racks (again, optically networked), allowing for tens of thousands of nodes to be crammed into a large room.
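To get a feel for how quickly the node count compounds, here's a back-of-the-envelope sketch. The per-blade figures come from the XK6 description above; the blades-per-rack and rack counts are purely hypothetical examples, not real system specs:

```python
# Rough scaling math for a Cray XK6-style machine.
# NODES_PER_BLADE and CPU_CORES_PER_NODE are from the article;
# BLADES_PER_RACK and RACKS are made-up illustrative numbers.
NODES_PER_BLADE = 4
CPU_CORES_PER_NODE = 16      # one 16-core AMD Opteron per node
BLADES_PER_RACK = 24         # hypothetical
RACKS = 100                  # hypothetical

nodes = NODES_PER_BLADE * BLADES_PER_RACK * RACKS
cpu_cores = nodes * CPU_CORES_PER_NODE

print(f"{nodes:,} nodes, {cpu_cores:,} CPU cores")
# 9,600 nodes and 153,600 CPU cores from just 100 racks
```

Even at these modest (invented) rack counts, you're already well past a hundred thousand cores — before counting the GPUs at all.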

A little-known fact: Supercomputers are water-cooled, which not only saves money (hot CPUs leak more power) but also allows them to run faster. In the case of K, the world's fastest supercomputer, there are some 88,128 compute nodes — 88,128 8-core SPARC64 VIIIfx CPUs — and each one is connected to what is probably the most complex water cooling system in the world.

Finally, you need some seriously awesome software to control the beast, which nowadays is nearly always a customized version of Linux. Each supercomputer manufacturer (IBM, Cray, Fujitsu) usually starts with a Linux distro of choice, and then makes significant changes to tailor the OS to the specific hardware. It is the operating system's job to minimize the time each node spends waiting for new data to arrive, which involves a very intricate dance of task scheduling and memory management. Don't forget, a supercomputer generally has thousands of gigabytes of RAM, and sometimes hard drive storage in the petabyte range.
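The core scheduling idea — keep every node busy by handing out work dynamically, rather than splitting it once up front — can be sketched in miniature with Python's standard `multiprocessing` module. This is a toy illustration of the principle, not how a real supercomputer scheduler works:

```python
# Toy sketch of dynamic work distribution: imap_unordered streams
# small chunks of work to whichever worker frees up first, so no
# worker sits idle waiting for a slow sibling to finish its share.
from multiprocessing import Pool

def simulate(cell):
    # Stand-in for a real per-node computation
    return cell * cell

if __name__ == "__main__":
    cells = range(1000)
    with Pool(processes=4) as pool:
        # chunksize=50 trades scheduling overhead against load balance
        results = list(pool.imap_unordered(simulate, cells, chunksize=50))
    print(sum(results))
```

Real systems face the same trade-off at vastly larger scale: smaller work units balance load better, but each hand-off costs interconnect traffic — which is exactly why the OS and interconnect are so heavily customized.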

The end result is a supercomputer with tens of thousands of nodes that (hopefully) act in parallel. If you imagine a single, futuristic Intel CPU with ten thousand cores in one package, that's the idealized end goal that supercomputer designers are aiming for.

The cost of ownership, by the way, can run into the hundreds of millions of dollars. Not only do you have the installation cost, but a supercomputer uses megawatts of power, too. Jaguar, the third fastest supercomputer in the world (pictured above), cost $104 million to install and uses 7 megawatts — and every megawatt, at $0.10/kWh, equates to roughly $1 million per year. Add in the dozen or so technicians that you need to keep a supercomputer running, and you're looking at an annual cost of $10 million or more.
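The megawatt figure is easy to sanity-check yourself, using the rate and power draw quoted above:

```python
# Verifying the "roughly $1 million per megawatt-year" claim.
HOURS_PER_YEAR = 24 * 365     # 8,760 hours
RATE_PER_KWH = 0.10           # dollars per kilowatt-hour, as quoted
KW_PER_MW = 1000

cost_per_mw_year = KW_PER_MW * HOURS_PER_YEAR * RATE_PER_KWH
print(f"${cost_per_mw_year:,.0f} per megawatt-year")   # $876,000

jaguar_power_mw = 7
print(f"Jaguar electricity: ${jaguar_power_mw * cost_per_mw_year:,.0f}/year")
```

That works out to about $876,000 per megawatt-year — so "roughly $1 million" is fair — and a bit over $6 million per year in electricity for Jaguar alone, before staff and maintenance.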

Next page: But what the heck can you do with 88,128 parallel processors?