The latest Top 500 supercomputers list is out, and it represents yet another step in a trend that is changing the complexion of this list from year to year. No, I'm not talking about the way that the x86 architecture has taken over the list like kudzu in the past decade, squeezing out classic, special-purpose supercomputer architectures with a combination of high performance and low costs. Intel and AMD have owned this list for some time, and at this point they can claim a presence in close to 90 percent of the systems on it.

The really new trend—the one that just recently began and that will only accelerate—is the increasing presence of game-oriented hardware on the list. At the top of the list is IBM's Roadrunner, which first rode the power of the Cell chip to the top spot in June. Roadrunner epitomizes both trends in that it combines a modified version of the processor used in the PlayStation 3 with AMD's Opteron. But the Cell isn't the only coprocessor on the list that has its roots in gaming.

An NVIDIA GPU has finally made its way onto the Top 500 list, in a 170 TFLOP machine based at the Tokyo Institute of Technology. The TSUBAME was upgraded recently with NVIDIA's Tesla S1070, a math coprocessor that's essentially a specialized version of the same GPU that the company sells to gamers. Like its gaming sibling, the Tesla is programmable with CUDA, and when paired with a general-purpose processor, it makes for a great data-parallel, floating-point machine.
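To give a flavor of what CUDA programming actually looks like, here's a minimal kernel sketch (a hypothetical illustration, not anything drawn from TSUBAME's actual workloads): each GPU thread handles a single element of a vector operation, which is exactly the kind of data-parallel floating-point work these boards are built for.

```cuda
#include <cuda_runtime.h>

// Hypothetical SAXPY kernel: computes y[i] = a * x[i] + y[i].
// Each GPU thread processes one array element, so thousands of
// elements are handled in parallel across the GPU's cores.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // guard against the final partial block
        y[i] = a * x[i] + y[i];
}

// Host-side launch sketch: 256 threads per block, with enough
// blocks to cover all n elements. d_x and d_y are assumed to be
// device pointers already populated via cudaMemcpy.
void run_saxpy(int n, float a, float *d_x, float *d_y)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
}
```

The same code runs unchanged on a consumer GeForce card or a Tesla board, which is precisely why the gaming-to-HPC repurposing described here works.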

The high-performance computing (HPC) market is one that NVIDIA, IBM, and AMD/ATI all see as a growth market for repurposed gaming hardware. Supercomputers typically run workloads with high degrees of data parallelism, and they have an insatiable appetite for bandwidth and floating-point power. These characteristics are also typical of games, and for good reason—both gaming and supercomputing workloads are essentially simulations.

As I described in this article from March of 2007 (unfortunately it has "PhysX" in the title, so it often gets skipped over), semiconductor manufacturers who want to target the high-margin HPC market must first build up volume with a version of their chip in the larger consumer market. When a chip ships in enough volume to keep production costs down, the vendor can then jack up the margins on a special version of the product and sell it to HPC customers in academia, government, and industry.

Coprocessors that are designed specifically with HPC in mind are ultimately doomed, since even a fabless vendor can't achieve high enough volumes in the HPC market alone to justify the astronomically high upfront costs of developing a modern chip and getting to the point of having an actual mask set made. (I expect to get a flood of e-mail from stealth-mode HPC-only math coprocessor plays offering to explain to me why I'm wrong. By all means, fire away, because even if you don't change my mind, I'm likely to learn something.)