AMD announced today that it has started revenue shipments of processors based on its new Bulldozer architecture. The first shipments of the 16-thread, 8-module processors, codenamed Interlagos, are headed for supercomputers, and the chips will be available in 2- and 4-socket servers by the end of the month. The chips will be branded as the Opteron 6200 series.

Further details were in short supply; the company did not confirm clock speeds, prices, or performance of the Opteron 6200 parts. Specifications leaked by Gateway point to a 3GHz top speed (with turbo boosting to 3.5GHz) for 8-thread parts, dropping to 2.3GHz (2.8GHz with turbo) for 16-thread parts.

AMD also neglected to provide release dates for the Opteron 4200 parts, codenamed Valencia, which will come in 6- and 8-thread versions. These are currently expected to arrive on September 26. Even less is known about when the first desktop-oriented Bulldozer processors will arrive, amid speculation that the FX-series processors, codenamed Zambezi, have slipped into the fourth quarter.

Bulldozer is the company's first substantially new microarchitecture since the 2003 introduction of the K8 core (which spawned the K10 core released in 2007). Bulldozer uses a modular approach to processor design, and comparing the modules to traditional cores is tricky. Each module can run two threads simultaneously, and each thread gets its own integer pipeline and level 1 cache. However, the floating point pipeline is shared between the two threads, as are the front-end instruction decoder and the level 2 cache. In some respects each module is two cores; in others, it is only one.
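To make the "two cores in some places, one in others" point concrete, here is a rough counting sketch in Python. It only tallies the per-module resources described above; the class and function names are invented for illustration, and nothing here reflects pipeline widths, cache sizes, or real performance.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Module:
    """Per-module resource counts as described in the article (illustrative only)."""
    threads: int = 2
    integer_pipelines: int = 2   # one per thread
    l1_caches: int = 2           # one per thread
    fp_pipelines: int = 1        # shared by both threads
    decoders: int = 1            # shared front end
    l2_caches: int = 1           # shared


def chip_totals(modules: int, module: Module = Module()) -> dict:
    """Multiply every per-module count by the number of modules on the chip."""
    return {name: count * modules for name, count in asdict(module).items()}


def per_thread(module: Module = Module()) -> dict:
    """How much of each execution resource a thread gets if both threads are busy."""
    return {
        "integer_pipelines": module.integer_pipelines / module.threads,
        "fp_pipelines": module.fp_pipelines / module.threads,
    }


if __name__ == "__main__":
    # An 8-module part such as Interlagos: 16 threads, 16 integer pipelines,
    # but only 8 floating point pipelines.
    print(chip_totals(8))
    print(per_thread())
```

Under this simple tally, a thread always has a full integer pipeline to itself but, when its sibling thread is also active, only half of a floating point pipeline.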

For integer-heavy workloads, the module should be essentially equivalent to two full-blown cores. Compared to Intel's Hyper-Threading, which runs two threads on a single core, Bulldozer will offer far more dedicated execution resources to the two threads. However, the story may be different for floating point workloads, where the threads will have to compete for execution resources. While the floating point unit is more capable than that found in K10, with more execution resources and support for new instructions, it may not prove as effective as two discrete floating point units.

This design keeps each module relatively small, so AMD can pack more modules onto a chip. It's also designed to be power efficient; sharing resources instead of duplicating them allows AMD to do more with fewer transistors.

AMD has been reeling since Intel introduced the Core 2 architecture in 2006. Core 2 eclipsed the performance of the K8, and though K10 improved on K8, it still trailed behind Intel's offerings. Intel's Nehalem and Sandy Bridge architectures have only widened the gap. Bulldozer represents a substantial gamble for AMD. The design is not even attempting to match Intel's single-threaded performance. This is a design tailored to highly multithreaded, integer-heavy (that is, server) workloads: a market more lucrative than the desktop market, but one where Intel currently outsells AMD by about 19 to 1. In 2006, the ratio was about 3 to 1 in Intel's favor. If AMD's gamble pays off, it will be able to turn the tide and expand its share. If it doesn't, it's difficult to see the company becoming a major player in the server market ever again.
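For a sense of scale, those sales ratios can be converted into approximate unit shares. The Python sketch below assumes, purely for illustration, that Intel and AMD are the only two vendors in the market; the function name is made up for this example.

```python
from fractions import Fraction


def smaller_vendor_share(larger: int, smaller: int) -> Fraction:
    """Share of combined unit sales held by the smaller vendor,
    assuming a two-vendor market (an assumption, not a figure from AMD or Intel)."""
    return Fraction(smaller, larger + smaller)


share_now = smaller_vendor_share(19, 1)    # 19 to 1 in Intel's favor
share_2006 = smaller_vendor_share(3, 1)    # 3 to 1 in Intel's favor

print(f"Today: {float(share_now):.0%}")    # Today: 5%
print(f"2006:  {float(share_2006):.0%}")   # 2006:  25%
```

On that assumption, AMD's slice of server processor sales has fallen from roughly a quarter to roughly a twentieth.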