GK110 Gets A Little Bit Leaner

I get a kick out of looking back at what I wrote about certain pieces of high-end hardware. When Nvidia launched the GeForce GTX 680, AMD was still asking something like $550 for the Radeon HD 7970, and the GK104-based board kicked it right in the tail. It was faster, cooler, quieter, and smaller than AMD’s flagship. I recommended the 680 without hesitation. And until Nvidia launched its almost-as-fast GeForce GTX 670 for even less, the 680 was a great choice.

Today, 7970s are down to $400 or so. Meanwhile, the GTX 680s are selling for roughly $460. What a reversal, right? After some serious driver work, AMD’s Radeon HD 7970 is notably faster than the GeForce board and it costs less. You have to put up with more noise and higher power use, but the Tahiti-based card also gives you great compute horsepower to match its gaming alacrity. So long as you stay away from multi-GPU configurations, the Radeon HD 7970 is a smart buy.

The next step up is going to cost you a cool grand. Be it the GeForce GTX Titan, the GeForce GTX 690, or AMD’s Radeon HD 7990, single-card performance doesn’t get any better than a Radeon HD 7970 unless you spend twice as much. And that’s where AMD and Nvidia lose a lot of gamers otherwise down to drop big bucks on graphics. Stepping up from $500 to $1,000 is rough.

GK110 Finds A New Home In GeForce GTX 780

GeForce GTX 780 is Nvidia’s attempt to do a little something about that gaping maw of a price delta between GTX 680 and the crazy-expensive stuff. Given its name, you might think the 780 centers on a new piece of silicon. But it’s really a derivative of GeForce GTX Titan and the gargantuan GK110 GPU.

Of course, the GK110 that Nvidia uses on GeForce GTX 780 is necessarily trimmed to keep it from showing up the potent Titan. We already know that a complete GK110 GPU plays host to 15 Streaming Multiprocessors (SMXes), each with 192 CUDA cores and 16 texture units. GeForce GTX Titan pares the chip back to 14 SMXes, totaling 2,688 CUDA cores and 224 texture units. GeForce GTX 780 sees GK110 further cut down to 12 SMXes. The result is 2,304 CUDA cores and 192 texture units.
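The shader and texture unit counts fall straight out of the number of active SMXes. Here is a minimal sketch of that arithmetic, using only the per-SMX figures quoted above; the function name `gk110_config` is purely illustrative and not anything Nvidia publishes.

```python
# Per-SMX resources on GK110, as quoted in the text above.
CORES_PER_SMX = 192
TEX_UNITS_PER_SMX = 16


def gk110_config(active_smx):
    """Return CUDA core and texture unit totals for a given SMX count."""
    return {
        "cuda_cores": active_smx * CORES_PER_SMX,
        "texture_units": active_smx * TEX_UNITS_PER_SMX,
    }


if __name__ == "__main__":
    for name, smx in [
        ("Complete GK110", 15),
        ("GeForce GTX Titan", 14),
        ("GeForce GTX 780", 12),
    ]:
        cfg = gk110_config(smx)
        print(f"{name}: {smx} SMXes, "
              f"{cfg['cuda_cores']} CUDA cores, "
              f"{cfg['texture_units']} texture units")
```

Going from 14 SMXes to 12 removes 384 CUDA cores and 32 texture units, or about 14% of Titan's shader resources.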

GK110 as it appears in GeForce GTX 780

Depending on the card you get, GeForce GTX 780’s 12 SMX blocks are spread across either four or five Graphics Processing Clusters. Composed of 7.1 billion transistors, GK110 is a massive chip. Manufacturing it isn’t easy, and different parts of the die show up with defects along the way. So, Nvidia can’t guarantee the exact configuration of each GeForce GTX 780’s GK110. It’ll only say that, across the GPU, 12 SMXes are enabled.

Nvidia’s incisions are effective enough in dictating performance that tweaks to the board’s clock rates are very minor. GeForce GTX 780 bears an 863 MHz base frequency, slightly higher than Titan’s 836 MHz. Its rated GPU Boost clock rate is 900 MHz, whereas Titan is officially spec’ed at 876 MHz.

GK110 retains its complete render back-end, including six ROP partitions, each able to output eight 32-bit pixels per clock, adding up to what the company calls 48 ROP units. Further, a sextet of 64-bit memory interfaces yields the same 384-bit aggregate pathway. But whereas Nvidia armed GeForce GTX Titan with 6 GB of GDDR5 memory, GTX 780 sports 3 GB operating at 1,502 MHz. Do the math and you get the same 288.4 GB/s of peak bandwidth.
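For anyone who wants to actually do that math, here is a rough sketch. It assumes GDDR5 moves four bits per pin for every cycle of the quoted memory clock, a detail the text above doesn't spell out, so the 1,502 MHz figure works out to roughly 6 GT/s. The helper name `peak_bandwidth_gbs` is made up for illustration.

```python
def peak_bandwidth_gbs(memory_clock_mhz, bus_width_bits, transfers_per_clock=4):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=4 is an assumption about how GDDR5 relates its
    data rate to the quoted memory clock; it is not stated in the article.
    """
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Back-end totals from the text: six partitions and six memory interfaces.
ROP_UNITS = 6 * 8          # six partitions x eight pixels per clock
BUS_WIDTH_BITS = 6 * 64    # six 64-bit memory interfaces

if __name__ == "__main__":
    print(f"ROP units: {ROP_UNITS}")
    print(f"Aggregate bus: {BUS_WIDTH_BITS}-bit")
    print(f"Peak bandwidth: {peak_bandwidth_gbs(1502, BUS_WIDTH_BITS):.1f} GB/s")
```

The same function with a 256-bit bus gives about 192.3 GB/s, which matches the figure listed for GeForce GTX 680.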

Where GeForce GTX 780 veers away from Titan is in compute potential. You’ll remember from Nvidia GeForce GTX Titan 6 GB: GK110 On A Gaming Card that Nvidia’s single-GPU flagship includes a special driver setting that scales back clock rate in favor of running the chip’s double-precision floating-point units at full speed. This makes the GeForce GTX Titan a viable option for developers seeking more compute performance than Nvidia’s other GPUs can muster (making it competitive with AMD’s Tahiti, in fact). This time around, you still get 64 FP64 CUDA cores per SMX. But because that driver setting isn’t exposed, double-precision performance drops back to 1/24 of the FP32 rate. Expect double-precision floating-point performance to trail Radeon HD 7970 as a result.
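To put rough numbers on that gap, the sketch below estimates theoretical peak throughput from core count and base clock. It leans on two assumptions that aren't in the text above: each core retires one fused multiply-add (two floating-point operations) per clock, and Tahiti runs FP64 at 1/4 of its FP32 rate. Treat the output as a ballpark, not a benchmark; `peak_gflops` is a hypothetical helper.

```python
from fractions import Fraction

# FP64 units present in hardware, per SMX, relative to FP32 cores (from the text).
HARDWARE_FP64_RATIO = Fraction(64, 192)   # reduces to 1/3
# Effective FP64 rate on GeForce GTX 780 without the driver setting (from the text).
GTX_780_FP64_RATIO = Fraction(1, 24)
# Assumption, not from the article: Tahiti's FP64 rate relative to FP32.
TAHITI_FP64_RATIO = Fraction(1, 4)


def peak_gflops(cores, clock_mhz, rate=Fraction(1)):
    """Theoretical peak GFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return float(cores * 2 * clock_mhz * rate / 1000)


if __name__ == "__main__":
    print(f"Hardware FP64:FP32 unit ratio: {HARDWARE_FP64_RATIO}")
    print(f"GTX 780 FP32: {peak_gflops(2304, 863):.0f} GFLOPS")
    print(f"GTX 780 FP64: {peak_gflops(2304, 863, GTX_780_FP64_RATIO):.0f} GFLOPS")
    print(f"HD 7970 GHz Ed. FP64: {peak_gflops(2048, 1000, TAHITI_FP64_RATIO):.0f} GFLOPS")
```

Under those assumptions, the 780 lands somewhere around 166 GFLOPS of double-precision throughput at its base clock, far short of what the 1/3 hardware ratio would allow if the units ran at full speed.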

| | GeForce GTX Titan | GeForce GTX 690 | GeForce GTX 780 | GeForce GTX 680 | Radeon HD 7970 GHz Ed. |
|---|---|---|---|---|---|
| Shaders | 2,688 | 2 x 1,536 | 2,304 | 1,536 | 2,048 |
| Texture Units | 224 | 2 x 128 | 192 | 128 | 128 |
| Full Color ROPs | 48 | 2 x 32 | 48 | 32 | 32 |
| Graphics Clock | 836 MHz | 915 MHz | 863 MHz | 1,006 MHz | 1,000 MHz |
| Texture Fillrate | 187.5 Gtex/s | 2 x 117.1 Gtex/s | 165.7 Gtex/s | 128.8 Gtex/s | 134.4 Gtex/s |
| Memory Clock | 1,502 MHz | 1,502 MHz | 1,502 MHz | 1,502 MHz | 1,500 MHz |
| Memory Bus | 384-bit | 2 x 256-bit | 384-bit | 256-bit | 384-bit |
| Memory Bandwidth | 288.4 GB/s | 2 x 192.3 GB/s | 288.4 GB/s | 192.3 GB/s | 288 GB/s |
| Graphics RAM | 6 GB GDDR5 | 2 x 2 GB GDDR5 | 3 GB GDDR5 | 2 GB GDDR5 | 3 GB GDDR5 |
| Die Size | 551 mm² | 2 x 294 mm² | 551 mm² | 294 mm² | 365 mm² |
| Transistors (Billion) | 7.1 | 2 x 3.54 | 7.1 | 3.54 | 4.31 |
| Process Technology | 28 nm | 28 nm | 28 nm | 28 nm | 28 nm |
| Power Connectors | 1 x 8-pin, 1 x 6-pin | 2 x 8-pin | 1 x 8-pin, 1 x 6-pin | 2 x 6-pin | 1 x 8-pin, 1 x 6-pin |
| Maximum Power | 250 W | 300 W | 250 W | 195 W | 250 W |
| Price (Street) | $1,000 | $1,000 | $650 | $460 | $450 |
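The texture fillrate figures in the table can be approximated as texture units multiplied by graphics clock. The sketch below assumes one texel per texture unit per clock and reproduces the listed values for three of the cards; `texture_fillrate_gtex` is an illustrative name only.

```python
def texture_fillrate_gtex(texture_units, clock_mhz):
    """Peak texture fillrate in Gtex/s, assuming one texel per unit per clock."""
    return texture_units * clock_mhz / 1000


# (texture units, graphics clock in MHz), taken from the table above.
CARDS = {
    "GeForce GTX 780": (192, 863),
    "GeForce GTX 680": (128, 1006),
    "GeForce GTX 690 (per GPU)": (128, 915),
}

if __name__ == "__main__":
    for name, (units, clock) in CARDS.items():
        print(f"{name}: {texture_fillrate_gtex(units, clock):.1f} Gtex/s")
```

The Titan and Radeon rows don't reproduce exactly from the listed graphics clocks (224 x 836 MHz gives about 187.3 Gtex/s, and 128 x 1,000 MHz gives 128 Gtex/s), which suggests those fillrate figures were computed from slightly different clock values.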

Playing The Name Game

Perhaps you’re asking: Why call this card GeForce GTX 780 at all, then? It’s a derivative of Titan, based on the same Kepler architecture already prolific across the GeForce GTX 600 series. Nvidia did much the same thing with its 500 series, which built on the GeForce GTX 400’s Fermi architecture. “But the 500s were based on redesigned GPUs that improved performance, power, and, consequently, efficiency,” you rightly point out.

The most I could get from Nvidia was that it didn’t need to do this for the GeForce GTX 780, since the company only just started releasing desktop-oriented cards based on GK110. And there’s really not much room left in the 600 family to release newer, faster products. We’ll see how Nvidia fleshes out the performance and pricing of its GeForce GTX 700 line-up from here. But you can bet we’re going to expect notable performance improvements each step of the way to justify the naming. We also need continued (and compelling) competition from AMD to keep Nvidia’s pricing in check. Our best hope for that today is Radeon HD 7970.