As unfathomable as it seems now, at one point in history the product refresh cycle for GPUs was around 6 months. Twice a year NVIDIA and AMD would come out with major refreshes to their product lines, particularly at the high-end where a quick succession of parts brought great performance gains and left us little time to breathe.

Since then things have changed a great deal. GPU complexity has grown by leaps and bounds – although GPUs had already ceased being simple devices by the time the term “GPU” was even coined, they were still fairly small chips put together at a block level by a relatively small team of engineers. The modern GPU on the other hand is a large, complex entity. Although the development cycle for a GPU is still shorter than the 4+ years for a CPU, GPU complexity has approached the CPU in some ways and exceeded it in others. Meanwhile in terms of die size even midrange GPUs like GF106 (GTS 450) are as big as modern CPUs like Sandy Bridge, never mind high-end GPUs like GF110. As a result the refresh cycle for GPUs has become progressively longer, with refreshes relying primarily on die shrinks, and in modern times we’re looking at close to a year between refreshes.

The reason I bring this up is because NVIDIA has found itself in an interesting position with the Fermi architecture. We’ve covered the problems NVIDIA had in the past, particularly with the first Fermi – GF100. NVIDIA has since corrected GF100’s biggest production flaws in GF110, giving us the Fermi we originally expected nearly half a year earlier. NVIDIA is now in the process of cascading those production improvements down the rest of the Fermi line, churning out the fully-enabled Fermi GPUs that we did not get to see in 2010. Whether it’s intentional or not – and we believe it’s not – NVIDIA has fallen back into the 6 month cycle.

Late last year we saw GF110, the first of the revised Fermi family. GF110 brought with it GTX 580 and GTX 570, a pair of powerful if expensive video cards that put NVIDIA back where they traditionally lie on the performance/power curve. Now it’s time for GF104 to get the same treatment. Its revised counterpart is the aptly named GF114, and it is the heart of NVIDIA’s newest video card: the GeForce GTX 560 Ti.

| | GTX 580 | GTX 570 | GTX 560 Ti | GTX 460 1GB |
|---|---|---|---|---|
| Stream Processors | 512 | 480 | 384 | 336 |
| Texture Address / Filtering | 64/64 | 60/60 | 64/64 | 56/56 |
| ROPs | 48 | 40 | 32 | 32 |
| Core Clock | 772MHz | 732MHz | 822MHz | 675MHz |
| Shader Clock | 1544MHz | 1464MHz | 1644MHz | 1350MHz |
| Memory Clock | 1002MHz (4008MHz data rate) GDDR5 | 950MHz (3800MHz data rate) GDDR5 | 1002MHz (4008MHz data rate) GDDR5 | 900MHz (3600MHz data rate) GDDR5 |
| Memory Bus Width | 384-bit | 320-bit | 256-bit | 256-bit |
| Frame Buffer | 1.5GB | 1.25GB | 1GB | 1GB |
| FP64 | 1/8 FP32 | 1/8 FP32 | 1/12 FP32 | 1/12 FP32 |
| Transistor Count | 3B | 3B | 1.95B | 1.95B |
| Manufacturing Process | TSMC 40nm | TSMC 40nm | TSMC 40nm | TSMC 40nm |
| Price Point | $499 | $349 | $249 | ~$160 |
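For readers who want to turn the specifications above into rough theoretical peaks, here is a minimal Python sketch. It is an illustration rather than anything NVIDIA publishes: the function names are our own, and the FP32 figure assumes each stream processor retires one fused multiply-add (2 FLOPs) per shader clock. Memory bandwidth is simply the data rate multiplied by the bus width in bytes. Neither number says anything about real-world performance.

```python
# Specs copied from the table above:
# (stream processors, shader clock MHz, memory data rate MHz, bus width bits)
SPECS = {
    "GTX 580": (512, 1544, 4008, 384),
    "GTX 570": (480, 1464, 3800, 320),
    "GTX 560 Ti": (384, 1644, 4008, 256),
    "GTX 460 1GB": (336, 1350, 3600, 256),
}


def peak_fp32_gflops(cores, shader_mhz):
    """Theoretical FP32 peak in GFLOPS.

    Assumption: one fused multiply-add (2 FLOPs) per core per shader clock.
    """
    return cores * shader_mhz * 2 / 1000


def memory_bandwidth_gbs(data_rate_mhz, bus_bits):
    """Theoretical memory bandwidth in GB/s (10^9 bytes per second)."""
    return data_rate_mhz * (bus_bits / 8) / 1000


def summarize(specs=SPECS):
    """Map each card name to (peak GFLOPS, bandwidth GB/s)."""
    return {
        name: (peak_fp32_gflops(cores, shader), memory_bandwidth_gbs(rate, bus))
        for name, (cores, shader, rate, bus) in specs.items()
    }


if __name__ == "__main__":
    for name, (gflops, gbs) in summarize().items():
        print(f"{name}: {gflops:.1f} GFLOPS, {gbs:.1f} GB/s")
```

These are paper ceilings only; they are useful for comparing cards within the same architecture, and much less so across architectures.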

GTX 560 Ti, in a nutshell, is a complete video card using the GF104 design; it is to GTX 460 what GTX 580 was to GTX 480. With the GTX 460 we saw NVIDIA disable some functional units and limit the clockspeeds, but for GTX 560 Ti they’re going all out. Every functional unit is enabled, and clockspeeds are much higher, with a core clock of 822MHz being what we believe is much closer to the original design specifications of GF104. Even though GF114 is identical to GF104 in architecture and the number of functional units, as we’re going to see, the resulting video cards are quite different – GTX 560 Ti is quite a bit faster than GTX 460 most of the time.

NVIDIA GF114 - Full Implementation, No Disabled Logic



So how is NVIDIA accomplishing this? Much like what GF110 did for GF100, GF114 is doing for GF104. NVIDIA has resorted to tinkering with the Fermi family at a low level to optimize their designs against TSMC’s mature 40nm process, paying much closer attention to the types of transistors used in order to minimize leakage. As a result of the more mature manufacturing process and NVIDIA’s optimizations, they are now able to enable previously disabled functional units and raise clock speeds while keeping these revised GPUs in the same power envelopes as their first-generation predecessors. This is allowing NVIDIA to improve performance and/or power consumption even though these revised chips are virtually identical to their predecessors.

On GF110, we saw NVIDIA choose to take moderate gains in both performance and power consumption. In the case of GF114/GTX 560 however, NVIDIA is choosing to focus on improving performance while leaving power consumption largely unchanged – GTX 460 after all was a well-balanced part in the first place, so why change what already works?

In order to achieve the larger performance jump they’re shooting for, NVIDIA is tackling this from two sides. First of course is the enabling of previously disabled functional units – GTX 460 1GB had all 32 of its ROPs and associated hardware enabled, but only 7 of its 8 SMs enabled, leaving its geometry/shading/texturing power slightly crippled from what the GF104 chip was fully capable of. Like GF110/GTX 580, GF114/GTX 560 Ti will be a fully enabled part: all 384 CUDA Cores, 64 texture units, 8 Polymorph Engines, 32 ROPs, 512KB of L2 cache, and four 64-bit memory controllers are present, accounted for, and functional. Thus compared to GTX 460 1GB in particular, GTX 560 Ti immediately has more shading, texturing, and geometry performance than its predecessor, with roughly a 14% advantage over a similarly clocked GTX 460 1GB.

The other aspect of improving performance is improving the clockspeed. As you may recall GTX 460 was quite the charming overclocking card, as even without GPU overvolting we could routinely get 20% or more over the stock clock speed of 675MHz, to the point where NVIDIA tried to make an unofficial product out of partner cards with these lofty overclocks. For GTX 560 Ti NVIDIA has rolled these clocks into the product, with GTX 560 Ti shipping at an 822MHz core clock and 1002MHz (4008MHz data rate) memory clock. This represents a 147MHz (22%) core clock increase, and a milder 102MHz (11%) memory clock increase over the GTX 460 1GB. Couple that with the aforementioned 14% increase in SMs, and it’s clear that there’s quite a potential performance improvement for the GTX 560 Ti even though we’re still technically looking at the same GPU.
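To make the arithmetic behind those percentages explicit, here is a short Python sketch using only the figures quoted above. The helper names are hypothetical, and the combined "shader ceiling" rests on a loud assumption: that purely shader-bound work scales linearly with SM count multiplied by core clock. Real games are also limited by memory bandwidth and ROP throughput, so treat it as an upper bound rather than a prediction.

```python
def pct_gain(new, old):
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100


# Figures quoted in the text, as (GTX 560 Ti, GTX 460 1GB).
SM_COUNT = (8, 7)
CORE_MHZ = (822, 675)
MEM_MHZ = (1002, 900)


def theoretical_gains():
    """Return the per-component gains plus a combined shader-bound ceiling."""
    new_shader = SM_COUNT[0] * CORE_MHZ[0]
    old_shader = SM_COUNT[1] * CORE_MHZ[1]
    return {
        "sm_count": pct_gain(*SM_COUNT),
        "core_clock": pct_gain(*CORE_MHZ),
        "memory_clock": pct_gain(*MEM_MHZ),
        # Assumption: shader-bound throughput scales with SMs x core clock.
        "shader_ceiling": pct_gain(new_shader, old_shader),
    }


if __name__ == "__main__":
    for name, gain in theoretical_gains().items():
        print(f"{name}: +{gain:.1f}%")
```

Under that assumption the two improvements compound to roughly 39% more theoretical shader throughput, while memory bandwidth grows by only about 11%, which is one reason to expect actual gains to land somewhere in between.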

As NVIDIA is not looking to significantly move the power envelope on the GTX 560 Ti compared to the GTX 460 1GB, the TDP remains similar. NVIDIA never specifies an idle TDP, but with their transistor level changes it should be lower. Meanwhile load TDP is going up by 10W, from 160W on the GTX 460 1GB to 170W on the GTX 560 Ti. 10W shouldn’t make for a significant difference, but it does drive home the point that NVIDIA is focusing more on performance at the slight expense of power this time around. GF114 is pin compatible with GF104, so partners can drop it into existing GTX 460 designs, but those designs will need to be able to handle the extra power draw and heat. NVIDIA’s own reference design has been bulked up some, as we’ll see when we dissect the card.

The GTX 560 Ti will be launching at $249, roughly $20 higher than where the GTX 460 1GB started out but still targeted towards the same 1920x1200/1920x1080 resolution user base. Furthermore NVIDIA’s product stack will be shifting in response to the GTX 560 Ti. GTX 460 1GB is officially being moved down to make room for the GTX 560 Ti, and while NVIDIA isn’t providing MSRPs for it, the GTX 460 1GB can be found for as little as $150 after rebates right now – though this is largely a consequence of pricing wars with the AMD 6800 series rather than NVIDIA’s doing. Filling this nearly $100 gap for now will be factory overclocked GTX 460 1GBs. Meanwhile between the GTX 560 and GTX 570 will be a number of factory overclocked GTX 560s launching on day 1 (reusing GTX 460 designs). The GTX 470 is still on the market (and at prices below the GTX 560 for obvious reasons), but it’s not an official part of the stack and we expect supplies to dry up in due time.

NVIDIA’s marketing focus for the GTX 560 is to pair it with Intel’s recently launched Sandy Bridge CPUs, which have inspired a wave of computer upgrades that NVIDIA would like to hitch a ride with. Compared to the GTX 460 the GTX 560 isn’t a major upgrade on its own, and as a result NVIDIA is focusing more towards people upgrading their 8000/9000/GTX200 series equipped computers. Ultimately if you’re upgrading, NVIDIA would love to sell you a $250 GPU alongside a cheaper Core i5 2500K processor.

Meanwhile over at AMD they are shuffling their lineup and launching their own two-front counter-offensive. In terms of pricing and performance the GTX 560 Ti is between the Radeon HD 6950 and Radeon HD 6870, leaving AMD with a hole to fill. AMD has chosen to launch one new product – the Radeon HD 6950 1GB – to sit right above the GTX 560 Ti at $259, and in a move similar to how NVIDIA handled the Radeon HD 6800 series launch, push factory overclocked Radeon HD 6870s to go right below the GTX 560 Ti at around $230. The net result is that the price of reference-clocked 6870s has come down nearly $30 from launch, and can now be found for as little as $200. In any case, as there’s a great deal to discuss here, please see our companion article for the full rundown on AMD’s GTX 560 Ti counter-offensive.