It has taken about 2 years longer than we’d normally see, but the next full generation of GPUs is finally upon us. Powered by FinFET based nodes at TSMC and GlobalFoundries, both NVIDIA and AMD have released new GPUs with new architectures built on new manufacturing nodes. AMD and NVIDIA did an amazing job making the best of 28nm over its 4 year stretch, but now at long last true renewal is at hand for the discrete GPU market.

Back in May we took a first look at the first of these cards, NVIDIA’s GeForce GTX 1080 Founders Edition. Launched at $700, it was immediately the flagship for the FinFET generation. Now today, at long (long) last, we will be taking a complete, in-depth look at the GTX 1080 Founders Edition and its sibling the GTX 1070 Founders Edition. Architecture, overclocking, more architecture, new memory technologies, new features, and of course copious benchmarks. So let’s get started on this belated look at the latest generation of GPUs and video cards from NVIDIA.

NVIDIA GPU Specification Comparison

| | GTX 1080 | GTX 1070 | GTX 980 | GTX 970 |
|---|---|---|---|---|
| CUDA Cores | 2560 | 1920 | 2048 | 1664 |
| Texture Units | 160 | 120 | 128 | 104 |
| ROPs | 64 | 64 | 64 | 56 |
| Core Clock | 1607MHz | 1506MHz | 1126MHz | 1050MHz |
| Boost Clock | 1733MHz | 1683MHz | 1216MHz | 1178MHz |
| Memory Clock | 10Gbps GDDR5X | 8Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps GDDR5 |
| Memory Bus Width | 256-bit | 256-bit | 256-bit | 256-bit |
| VRAM | 8GB | 8GB | 4GB | 4GB |
| FP64 | 1/32 | 1/32 | 1/32 | 1/32 |
| TDP | 180W | 150W | 165W | 145W |
| GPU | GP104 | GP104 | GM204 | GM204 |
| Transistor Count | 7.2B | 7.2B | 5.2B | 5.2B |
| Manufacturing Process | TSMC 16nm | TSMC 16nm | TSMC 28nm | TSMC 28nm |
| Launch Date | 05/27/2016 | 06/10/2016 | 09/18/2014 | 09/18/2014 |
| Launch Price | MSRP: $599, Founders: $699 | MSRP: $379, Founders: $449 | $549 | $329 |

As a quick refresher, here are the specifications for the new cards. At a high level the Pascal architecture (as implemented in GP104) is a mix of old and new; it’s not a revolution, but it’s an important refinement. Maxwell as an architecture was very successful for NVIDIA both at the consumer level and the professional level, and for the consumer iterations of Pascal, NVIDIA has not made any radical changes. The basic throughput of the architecture has not changed – the ALUs, texture units, ROPs, and caches all perform similarly to how they did in GM2xx.

Consequently the performance aspects of consumer Pascal – we’ll ignore GP100 for the moment – are pretty easy to understand. NVIDIA’s focus on this generation has been on pouring on the clockspeed to push total compute throughput to 8.9 TFLOPs, and updating their memory subsystem to feed the beast that is GP104.
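
For readers who want to see where the 8.9 TFLOPs figure comes from, here is a minimal sketch in Python. It assumes the usual convention that each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock; the function and constant names are ours, purely for illustration.

```python
# Assumption: one FMA (2 FLOPs) per CUDA core per clock, the usual
# convention for quoting theoretical peak FP32 throughput.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6 / 1e12


# GTX 1080: 2560 CUDA cores at the 1733MHz boost clock from the spec table.
gtx1080_tflops = peak_fp32_tflops(2560, 1733)
print(f"GTX 1080: {gtx1080_tflops:.1f} TFLOPS")
```

This is a paper number at the official boost clock; actual clockspeeds in games vary from card to card.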

GeForce GTX 1080

The GeForce GTX 1080 is a fully enabled implementation of GP104. This means 2560 CUDA cores split up over 20 SMs operating at a blistering boost clock of 1733MHz. NVIDIA is positioning GTX 1080 as a full generational update over GTX 980, and thanks to a combination of a slightly wider GPU and a much faster clockspeed, they can generally deliver on this. By the numbers, GTX 1080 offers 78% more raw compute, texturing, and geometry performance, and 43% more ROP throughput. Of course the latter is as much a product of memory bandwidth as it is the ROPs themselves, and for that NVIDIA has some new memory technologies.
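
The generational percentages above can be reproduced from the spec table. The sketch below assumes that shader throughput scales with CUDA cores × boost clock and ROP throughput with ROPs × boost clock; the dictionary layout and helper name are illustrative only.

```python
def percent_gain(new: float, old: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new / old - 1) * 100


# Values taken from the specification table.
gtx1080 = {"cores": 2560, "rops": 64, "boost_mhz": 1733}
gtx980 = {"cores": 2048, "rops": 64, "boost_mhz": 1216}

# Assumption: throughput scales linearly with unit count times boost clock.
shader_gain = percent_gain(gtx1080["cores"] * gtx1080["boost_mhz"],
                           gtx980["cores"] * gtx980["boost_mhz"])
rop_gain = percent_gain(gtx1080["rops"] * gtx1080["boost_mhz"],
                        gtx980["rops"] * gtx980["boost_mhz"])

print(f"Shader/texture/geometry: +{shader_gain:.0f}%")
print(f"ROP throughput: +{rop_gain:.0f}%")
```

Since both cards have 64 ROPs, the ROP gain is purely the ratio of the boost clocks.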

Feeding the beast that is GTX 1080 is 8GB of GDDR5X. A new memory standard that extends the effective memory bandwidth of GDDR5, GTX 1080’s GDDR5X runs at 10Gbps, and is attached to a 256-bit memory bus. This gives GTX 1080 a full 320GB/sec of memory bandwidth to play with, 43% more than GTX 980. And as we’ll see in the coming architectural pages, these raw numbers don’t factor in the architectural improvements that allow the Pascal GPUs to stretch their memory bandwidth even further.
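
The bandwidth math is simple enough to sketch: per-pin data rate times bus width, divided by 8 to convert bits to bytes. The helper name below is ours.

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/sec from per-pin data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8


gtx1080_bw = memory_bandwidth_gbs(10, 256)  # 10Gbps GDDR5X, 256-bit bus
gtx980_bw = memory_bandwidth_gbs(7, 256)    # 7Gbps GDDR5, 256-bit bus
gain = (gtx1080_bw / gtx980_bw - 1) * 100

print(f"GTX 1080: {gtx1080_bw:.0f}GB/sec, GTX 980: {gtx980_bw:.0f}GB/sec")
print(f"Increase: {gain:.0f}%")
```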

Finally, GTX 1080’s TDP is rated at 180W. This is a slight increase from the past generation, where GTX 980 required 165W. Video card specifications are of course a sliding scale – balancing desired performance with cooling capabilities and power consumption – and ultimately NVIDIA has opted to eat a slight increase in power consumption to allow GTX 1080 to deliver more performance than it otherwise would.

GeForce GTX 1070

Meanwhile below the GTX 1080 we have its lower price and lower performance sibling, the GTX 1070. The standard high-end salvage part, GTX 1070 trades off fewer functional blocks and the lower resulting performance in exchange for a significantly lower price than the GTX 1080. From a hardware perspective, the GTX 1070 utilizes GP104 with 1 of the 4 Graphics Processing Clusters (GPCs) disabled. Relative to GTX 1080, this knocks off around 25% of the shading/texturing/compute performance. However the memory controllers and ROP partitions remain untouched. With this configuration NVIDIA is pitching the GTX 1070 as a full generational update to the GTX 970, and with any luck, the GTX 1070 will be as well accepted as its extremely successful predecessor.

All told then, GTX 1070 provides 1920 CUDA cores split up over 15 SMs. Those 15 SMs are in turn running at a base clockspeed of 1506MHz and a boost clock of 1683MHz. This is slightly lower than GTX 1080, but as we’ll see in our full benchmark section, the official clockspeeds have very little impact; it’s the disabled GPC that really makes the difference. By the numbers, relative to the GTX 970 the GTX 1070 offers 65% more shading, texturing, and geometry throughput, and 63% more ROP throughput, the latter coming courtesy of both the higher clockspeeds and the fact that GTX 1070 ships with all 64 ROPs enabled, versus 56 of 64 on GTX 970.
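
As a rough sketch of how these figures fall out of the GPC configuration, the snippet below derives the GTX 1070's core count from a fully enabled GP104 and then compares it to the GTX 970. It assumes GP104's 20 SMs are split evenly across its 4 GPCs and that throughput scales with unit count × boost clock; the variable names are illustrative.

```python
# Fully enabled GP104, per the GTX 1080 specifications.
FULL_GP104_CORES = 2560
FULL_GP104_SMS = 20
GPCS = 4

cores_per_sm = FULL_GP104_CORES // FULL_GP104_SMS  # CUDA cores in each SM
sms_per_gpc = FULL_GP104_SMS // GPCS               # assumes an even split

# GTX 1070: one of the four GPCs disabled.
gtx1070_sms = sms_per_gpc * (GPCS - 1)
gtx1070_cores = gtx1070_sms * cores_per_sm

# Gains over GTX 970 (1664 cores, 56 ROPs, 1178MHz boost), in percent.
shader_gain = (gtx1070_cores * 1683 / (1664 * 1178) - 1) * 100
rop_gain = (64 * 1683 / (56 * 1178) - 1) * 100

print(f"GTX 1070: {gtx1070_sms} SMs, {gtx1070_cores} CUDA cores")
print(f"vs. GTX 970: +{shader_gain:.0f}% shading, +{rop_gain:.0f}% ROP")
```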

As for memory, GTX 1070 doesn’t get GDDR5X. Instead the card gets 8GB of GDDR5 running at 8Gbps. This delivers a total memory bandwidth of 256GB/sec, and again unlike GTX 970, there is nothing going on with partitions here, so all of that memory and all of that bandwidth is operating in one contiguous partition, giving the GTX 1070 an effective memory bandwidth increase of 31%. GTX 1070 is the first NVIDIA card to ship with 8Gbps GDDR5, a memory speed I once didn’t think possible. NVIDIA and the memory partners are pushing GDDR5 to the limit by doing this, but at this point in time this is the most economical way to boost memory bandwidth without resorting to more exotic and expensive solutions like GDDR5X.
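
The 31% figure only works out if the GTX 970 is measured by its fast memory segment rather than its full 256-bit bus. The sketch below assumes that segment spans 7 of the card's 8 32-bit memory controllers (224 bits); that detail comes from our earlier GTX 970 coverage rather than the table here, and the helper name is ours.

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/sec from per-pin data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8


gtx1070_bw = memory_bandwidth_gbs(8, 256)  # 8Gbps GDDR5, full 256-bit bus

# Assumption: GTX 970's fast segment uses 7 of 8 32-bit controllers.
gtx970_fast_bus_bits = 7 * 32
gtx970_effective_bw = memory_bandwidth_gbs(7, gtx970_fast_bus_bits)

gain = (gtx1070_bw / gtx970_effective_bw - 1) * 100
print(f"GTX 1070: {gtx1070_bw:.0f}GB/sec")
print(f"GTX 970 (fast segment): {gtx970_effective_bw:.0f}GB/sec")
print(f"Effective increase: {gain:.0f}%")
```

Measured against the GTX 970's nominal 224GB/sec instead, the increase would be closer to 14%.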

GTX 1070 is rated for a 150W TDP; this is a smaller, 5W increase over its predecessor. Despite the official TDP, it should be noted that NVIDIA is not pitching this card as their 150W champion for systems with a single 6-pin PCIe power cable; it requires a more powerful 8-pin cable. For systems that need a true sub-150W card, this is where the GTX 1060 will step in. Otherwise NVIDIA is making a very interesting power play here: what is now the second most powerful video card on the market does so on just 150W.

Cards, Pricing, & Availability

For the GTX 1000 series, NVIDIA has undertaken a significant change in how they handle reference boards and how those boards are priced. What were once reference boards are now being released as the Founders Edition boards. These boards are largely similar to NVIDIA’s last-generation reference boards, built using a standard PCB and NVIDIA’s high-end blower cooler, along with some additional cooling upgrades. The Founders Edition cards will, in turn, not be sold at NVIDIA’s general MSRP for each family, but rather they will be sold as premium cards for around $70-$100 more.

As a result we have two prices to talk about. For the GTX 1080, the family MSRP is $599. At the base level this is a slight price increase over the GTX 980, which launched at $549. As the Founders Edition cards are not being sold at this price, that price point is instead being filled by semi and fully custom cards from NVIDIA’s partners. These custom cards offer a mix of designs, but at the cheapest level (those cards closest to the MSRP) we’re predominantly looking at dual fan open air cooled cards. The rest of the lineup is filled by more advanced cards (including some closed loop liquid coolers) with factory overclocks and other features that are sold at a premium price. The GTX 1080 Founders Edition card, for its part, fits into this picture at $699, a $100 premium.

GeForce GTX 1080 Configurations

| | Base | Founders Edition |
|---|---|---|
| Core Clock | 1607MHz | 1607MHz |
| Boost Clock | 1733MHz | 1733MHz |
| Memory Clock | 10Gbps GDDR5X | 10Gbps GDDR5X |
| Cooler | Manufacturer Custom (Typical: 2 or 3 Fan Open Air) | NVIDIA Reference (Blower w/Vapor Chamber) |
| Price | Starting at $599 | $699 |

The story then is much the same for the GTX 1070. Its family MSRP is $379, while its Founders Edition counterpart is being sold for $449. At $379 for the family MSRP, this is a $50 price increase over the GTX 970, and I am curious whether over the long run this will significantly impact sales. One of the factors that made GTX 970 such a well-received card was its price, and this takes away from that by a bit. Otherwise, as with the GTX 1080, the partners’ custom cards for the GTX 1070 run the gamut from simple dual fan cards at the cheapest prices, up to premium, factory overclocked cards at the highest prices.

GeForce GTX 1070 Configurations

| | Base | Founders Edition |
|---|---|---|
| Core Clock | 1506MHz | 1506MHz |
| Boost Clock | 1683MHz | 1683MHz |
| Memory Clock | 8Gbps GDDR5 | 8Gbps GDDR5 |
| Cooler | Manufacturer Custom (Typical: 2 or 3 Fan Open Air) | NVIDIA Reference (Blower w/Vapor Chamber) |
| Price | Starting at $379 | $449 |

Unfortunately for everyone involved, the plan for pricing and reality haven’t quite agreed with each other. Even now, 2 months after the launch of the GTX 1080, card supplies are slim. There is effectively a shortage of GTX 1080 cards, as while NVIDIA insists they are continuing to ship out a good supply, those cards appear to be getting plucked off of virtual and physical shelves almost as quickly. As of the time this paragraph was written, Newegg only has a single GTX 1080 in stock, a Founders Edition card at $699.

For the last several generations it has been pretty common for the first batch or two of high-end cards to sell out; however, to be sold out for 2 months is a lot less common. Other than NVIDIA’s Titan series cards, which are a special case due to their prosumer market, I can’t immediately recall the last time an NVIDIA flagship card was this hard to get this late after a launch. For NVIDIA and its partners there are worse problems in the world – it’s better to have too few cards than too many cards that you can’t sell – but it certainly puts a damper on things for both the partners and for customers.

Meanwhile the GTX 1070 situation is noticeably better, though still not great. About half of the models that Newegg carries are in stock at any given time. So potential GTX 1070 owners have more options, though if they’re after a specific card they may find themselves waiting.

But the real problem with this shortage is that it has removed any incentive to keep prices close to NVIDIA’s MSRP. GTX 1070 prices start at $429 instead of $379, while GTX 1080 prices start at $649 (and if you actually want a card in stock, that’ll be $699). These prices are closer to last generation’s GTX 980 Ti/980 prices than they are to 980/970 prices, and it means that the actual GTX 1000 series price premium is much higher as it stands, at $100+ compared to the last generation. Given that these cards keep selling out, clearly there are enough buyers willing to pay these prices – it’s the free market in action – but it means NVIDIA’s MSRPs are for the moment an imaginary number. At this point all that we can do is hope that once the shortage breaks, there will be more intense competition between the partners and retailers, and prices will fall to MSRP.
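
To put numbers on that gap, here is a quick sketch that tallies the current starting street prices against both NVIDIA's MSRPs and the launch prices of the previous generation. All of the prices are the ones quoted in this article; the data layout is ours and purely illustrative.

```python
# (MSRP, current starting street price, predecessor launch price) in USD.
cards = {
    "GTX 1080": {"msrp": 599, "street": 649, "predecessor": 549},
    "GTX 1070": {"msrp": 379, "street": 429, "predecessor": 329},
}

premiums = {}
for name, p in cards.items():
    premiums[name] = {
        "over_msrp": p["street"] - p["msrp"],
        "over_last_gen": p["street"] - p["predecessor"],
    }
    print(f"{name}: ${premiums[name]['over_msrp']} over MSRP, "
          f"${premiums[name]['over_last_gen']} over its predecessor")
```

Both cards currently start $50 over MSRP, which is what pushes the generation-over-generation increase to $100.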

As for the larger competitive landscape, as we’re looking at high-end cards at the start of a new generation, there really isn’t any competition to speak of. The GTX 1000 series sets a new bar for performance, and while last generation cards are being priced to clear out inventories, they aren’t performance competitive with the new cards. Meanwhile stalwart competitor AMD has opted to go after the mainstream market first rather than starting at the high-end. This means that the GTX 1080 and GTX 1070 will not have any competition for at least the next few months, leaving NVIDIA solely in the driver’s seat at the high-end, and in sole possession of the GPU performance crown.