NVIDIA GeForce GTX 1070 Founders Edition Review & Benchmark

The GTX 1080's epochal launch all but overshadowed its cut-down counterpart – that is, until the price was unveiled. NVidia's GTX 1070 is promised at an initial $450 price-point for the Founders Edition, or an MSRP of $380 for board partner models. The GTX 1070 replaces nVidia's GTX 970 in the vertical, but promises superior performance to previous high-end models like the 980 and 980 Ti; we'll validate those claims in our testing below, following an initial architecture overview.

The GeForce GTX 1070 ($450) uses a Pascal GP104-200 chip. The architecture is identical to the GTX 1080 and its GP104-400 GPU, but cuts down on SM presence (and core count) to create a mid-range version of the new 16nm FinFET architecture. This new node from TSMC is nearly half the size of Maxwell's 28nm planar process, and switches the company over to FinFET transistors for reduced power leakage and improved performance-per-watt efficiency.

The trend is symptomatic of an industry moving toward ever-smaller devices with a greater concern for the power envelope, and has been reflected in nVidia's architectures since Fermi (GTX 400 series running notoriously hot) and AMD's since Fiji (sort of – Polaris claims to make a bigger push in this direction). On the CPU side, Intel has been driving this trend for several generations now, with its 10nm process promising to further extend mobile device endurance and transistor density.


Our review of the GTX 1070 Founders Edition graphics card will compare it vs. the GTX 1080, 980 Ti, 970 (and more) and AMD's R9 Fury X & R9 390X. We'll look at performance (FPS) benchmarks and our specialized thermal testing with endurance (clock-rate vs. time) burn-in. These tests explore major points of issue for new GPUs, and provide a closer look at real-world gaming performance. We're in Taipei, Taiwan right now for Computex, and so we've culled a few tests due to travel complications. Power draw did not make it into this review.

NVIDIA GeForce GTX 1070 vs. GTX 1080, 980 Ti, 970, & 390X [Video]

NVIDIA GeForce GTX 1070 Specs vs. 1080, 970, 980 Ti

NVIDIA Pascal vs. Maxwell Specs Comparison

| | Tesla P100 | GTX 1080 | GTX 1070 | GTX 980 Ti | GTX 980 | GTX 970 |
|---|---|---|---|---|---|---|
| GPU | GP100 Cut-Down Pascal | GP104-400 Pascal | GP104-200 Pascal | GM200 Maxwell | GM204 Maxwell | GM204 Maxwell |
| Transistor Count | 15.3B | 7.2B | 7.2B | 8B | 5.2B | 5.2B |
| Fab Process | 16nm FinFET | 16nm FinFET | 16nm FinFET | 28nm | 28nm | 28nm |
| CUDA Cores | 3584 | 2560 | 1920 | 2816 | 2048 | 1664 |
| GPCs | 6 | 4 | 3 | 6 | 4 | 4 |
| SMs | 56 | 20 | 15 | 22 | 16 | 13 |
| TPCs | 28 | 20 | 15 | - | - | - |
| TMUs | 224 | 160 | 120 | 176 | 128 | 104 |
| ROPs | 96 (?) | 64 | 64 | 96 | 64 | 56 |
| Core Clock | 1328MHz | 1607MHz | 1506MHz | 1000MHz | 1126MHz | 1050MHz |
| Boost Clock | 1480MHz | 1733MHz | 1683MHz | 1075MHz | 1216MHz | 1178MHz |
| FP32 TFLOPs | 10.6 | 9 | 6.5 | 5.63 | 5 | 3.9 |
| Memory Type | HBM2 | GDDR5X | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
| Memory Capacity | 16GB | 8GB | 8GB | 6GB | 4GB | 4GB |
| Memory Clock | ? | 10Gbps GDDR5X | 4006MHz | 7Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps |
| Memory Interface | 4096-bit | 256-bit | 256-bit | 384-bit | 256-bit | 256-bit |
| Memory Bandwidth | ? | 320.32GB/s | 256GB/s | 336GB/s | 224GB/s | 224GB/s |
| TDP | 300W | 180W | 150W | 250W | 165W | 148W |
| Power Connectors | ? | 1x 8-pin | 1x 8-pin | 1x 8-pin, 1x 6-pin | 2x 6-pin | 2x 6-pin |
| Release Date | 4Q16-1Q17 | 5/27/2016 | 6/10/2016 | 6/01/2015 | 9/18/2014 | 9/19/2014 |
| Release Price | TBD (Several thousand) | Reference: $700, MSRP: $600 | Reference: $450, MSRP: $380 | $650 | $550 | $330 |

Architecture Revisit – GP104-200 Streaming Multiprocessors, GPCs, TPCs

The GTX 1070 utilizes the same Pascal GP104 architecture as found on the GTX 1080, though the *-200 subversion (rather than *-400) does bring some changes. Those changes are mostly to core count and clock speed.

The silicon is the same and the architecture is mostly the same, but the die has been somewhat simplified on the GTX 1070 to reduce cost. The heart of the chip is still a 16nm FinFET design, which operates at slightly lower voltage and exhibits less power leakage than a planar process. Datapath optimizations are also in place for performance improvements, something we spent a few thousand words on in our 1080 review.

(Above: The GP104-400 block diagram. Remove one GPC -- that's basically GP104-200.)

(Above: SM architecture on Pascal / GP104.)

NVidia's GTX 1070 runs 15 SMs rather than the 20 SMs of GP104-400, reducing core count to 1920 and TMUs to 120 (capable of 202GT/s). The clock boosts to 1683MHz, but has OC headroom that we play with later in this review.
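For readers who want to check the numbers, here is a minimal sketch of the arithmetic behind the core count, texture fill-rate, and FP32 throughput figures. The per-SM values (128 cores and 8 TMUs) are assumptions derived from the ratios in the spec table above (1920 / 15 and 120 / 15), and the throughput formula counts one fused multiply-add as two floating-point operations per core per clock. The function names are ours, for illustration only.

```python
# Assumed per-SM resources, derived from the spec table ratios (not an official API).
CORES_PER_SM = 128
TMUS_PER_SM = 8


def cuda_cores(sm_count):
    """Total CUDA cores for a given number of SMs."""
    return sm_count * CORES_PER_SM


def tmus(sm_count):
    """Total texture mapping units for a given number of SMs."""
    return sm_count * TMUS_PER_SM


def texel_rate_gts(sm_count, clock_mhz):
    """Texture fill-rate in gigatexels per second: one texel per TMU per clock."""
    return tmus(sm_count) * clock_mhz / 1000.0


def fp32_tflops(sm_count, clock_mhz):
    """Peak FP32 throughput: cores x 2 ops (one FMA) x clock."""
    return cuda_cores(sm_count) * 2 * clock_mhz / 1e6


if __name__ == "__main__":
    # GTX 1070: 15 SMs at the 1683MHz boost clock.
    print(cuda_cores(15), tmus(15))
    print(round(texel_rate_gts(15, 1683), 2), "GT/s")
    print(round(fp32_tflops(15, 1683), 2), "TFLOPs")
```

At the 1683MHz boost clock this works out to roughly 202GT/s and about 6.5 TFLOPs, in line with the table. Actual clocks vary under load, so these are peak figures rather than sustained ones.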

Another major change from the 1080 is the GTX 1070's use of 8Gbps GDDR5 memory, rather than the new 10Gbps GDDR5X memory of the GTX 1080. This reduces the cost of the card by using a more widely available memory type.
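The bandwidth figures in the spec table follow from the per-pin data rate and the bus width. A minimal sketch of that calculation, with a function name of our own choosing:

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s.

    data_rate_gbps: effective per-pin data rate in gigabits per second.
    bus_width_bits: width of the memory interface in bits.
    Dividing by 8 converts bits to bytes.
    """
    return data_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    print("GTX 1070:", memory_bandwidth_gbs(8, 256), "GB/s")
    print("GTX 1080:", memory_bandwidth_gbs(10, 256), "GB/s")
    print("GTX 980 Ti:", memory_bandwidth_gbs(7, 384), "GB/s")
```

With a 256-bit interface, 8Gbps GDDR5 yields 256GB/s against 320GB/s for 10Gbps GDDR5X. The table's 320.32GB/s for the GTX 1080 implies a data rate slightly above a flat 10Gbps.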

To learn about asynchronous compute and memory subsystems on Pascal, check out our 9000-word review on the GTX 1080.

Game Test Methodology

We tested using our GPU test bench, detailed in the table below. Our thanks to supporting hardware vendors for supplying some of the test components.

The latest AMD drivers (16.15.2.1 Doom-ready) were used for testing. NVidia's unreleased press drivers were used for game (FPS) testing. Game settings were manually controlled for the DUT (device under test). All games were run at presets defined in their respective charts. We disable brand-supported technologies in games, like The Witcher 3's HairWorks and HBAO. All other game settings are defined in respective game benchmarks, which we publish separately from GPU reviews. Our test courses, for cases where we test manually, are also uploaded within that content. This allows others to replicate our results by studying our bench courses.

Windows 10-64 build 10586 was used for testing.

Each game was tested for 30 seconds in an identical scenario, then repeated three times for parity. Some games have multiple settings or APIs under test, leaving our test matrix to look something like this:

Game settings and APIs (identical for every card in the matrix):

| Test | Settings | APIs |
|---|---|---|
| Ashes | 4K Crazy, 4K High, 1080p High | Dx12 & Dx11 |
| Talos | 4K Ultra, 1440p Ultra, 1080p Ultra | Vulkan & Dx11 |
| Tomb Raider | 4K VH, 1440p VH, 1080p VH | Dx12 & Dx11 |
| Division | 4K High, 1440p High, 1080p High | - |
| GTA V | 4K VHU, 1080p VHU | - |
| MLL | 4K HH, 1440p VHH, 1080p VHH | - |
| Mordor | 4K Ultra, 1440p Ultra, 1080p Ultra | - |
| BLOPS3 | 4K High, 1440p High, 1080p High | - |

Thermal, power, and noise testing by card:

| Card | Thermal | Power | Noise |
|---|---|---|---|
| NVIDIA GTX 1080 | Yes | Yes | Yes |
| NVIDIA GTX 980 Ti | Yes | Yes | Yes |
| NVIDIA GTX 980 | Yes | Yes | Yes |
| AMD R9 390X | Yes | Yes | No |
| AMD Fury X | Yes | Yes | Yes |

Average FPS, 1% low, and 0.1% low values are measured. We do not measure maximum or minimum FPS results as we consider these numbers to be pure outliers. Instead, we take an average of the lowest 1% of results (1% low) to show real-world, noticeable dips; we then take an average of the lowest 0.1% of results (0.1% low) to capture severe spikes.
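To make the three metrics concrete, here is a minimal sketch of one way to compute them from a log of per-frame render times. This is an illustration, not our in-house tooling: the input format (frametimes in milliseconds), the function names, and the choice to convert each frame to an instantaneous FPS value before averaging the slowest slice are all assumptions.

```python
def fps_metrics(frametimes_ms):
    """Return average FPS, 1% low, and 0.1% low for one benchmark pass.

    frametimes_ms: list of per-frame render times in milliseconds (assumed format).
    """
    if not frametimes_ms:
        raise ValueError("no frames logged")

    # Average FPS: total frames divided by total elapsed seconds.
    total_seconds = sum(frametimes_ms) / 1000.0
    average = len(frametimes_ms) / total_seconds

    # Per-frame FPS, sorted ascending so the slowest frames come first.
    per_frame_fps = sorted(1000.0 / ft for ft in frametimes_ms)

    def low(divisor):
        # Average of the slowest 1/divisor of frames (at least one frame).
        count = max(1, len(per_frame_fps) // divisor)
        return sum(per_frame_fps[:count]) / count

    return {"avg": average, "low_1pct": low(100), "low_0_1pct": low(1000)}


def average_passes(passes):
    """Average each metric across repeated passes of the same test."""
    return {key: sum(p[key] for p in passes) / len(passes) for key in passes[0]}


if __name__ == "__main__":
    # Synthetic pass: mostly 10ms frames with a handful of 20ms dips.
    sample = [10] * 990 + [20] * 10
    print(fps_metrics(sample))
```

In the synthetic pass, the average stays near 99 FPS while both low metrics fall to 50 FPS, which is the kind of dip an average alone would hide.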

For Dx12 and Vulkan API testing, we use built-in benchmark tools and rely upon log generation for our metrics. That data is reported at the engine level.

Video Cards Tested

Thermal Test Methodology

We strongly believe that our thermal testing methodology is the best on this side of the tech-media industry. We've validated our testing methodology with thermal chambers and have proven near-perfect accuracy of results.

Conducting thermal tests requires careful measurement of temperatures in the surrounding environment. We control for ambient by constantly measuring temperatures with K-Type thermocouples and infrared readers. We then produce charts using a Delta T(emperature) over Ambient value. This value subtracts the thermo-logged ambient value from the measured diode temperatures, producing a delta report of thermals. AIDA64 is used for logging thermals of silicon components, including the GPU diode. We additionally log core utilization and frequencies to ensure all components are firing as expected. Voltage levels are measured in addition to fan speeds, frequencies, and thermals. GPU-Z is deployed for redundancy and validation against AIDA64.

All open bench fans are configured to their maximum speed and connected straight to the PSU. This ensures minimal variance when testing, as automatically controlled fan speeds would reduce the reliability of benchmarking. The CPU fan is set to use a custom fan curve that was devised in-house after a series of tests. We use a custom-built open air bench that mounts the CPU radiator out of the way of the airflow channels influencing the GPU, so the CPU heat is dumped where it will have no measurable impact on GPU temperatures.

We use an AMPROBE multi-diode thermocouple reader to log ambient actively. This ambient measurement is used to monitor fluctuations and is subtracted from absolute GPU diode readings to produce a delta value. For these tests, we configured the thermocouple reader's logging interval to 1s, matching the logging interval of GPU-Z and AIDA64. Data is calculated using a custom, in-house spreadsheet and software solution.
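The delta-over-ambient calculation described above can be sketched in a few lines. This is an illustration under assumptions, not our spreadsheet or software: it assumes both logs have already been parsed into dictionaries keyed by timestamp in whole seconds (matching the 1s logging interval), and the function name is hypothetical.

```python
def delta_over_ambient(gpu_log, ambient_log):
    """Subtract ambient from GPU diode temperature at each shared timestamp.

    gpu_log, ambient_log: dicts mapping timestamp (whole seconds) to degrees C.
    Timestamps present in only one log are skipped rather than interpolated.
    """
    return {
        t: round(gpu_log[t] - ambient_log[t], 2)
        for t in sorted(gpu_log)
        if t in ambient_log
    }


if __name__ == "__main__":
    gpu = {0: 60.0, 1: 61.5, 2: 62.0}
    ambient = {0: 22.0, 1: 22.5, 3: 23.0}
    print(delta_over_ambient(gpu, ambient))
```

Skipping unmatched timestamps keeps the example simple; a real pipeline might interpolate or flag gaps instead.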

Endurance tests are conducted for new architectures or devices of particular interest, like the GTX 1080, R9 Fury X, or GTX 980 Ti Hybrid from EVGA. These endurance tests report temperature versus frequency (sometimes versus FPS), providing a look at how cards interact in real-world gaming scenarios over extended periods of time. Because benchmarks do not inherently burn-in a card for a reasonable play period, we use this test method as a net to isolate and discover issues of thermal throttling or frequency tolerance to temperature.
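As a rough illustration of how an endurance log can be scanned for throttling, the sketch below walks time-ordered samples and reports the first point where the core clock falls meaningfully below an expected value. The sample format, the function name, and the 50MHz tolerance are hypothetical choices for the example, not values from our test suite.

```python
def first_clock_drop(samples, expected_mhz, tolerance_mhz=50):
    """Find the first sample where the clock dips below expected - tolerance.

    samples: iterable of (seconds, temperature_c, clock_mhz) tuples in time order.
    Returns the matching tuple, or None if the clock held throughout.
    """
    floor = expected_mhz - tolerance_mhz
    for seconds, temperature_c, clock_mhz in samples:
        if clock_mhz < floor:
            return (seconds, temperature_c, clock_mhz)
    return None


if __name__ == "__main__":
    log = [
        (0, 45.0, 1683),
        (300, 78.0, 1670),
        (600, 83.0, 1607),  # clock falls as temperature climbs
        (900, 83.0, 1620),
    ]
    print(first_clock_drop(log, expected_mhz=1683))
```

Pairing the returned temperature with the clock value shows where in the thermal curve the frequency began to give way.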

Our test starts with a two-minute idle period to gauge non-gaming performance. A script then automatically triggers a GPU-intensive benchmark, MSI Kombustor – Titan Lakes, which runs for 1080 seconds (18 minutes). Because we use an in-house script, we are able to execute and align our tests precisely between passes.
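The timing of that sequence can be sketched as follows. This is not our in-house script: the function names are hypothetical, and the benchmark launch and shutdown are passed in as callables so the sketch makes no claims about how Kombustor is actually invoked.

```python
import time

IDLE_SECONDS = 120    # two-minute idle period
LOAD_SECONDS = 1080   # benchmark run time


def run_endurance(start_load, stop_load, sleep=time.sleep):
    """Idle, start the load, hold it for the full run, then stop it.

    start_load / stop_load: caller-supplied callables (hypothetical hooks).
    sleep: injectable for dry runs; defaults to time.sleep.
    """
    sleep(IDLE_SECONDS)
    start_load()
    sleep(LOAD_SECONDS)
    stop_load()


def phase_at(seconds):
    """Label a log timestamp as 'idle' or 'load'; None if outside the test."""
    if seconds < 0 or seconds >= IDLE_SECONDS + LOAD_SECONDS:
        return None
    return "idle" if seconds < IDLE_SECONDS else "load"


if __name__ == "__main__":
    # Dry run: record the sequence instead of actually waiting 20 minutes.
    events = []
    run_endurance(
        start_load=lambda: events.append("start"),
        stop_load=lambda: events.append("stop"),
        sleep=events.append,
    )
    print(events)
```

Fixed phase lengths mean every pass produces logs with the same timeline, so a given timestamp always falls in the same phase when comparing cards.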