NVIDIA GTX 1060 Review & Benchmark vs. RX 480 (Ft. MSI Gaming X)

Our thermal benchmarking has expanded to the point that the tests form our most comprehensive section of any review. For this content, we dig deep into endurance testing with nVidia's just-launched GeForce GTX 1060 Founders Edition card, comparing it to the MSI GTX 1060 Gaming X. The validation testing yields interesting results, particularly with regard to potential throttle points and dips in clock-rate. More on that in a bit.

Today marks the launch of the GTX 1060 ($250-$300), announced about ten days ago. The GTX 1060 fills the mid-range of the market as a 6GB solution built on the GP106 GPU, using the 16nm FinFET process node that debuted with Pascal. Our GTX 1060 Founders Edition & MSI 1060 Gaming X review looks at FPS (particularly vs. the 1070 and RX 480), Vulkan & Dx12 performance, thermals, noise, power, and overclocking results.

NVIDIA GeForce GTX 1060 Specs vs. GTX 1070, GTX 1080, GTX 960

NVIDIA Pascal vs. Maxwell Specs Comparison

| | GTX 1080 | GTX 1070 | GTX 1060 | GTX 980 Ti | GTX 980 | GTX 960 |
|---|---|---|---|---|---|---|
| GPU | GP104-400 Pascal | GP104-200 Pascal | GP106 Pascal | GM200 Maxwell | GM204 Maxwell | GM204 |
| Transistor Count | 7.2B | 7.2B | 4.4B | 8B | 5.2B | 2.94B |
| Fab Process | 16nm FinFET | 16nm FinFET | 16nm FinFET | 28nm | 28nm | 28nm |
| CUDA Cores | 2560 | 1920 | 1280 | 2816 | 2048 | 1024 |
| GPCs | 4 | 3 | 2 | 6 | 4 | 2 |
| SMs | 20 | 15 | 10 | 22 | 16 | 8 |
| TPCs | 20 | 15 | 10 | - | - | - |
| TMUs | 160 | 120 | 80 | 176 | 128 | 64 |
| ROPs | 64 | 64 | 48 | 96 | 64 | 32 |
| Core Clock | 1607MHz | 1506MHz | 1506MHz | 1000MHz | 1126MHz | 1126MHz |
| Boost Clock | 1733MHz | 1683MHz | 1708MHz | 1075MHz | 1216MHz | 1178MHz |
| FP32 TFLOPs | 9TFLOPs | 6.5TFLOPs | 3.85TFLOPs | 5.63TFLOPs | 5TFLOPs | 2.4TFLOPs |
| Memory Type | GDDR5X | GDDR5 | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
| Memory Capacity | 8GB | 8GB | 6GB | 6GB | 4GB | 2GB, 4GB |
| Memory Clock | 10Gbps GDDR5X | 4006MHz | 8Gbps | 7Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps |
| Memory Interface | 256-bit | 256-bit | 192-bit | 384-bit | 256-bit | 128-bit |
| Memory Bandwidth | 320.32GB/s | 256GB/s | 192GB/s | 336GB/s | 224GB/s | 115GB/s |
| TDP | 180W | 150W | 120W | 250W | 165W | 120W |
| Power Connectors | 1x 8-pin | 1x 8-pin | 1x 6-pin | 1x 8-pin, 1x 6-pin | 2x 6-pin | 1x 6-pin |
| Release Date | 5/27/2016 | 6/10/2016 | 7/19/2016 | 6/01/2015 | 9/18/2014 | 01/22/15 |
| Release Price | Reference: $700; MSRP: $600 | Reference: $450; MSRP: $380 | Reference: $300; MSRP: $250 | $650 | $550 | $200 |
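The FP32 TFLOPs figures in the specs comparison can be sanity-checked from core count and clock-rate. The sketch below assumes the usual peak-throughput convention of one fused multiply-add (two floating-point operations) per CUDA core per clock; the function name is ours, for illustration only, and the inputs are copied from the specs comparison.

```python
# Theoretical peak FP32 throughput from CUDA core count and clock.
# Assumption: each core retires one FMA (2 FLOPs) per clock.
# fp32_tflops is an illustrative helper name, not from any vendor tool.

def fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Return theoretical peak single-precision TFLOPs at the given clock."""
    flops_per_core_per_clock = 2  # one FMA = one multiply + one add
    return cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


# name: (CUDA cores, base clock MHz, boost clock MHz), from the specs comparison
cards = {
    "GTX 1080": (2560, 1607, 1733),
    "GTX 1070": (1920, 1506, 1683),
    "GTX 1060": (1280, 1506, 1708),
}

for name, (cores, base, boost) in cards.items():
    print(
        f"{name}: {fp32_tflops(cores, base):.2f} TFLOPs at base, "
        f"{fp32_tflops(cores, boost):.2f} TFLOPs at boost"
    )
```

By this arithmetic, the GTX 1060's listed 3.85 TFLOPs lines up with its 1506MHz base clock, while the listed GTX 1080 and GTX 1070 figures sit closer to their boost-clock results.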

Known GTX 1060 Models & Prices

Below is a list of known vendors, card models, and prices. We are also aware that a few other vendors, like Colorful, will be shipping models shortly.

Pascal has thus far struggled to maintain both availability and AIB partner prices within the suggested range. As we're writing this review prior to the GTX 1060's public launch, we are not yet sure whether the above product listing will be immediately available at listed prices. This launch is supposed to be accompanied by immediate availability, though.

Previous Pascal Architecture & Review Content

Architecture Revisit (Again) – GP106 Streaming Multiprocessors, GPCs, TPCs

Because this is now the fourth Pascal chip that we've written about, we won't be going as deep on the architecture as in previous content. For the P100 Accelerator (and the introduction of Pascal), check this deep-dive with several pages of architectural exploration. To catch up on Pascal as it pertains to gaming cards (GeForce GTX devices), view the first page of our GTX 1080 Founders Edition review.

The GTX 1060 uses a different graphics processor from what we saw with the GTX 1080 and 1070. nVidia's 1060 deploys GP106, a slimmed-down Pascal variant that follows the 1070's GP104-200 and 1080's GP104-400. The block diagram tells most of the story on its own. Starting with the GTX 1080 block diagram:

The GTX 1080 GP104-400 GPU runs four GPCs and 20 SMs, equating to 2560 CUDA cores. Similar to Maxwell's big GM204 chip, GP104 partitions its instruction cache into two effective shared caches, each owning a dedicated instruction buffer, a dedicated warp scheduler, and two dedicated dispatch units (also, as a note, a dedicated register file of 16,384 x 32-bit per SM). PolyMorph Engine 4.0 sits on top of everything and can be canonically visualized as resting between the raster engine and the SM.

Let's put that into perspective with the biggest Pascal chip presently known, the GP100:

GP100 can host up to 60 SMs and 3840 FP32 CUDA cores, with a 2:1 ratio of FP32:FP64. GP100 has six GPC divisions of ten SMs each, with every pair of SMs sharing a single TPC. The Tesla P100 will utilize 56 of those SMs, for 3584 FP32 CUDA cores. A card utilizing the full 60 SM count is not yet officially known. This is not a gaming chip, but it did debut Pascal.

If it's not already obvious, the biggest move for GP106 has been to cut the GPC count down to two. GPCs (Graphics Processing Clusters, explained in our P100 article) are effectively containers for the TPCs and raster engines.

Here's the GTX 1060 GP106 block diagram:

GP106 follows all the existing rules and divisions of Pascal, so we see a return of the 128-core streaming multiprocessor (SM) and a return of the 5 SM allocation per GPC. This lands us at 1280 CUDA Cores on the GTX 1060's GP106 GPU, or 128 cores * 10 SMs. Like the other GTX chips, GP106 dedicates itself to FP32 single precision compute, leaving double precision FP64 CUDA Cores for the science-class GPUs.

Also like the rest of the GeForce Pascal architecture, GP106 runs 8 TMUs per SM, yielding 80 TMUs on the GTX 1060 (alongside 48 ROPs).
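The per-SM arithmetic above carries across the GeForce Pascal stack, so the headline unit counts fall out of the SM count alone. Here's a minimal sketch of that relationship, using the GPC and SM counts for the three GeForce cards discussed in this review; the class and property names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

CORES_PER_SM = 128  # FP32 CUDA cores per SM on GeForce Pascal (per the text)
TMUS_PER_SM = 8     # texture mapping units per SM (per the text)


@dataclass(frozen=True)
class PascalConfig:
    """Illustrative container for a GeForce Pascal GPU layout."""
    name: str
    gpcs: int
    sms: int

    @property
    def cuda_cores(self) -> int:
        return self.sms * CORES_PER_SM

    @property
    def tmus(self) -> int:
        return self.sms * TMUS_PER_SM

    @property
    def sms_per_gpc(self) -> int:
        return self.sms // self.gpcs


configs = [
    PascalConfig("GTX 1080 (GP104-400)", gpcs=4, sms=20),
    PascalConfig("GTX 1070 (GP104-200)", gpcs=3, sms=15),
    PascalConfig("GTX 1060 (GP106)", gpcs=2, sms=10),
]

for c in configs:
    print(f"{c.name}: {c.sms_per_gpc} SMs/GPC, "
          f"{c.cuda_cores} CUDA cores, {c.tmus} TMUs")
```

ROPs are left out on purpose: they don't scale with SM count in this way (the GTX 1080 and GTX 1070 both carry 64 ROPs despite different SM counts), so they can't be derived from the SM count.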

The GTX 1060's memory subsystem features the same algorithmic approach to compression as previous nVidia GPUs, including Maxwell's earlier evolution. Color compression and datapath optimizations are popular with nVidia and AMD right now (see: RX 480 review), leveraged as a means to reduce memory power consumption by a non-trivial amount; AMD and nVidia both report upwards of a 40% reduction in memory power per bit transacted, depending on the level of compression possible within a given scene.

Memory capacity on the GTX 1060 is a hard 6GB (only one SKU presently exists) and operates at 8Gbps. There is presently no 3GB SKU, nor any indication that one may legitimately exist. Memory bandwidth can sustain 192GB/s (192-bit / 8 [bits to bytes] * 2000MHz [memory clock] * 2 [DDR] * 2 [GDDR5] = 192GB/s) on the 192-bit wide interface. The GTX 1060 sticks with GDDR5 memory to help manage cost, and because GDDR5X/HBM would offer no meaningful performance benefit in the face of the weaker chip. Six 1GB dies are present on the GTX 1060 PCB, which is a truncated 6.75” long (full card length: 9.5”).
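The bandwidth formula in the paragraph above is easy to reproduce. This sketch follows the same steps (bus width to bytes, memory clock to effective per-pin data rate) and applies them to a few cards from the specs comparison. The function names are ours, and the 2000MHz memory clock is the figure used in the formula above.

```python
# Peak memory bandwidth from bus width and per-pin data rate.
# Function names are illustrative; inputs come from the article's figures.

def gddr5_data_rate_gbps(memory_clock_mhz: float) -> float:
    """Effective per-pin data rate: clock * 2 [DDR] * 2 [GDDR5], in Gbps."""
    return memory_clock_mhz * 2 * 2 / 1000


def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# GTX 1060: 192-bit bus, 2000MHz memory clock -> 8Gbps effective
rate = gddr5_data_rate_gbps(2000)
print(f"GTX 1060: {rate:.0f}Gbps, {bandwidth_gb_s(192, rate):.0f}GB/s")

# Other cards, using the bus widths and data rates from the specs comparison
for name, bus_bits, gbps in [
    ("GTX 1070", 256, 8),
    ("GTX 980 Ti", 384, 7),
    ("GTX 980", 256, 7),
]:
    print(f"{name}: {bandwidth_gb_s(bus_bits, gbps):.0f}GB/s")
```

The same two inputs explain why the GTX 1060 lands below the GTX 980 on raw bandwidth despite faster memory: the narrower 192-bit bus outweighs the move from 7Gbps to 8Gbps.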

At a higher level, the GTX 1060 Founders Edition runs a clock-rate of 1506MHz base, 1708MHz boost. The MSI Gaming X card that we've got seems to bounce between 1584MHz (base) and 1999.5MHz (boosted in OC mode).

Let's test this thing. See the next page for methodology.