Update 05/24: NVIDIA has since reached out to us, informing us that their previous statement about 32GB cards replacing 16GB cards was in error, and that the 16GB V100 SKUs will remain in production. For more details, please see this article.

When first launched last year, the original Tesla V100 shipped with 16GB of HBM2 memory. Now, with the card a little less than a year into its lifetime, NVIDIA is announcing that their workhorse server accelerator is getting a memory capacity bump in the form of new 32GB SKUs, effective immediately.

For the last couple of years now, NVIDIA has been relying on 4GB (4-Hi) HBM2 memory stacks for their Tesla P100 and Tesla V100 products, as this was the first HBM2 memory to be ready in reasonable commercial volumes. Now that Samsung and SK Hynix have a better grip on HBM2 manufacturing, 8GB (8-Hi) HBM2 stacks are far more readily available and reliable. As a result, the conditions are right for NVIDIA to finally give their Tesla cards a long-awaited memory upgrade.

This upgrade applies across the entire Tesla V100 family: both the SXM2 and PCIe cards are getting 32GB SKUs. The Tesla V100’s specifications otherwise remain identical, with the same GPU and memory clocks along with the same TDPs. We’re also told that the mechanical specifications are identical, which would mean that the 8-Hi stacks won’t cause any cooling problems due to changes in memory stack height.

NVIDIA Tesla/Titan Family Specification Comparison

| | Tesla V100 (SXM2) | Tesla V100 (PCIe) | Titan V (PCIe) | Tesla P100 (SXM2) |
|---|---|---|---|---|
| CUDA Cores | 5120 | 5120 | 5120 | 3584 |
| Tensor Cores | 640 | 640 | 640 | N/A |
| Core Clock | ? | ? | 1200MHz | 1328MHz |
| Boost Clock | 1455MHz | 1370MHz | 1455MHz | 1480MHz |
| Memory Clock | 1.75Gbps HBM2 | 1.75Gbps HBM2 | 1.7Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 4096-bit | 4096-bit | 3072-bit | 4096-bit |
| Memory Bandwidth | 900GB/sec | 900GB/sec | 653GB/sec | 720GB/sec |
| VRAM | 16GB / 32GB | 16GB / 32GB | 12GB | 16GB |
| L2 Cache | 6MB | 6MB | 4.5MB | 4MB |
| Half Precision | 30 TFLOPS | 28 TFLOPS | 27.6 TFLOPS | 21.2 TFLOPS |
| Single Precision | 15 TFLOPS | 14 TFLOPS | 13.8 TFLOPS | 10.6 TFLOPS |
| Double Precision | 7.5 TFLOPS | 7 TFLOPS | 6.9 TFLOPS | 5.3 TFLOPS |
| Tensor Performance (Deep Learning) | 120 TFLOPS | 112 TFLOPS | 110 TFLOPS | N/A |
| GPU | GV100 | GV100 | GV100 | GP100 |
| Transistor Count | 21B | 21B | 21.1B | 15.3B |
| TDP | 300W | 250W | 250W | 300W |
| Form Factor | Mezzanine (SXM2) | PCIe | PCIe | Mezzanine (SXM2) |
| Cooling | Passive | Passive | Active | Passive |
| Manufacturing Process | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 16nm FinFET |
| Architecture | Volta | Volta | Volta | Pascal |
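As a quick sanity check on the memory figures in the specification comparison, peak bandwidth can be estimated as bus width multiplied by the per-pin data rate, divided by 8 to convert bits to bytes. The sketch below is only an illustration: the function name is our own, and the results (roughly 896, 653, and 717 GB/sec) differ slightly from the quoted 900GB/sec and 720GB/sec figures, which appear to be rounded.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, data_rate_gbps):
    """Estimate peak memory bandwidth in GB/sec.

    bus_width_bits: total memory bus width in bits (e.g. 4096)
    data_rate_gbps: per-pin data rate in Gbps (e.g. 1.75)
    """
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return bus_width_bits * data_rate_gbps / 8


# Bus widths and memory clocks as listed in the comparison.
cards = {
    "Tesla V100": (4096, 1.75),
    "Titan V": (3072, 1.7),
    "Tesla P100": (4096, 1.4),
}

for name, (bus_width, data_rate) in cards.items():
    bandwidth = peak_bandwidth_gb_per_s(bus_width, data_rate)
    print(f"{name}: {bandwidth:.1f} GB/sec")
```

Since the 32GB cards keep the same memory clock and bus width, this estimate is unchanged for them; only capacity goes up.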

It should be noted that this upgrade is a wholesale replacement of the existing 16GB versions in NVIDIA’s product stack, as NVIDIA won’t be retaining the 16GB versions now that the 32GB versions are out. So all new cards sold by NVIDIA going forward will be the 32GB cards, and OEMs will be making the same transition in Q2 as their 16GB stocks are depleted and replaced with 32GB cards.

NVIDIA’s Tesla V100-equipped systems will also be getting the same upgrade. Both the DGX-1 server and DGX Station will now ship with the 32GB cards; indeed, this is the only capacity that will be offered for those systems going forward. And while OEMs are a bit further behind, ultimately I expect they’ll make the same move since the 16GB cards are going away anyhow. Meanwhile, NVIDIA isn’t officially talking about pricing for either the new V100 cards or the updated DGX systems. However, as the new parts are essentially drop-in replacements in their respective ongoing lines, there aren’t any signs right now that pricing is changing.

As far as workloads go, since the specifications of the Tesla V100 aren’t changing outside of memory capacity, the performance impact of this upgrade is almost entirely rooted in dataset size. Workloads that rely on datasets that couldn’t previously fit entirely within a Tesla V100 card should see a sizable boost, as will any applications that can benefit from more local memory for caching purposes. However, in purely compute-bound scenarios, the new cards shouldn’t perform any differently than the outgoing models.
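To make the dataset-size argument concrete, here is a rough sketch of a fit check. Everything in it is an assumption for illustration: the function name, the 10% of memory set aside for runtime overhead, and the treatment of the advertised capacities as binary gigabytes are ours, not NVIDIA's.

```python
GIB = 1024 ** 3  # assumption: advertised "GB" treated as binary gigabytes


def fits_in_memory(working_set_bytes, capacity_gb, reserved_fraction=0.10):
    """Return True if a working set fits in a card's local memory.

    reserved_fraction is a hypothetical allowance for driver and runtime
    overhead; real overhead varies by workload and software stack.
    """
    usable_bytes = capacity_gb * GIB * (1 - reserved_fraction)
    return working_set_bytes <= usable_bytes


# Hypothetical working set: 6 billion single-precision (4-byte) values.
working_set = 6_000_000_000 * 4

for capacity in (16, 32):
    verdict = "fits" if fits_in_memory(working_set, capacity) else "does not fit"
    print(f"{capacity}GB card: working set {verdict}")
```

A working set like this one would have to be split or streamed on a 16GB card but could stay resident on a 32GB card, which is where the upgrade pays off. A job that already fits comfortably in 16GB gains nothing from the extra capacity on its own.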

Meanwhile, as for NVIDIA's workstation-oriented Titan V, nothing is being announced at this time. The 12GB card could easily be bumped to 24GB using the 8-Hi stacks; however, it's not clear whether NVIDIA is in a hurry to do so. In practice it will almost certainly come down to whether NVIDIA wants to keep GV100 GPU production to a single line for efficiency reasons, or whether they'll operate two lines as part of a broader product segmentation strategy.
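The capacity figures follow directly from stack counts. Each HBM2 stack has a 1024-bit interface, so a 4096-bit bus implies four stacks and the Titan V's 3072-bit bus implies three. The helper names below are our own, and the sketch assumes every stack on a card has the same capacity.

```python
HBM2_BITS_PER_STACK = 1024  # interface width of a single HBM2 stack


def stack_count(bus_width_bits):
    """Number of HBM2 stacks implied by the total memory bus width."""
    return bus_width_bits // HBM2_BITS_PER_STACK


def capacity_gb(bus_width_bits, gb_per_stack):
    """Total capacity, assuming all stacks are the same size."""
    return stack_count(bus_width_bits) * gb_per_stack


for name, bus_width in (("Tesla V100", 4096), ("Titan V", 3072)):
    with_4hi = capacity_gb(bus_width, 4)  # 4GB (4-Hi) stacks
    with_8hi = capacity_gb(bus_width, 8)  # 8GB (8-Hi) stacks
    print(f"{name}: {with_4hi}GB with 4-Hi stacks, {with_8hi}GB with 8-Hi stacks")
```

This reproduces the 16GB and 32GB Tesla V100 configurations, along with the 12GB Titan V and its hypothetical 24GB counterpart.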

Lastly, this upgrade means that NVIDIA has finally caught up with arch-rival AMD in total memory capacity. AMD has been shipping a 32GB FirePro S9170 card using 16x2GB GDDR5 modules for the better part of the last three years now, a mark that NVIDIA couldn’t match until 8-Hi HBM2 was ready. The Tesla V100 has always been significantly more powerful than the S9170 regardless, but this finally brings NVIDIA to parity in the one area where they were trailing AMD in the server space.