Memory has always been one of the more important components of a graphics card, and its importance has become even more prominent in recent times, particularly thanks to the advent of HBM technology. While the industry's focus has shifted to maximizing bandwidth while retaining an acceptable physical footprint, there lurks in the shadows a menace that could turn out to have far-reaching consequences: the power consumption of memory. I thought it was time we did a short editorial on the relationship between bandwidth and power consumption.

The untold story of memory, bandwidth and power consumption in modern graphics cards

As most of you will know, the memory standard for graphics cards of the past generation is GDDR5, and while many future offerings will undoubtedly retain this memory type, there is a new player in town: High Bandwidth Memory. HBM, or stacked DRAM (which comes under many different names and flavors), is a recent innovation with significantly increased bandwidth and drastically lowered power consumption. The entire point of HBM was to bring high bandwidth to an affordable level in terms of power cost. And while it has pretty much succeeded in doing that for the near future, the problem persists over the long-term horizon. Nvidia held its GPU Technology Theater conference at SC15 a while back, and Dr. Stephen W. Keckler, Senior Director of Architectural Research at Nvidia, mentioned that the problem of memory power consumption remains unsolved, even with HBM.

The graph shows an exponentially increasing trend in the power consumption of the two memory types as bandwidth increases. We can see the structural break in the graph where the industry shifts from GDDR5 to HBM, but the end result remains the same, albeit delayed. Now, why is this a problem, one might ask? In terms of the TDP that enthusiast rigs are used to, these numbers are well within range. But you have to realize that a GPU requiring that much bandwidth would consume quite a lot of power on its own too, and the combined TDP would not be within acceptable ranges. The energy consumed by a graphics card is a very important number, and if the memory alone requires such a large amount of power, then the total would be pretty huge.

The performance of graphics cards is increasing at an unrelenting pace, which means that within a few more years we will require bandwidth in the region of a few thousand GB/s. Even with the energy efficiency afforded by stacked DRAM, memory power consumption at that bandwidth will rise above the 100W mark. Considering that the GPUs themselves will be drawing quite a lot of power on their own, that means total wattage above the 250W mark that high-end cards currently employ.

Some of our readers will probably be wondering why we need bandwidth in the thousands of GB/s. Well, you have to remember that graphics cards are actually processors made up of huge clusters of incredibly small cores. You can call them CUDA cores, or you can call them stream processors; either way, it is these tiny processors that require the bandwidth. Now, while the number of these cores has been increasing at a very steady rate, the bandwidth afforded to them has not been able to keep pace. To illustrate this, we calculated the bandwidth available per core for recent Nvidia and AMD graphics cards. A very clear trend is immediately visible.

Bandwidth per core analysis of Nvidia and AMD graphics cards

Bandwidth per core of different Radeons, with the number of stream processors in ascending order.

Bandwidth per core of various GeForce cards, with the number of CUDA cores sorted in ascending order.

You will notice that the bandwidth available to each core decreases as the GPU grows more powerful. This is because memory technology has been advancing more slowly than core counts. Interestingly, we can see that AMD has been able to keep the gradient of the bandwidth per core relatively constant, whereas Nvidia has a slightly steeper gradient. Nvidia's cards range from 107 MB/s to 120 MB/s per core, going as high as 160 MB/s per core on the GTX 760. This is of course something we have known all along: AMD is more generous with allotting bandwidth to its graphics cards than Nvidia, which uses color optimization technology to compensate for the lack of bandwidth. AMD cards range from 110 MB/s to 140 MB/s per core, going as high as 175 MB/s per core for the R9 370. Interestingly, the only card in our calculations which dropped below the 100 MB/s per core mark is the R9 285, which comes in at 98 MB/s per core.
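The per-core figure is simply total memory bandwidth divided by the number of shader cores. A minimal sketch of that calculation follows, using a few core counts and bandwidth values from the GeForce table in this article; the function name and the choice of 1 GB/s = 1000 MB/s are our own assumptions for illustration.

```python
# Bandwidth per core = total memory bandwidth / number of shader cores.
# (cores, bandwidth in GB/s) are copied from the GeForce table in this article.
CARDS = {
    "GeForce GTX 760": (1152, 192.3),
    "GeForce GTX 780 Ti": (2880, 336.5),
    "GeForce GTX 980": (2048, 224.0),
    "GeForce GTX Titan X": (3072, 336.0),
}


def bandwidth_per_core_mb(cores: int, bandwidth_gbs: float) -> float:
    """Return bandwidth per core in MB/s (assumes 1 GB/s = 1000 MB/s)."""
    return bandwidth_gbs / cores * 1000.0


# Print the cards in ascending order of core count, as in the charts.
for name, (cores, bw) in sorted(CARDS.items(), key=lambda kv: kv[1][0]):
    per_core = bandwidth_per_core_mb(cores, bw)
    print(f"{name:22s} {cores:5d} cores  {per_core:6.1f} MB/s per core")
```

Sorting by core count makes the downward trend easy to spot in the printed output.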



Let's explore the power dilemma a bit more. The power consumption of volatile memory is a function of the clock speed and bus width, which together determine bandwidth. According to a statement by AMD, the latest version of GDDR5 delivers (on average) about 10.66 GB/s of bandwidth per watt, whereas HBM is good for about 35 GB/s per watt. This is a phenomenal increase in power efficiency of roughly 3.3x, but is it enough? We computed the approximate memory power consumption of various graphics cards and arrived at some very interesting conclusions.
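To make the arithmetic concrete, here is a minimal sketch that turns a bandwidth figure into an approximate memory power draw using the two AMD efficiency figures quoted above. It assumes efficiency stays constant across the whole bandwidth range, which is a simplification, and the function and constant names are ours.

```python
# Efficiency figures quoted from AMD in the text, in GB/s delivered per watt.
GDDR5_GBS_PER_WATT = 10.66
HBM_GBS_PER_WATT = 35.0


def memory_power_watts(bandwidth_gbs: float, gbs_per_watt: float) -> float:
    """Approximate memory power, assuming constant efficiency (a simplification)."""
    return bandwidth_gbs / gbs_per_watt


# Illustrative input: 336 GB/s, the bandwidth of the top GDDR5 cards in the table.
bandwidth = 336.0
gddr5_w = memory_power_watts(bandwidth, GDDR5_GBS_PER_WATT)
hbm_w = memory_power_watts(bandwidth, HBM_GBS_PER_WATT)

print(f"GDDR5 at {bandwidth:.0f} GB/s: {gddr5_w:.1f} W")
print(f"HBM   at {bandwidth:.0f} GB/s: {hbm_w:.1f} W")
print(f"Efficiency ratio: {HBM_GBS_PER_WATT / GDDR5_GBS_PER_WATT:.2f}x")
```

The same function works for any card: feed it the bandwidth and the efficiency figure for the memory type in use.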

Memory power consumption trend of AMD and Nvidia graphics cards

As expected with Nvidia, the power consumed by the memory is increasing roughly in proportion to the average power of the graphics card. Keep in mind that cores across generations are not completely comparable (the average Maxwell core is 15%-32% more powerful depending on the card), so the actual number, after adjusting for performance differences, would be slightly higher. We see that on average an AMD card uses around 15W to 36W to power its GDDR5 memory, whereas Nvidia cards use around 18W to 32W depending on the exact card. With GDDR5, the maximum wattage so far is near the 40W mark (after accounting for real-world line losses). With HBM, however, things take an altogether different turn. HBM uses only about 15W and provides more bandwidth than the highest-clocked GDDR5. In fact, if you look at the graph you will notice that in the case of the Fury lineup, the curve breaks from its expected path and actually slopes downwards, indicating the structural break in trend we talked about.

By sticking to the HBM standard we can compute what the bandwidth will be like at the power levels stated by Nvidia. At 120W, HBM (at today's efficiency of 35 GB/s per watt) would be able to deliver 4200 GB/s, which is an absolutely huge number. Unfortunately, efficiency does not scale linearly, and we may well be looking at a factor of 20-25 GB/s per watt at throughputs that high. At the lower end of that range, this gives us a number around 2400 GB/s, a much more reasonable figure which we can expect to see in the near future. Currently, the power consumption of the memory ranges anywhere from 8-15% of the total TDP of a graphics card, but as the bandwidth increases, this number will go up as well. Of course, there are many alternative technologies already in the pipeline, including standards being worked on by Intel, Rambus and Micron. This projection does not (and cannot) take into account disruptive new technologies which reset the curve once more, a possibility that is more than likely.
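The projection is a straight multiplication of power budget by efficiency, so it is easy to replay with different assumptions. The sketch below uses the 120W budget and the efficiency figures mentioned in the text; the helper names are ours, and the degraded efficiencies are the article's rough estimate rather than measured values.

```python
POWER_BUDGET_W = 120.0  # memory power level used in the projection above


def bandwidth_at_budget(power_w: float, gbs_per_watt: float) -> float:
    """Bandwidth (GB/s) achievable within a power budget at a given efficiency."""
    return power_w * gbs_per_watt


def power_for_bandwidth(bandwidth_gbs: float, gbs_per_watt: float) -> float:
    """Inverse: power (W) needed to sustain a target bandwidth."""
    return bandwidth_gbs / gbs_per_watt


# 35 GB/s per watt is today's HBM figure; 25 and 20 are the article's
# rough estimates for efficiency at very high throughput.
for efficiency in (35.0, 25.0, 20.0):
    bw = bandwidth_at_budget(POWER_BUDGET_W, efficiency)
    print(f"{efficiency:4.0f} GB/s per W -> {bw:6.0f} GB/s at {POWER_BUDGET_W:.0f} W")
```

Running the inverse helper with a target in the low thousands of GB/s shows how quickly memory power passes the 100W mark once efficiency drops.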

Nvidia GeForce - Memory Analysis

| Model | CUDA Cores | Bandwidth (GB/s) | Bandwidth per Core (GB/s) | Memory TDP (W) | Total TDP (W) | % of TDP |
|---|---|---|---|---|---|---|
| GeForce GTX 760 | 1152 | 192.3 | 0.1669 | 18 | 170 | 10.6% |
| GeForce GTX 760 Ti | 1344 | 192.3 | 0.1431 | 18 | 170 | 10.6% |
| GeForce GTX 770 | 1536 | 224 | 0.1458 | 21 | 230 | 9.1% |
| GeForce GTX 780 | 2304 | 288.4 | 0.1252 | 27 | 250 | 10.8% |
| GeForce GTX 780 Ti | 2880 | 336.5 | 0.1168 | 32 | 250 | 12.6% |
| GeForce GTX Titan | 2688 | 288.4 | 0.1073 | 27 | 250 | 10.8% |
| GeForce GTX Titan Black | 2880 | 336.5 | 0.1168 | 32 | 250 | 12.6% |
| GeForce GTX Titan Z | 5760 | 336.5 | 0.0584 | 32 | 375 | 8.4% |
| GeForce GTX 750 | 512 | 80 | 0.1563 | 8 | 55 | 13.6% |
| GeForce GTX 750 Ti | 640 | 88 | 0.1375 | 8 | 60 | 13.8% |
| GeForce GTX 950 | 768 | 106 | 0.1380 | 10 | 90 | 11.0% |
| GeForce GTX 960 | 1024 | 112 | 0.1094 | 11 | 120 | 8.8% |
| GeForce GTX 970 | 1664 | 196 + 28 | 0.1178 | 21 | 145 | 14.5% |
| GeForce GTX 980 | 2048 | 224 | 0.1094 | 21 | 165 | 12.7% |
| GeForce GTX 980 Ti | 2816 | 336 | 0.1193 | 32 | 250 | 12.6% |
| GeForce GTX Titan X | 3072 | 336 | 0.1094 | 32 | 250 | 12.6% |
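The last two derived columns can be recomputed from bandwidth and total TDP alone. The sketch below does so for a few rows, on the assumption (ours, inferred from the numbers) that memory power is bandwidth divided by the 10.66 GB/s per watt GDDR5 figure and that the percentage is taken before rounding the wattage.

```python
GDDR5_GBS_PER_WATT = 10.66  # AMD's GDDR5 efficiency figure quoted earlier

# (bandwidth in GB/s, total TDP in W) copied from the table above.
ROWS = {
    "GeForce GTX 750": (80.0, 55.0),
    "GeForce GTX 960": (112.0, 120.0),
    "GeForce GTX 980": (224.0, 165.0),
    "GeForce GTX Titan X": (336.0, 250.0),
}


def memory_share_of_tdp(bandwidth_gbs: float, total_tdp_w: float) -> tuple[float, float]:
    """Return (estimated memory power in W, memory share of total TDP in percent)."""
    memory_w = bandwidth_gbs / GDDR5_GBS_PER_WATT
    return memory_w, memory_w / total_tdp_w * 100.0


for name, (bw, tdp) in ROWS.items():
    memory_w, share = memory_share_of_tdp(bw, tdp)
    print(f"{name:20s} memory ~{memory_w:5.1f} W of {tdp:.0f} W ({share:.1f}%)")
```

Swapping in a different efficiency constant shows how the memory share of TDP would shift for another memory type at the same bandwidth.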

AMD Radeon - Memory Analysis