Introduction

On May 15, we sat in on a conference call with AMD to learn all about High Bandwidth Memory. If you haven't heard about HBM and our flurry of news posts on the exciting new memory for GPUs, this article will provide you with everything you need to know. HBM is an important step for AMD, as it is powering its upcoming Radeon R9 390X, with its new Fiji architecture.

AMD has spent the last seven years developing HBM. The motivation was to solve one of the bigger looming problems of the time - bandwidth - which was something that had to be solved, period. Performance-per-watt is very important to both AMD and NVIDIA, and the use of HBM is one of the key ways of getting there.

That development, close to a decade in the making, has had its own unique challenges thanks to HBM being a very high-density interconnect. But the payoff will allow AMD to build many devices, from smaller APUs to big, bad flagship GPUs like the upcoming Radeon R9 390X.

With GDDR5 'entering the inefficient region of the power/performance curve' as you can see in the above shot from AMD's slide, HBM is needed now, more than ever.

GDDR5 is Nearly Dead & Introducing the Interposer

The Problems With GDDR5

It's funny that right now, all of the flagship GPUs are based around GDDR5. We have no problem with them, with a single flagship card being more than capable of playing all of today's games at 4K 60FPS... but what about tomorrow's games? We have to take a step back, as it's not just about games - what matters most is the video cards themselves, and what is happening on them and inside of them.

GDDR5 takes up a considerable amount of space next to the GPU itself, and these chips are not getting any smaller. The higher the bandwidth, the more devices are required to get there. Board real estate gets eaten up very quickly by GDDR5, which is something that HBM will solve within its first generation.

Then we have the power demands of GDDR5, which require larger voltage regulators - and again, this is something that HBM will fix with its first generation. Shrinking various parts of technology works, but die stacking is coming in very quickly. We're already seeing the benefits of 3D NAND flash in SSDs, but stacking the RAM on GPUs is going to bring in an entirely new breed of GPUs that, until now, we didn't think were possible.

Your next question would be "why not make GDDR5 faster?" AMD answers this quite well in the slide above: if they were to crank up the speeds of GDDR5, it would not only require more power, but heat generation would go up, and the amount of talk between the logic chip and the DRAM chips would increase. Then we have the issue of the growing framebuffer. Cards like the GeForce GTX Titan X already feature 12GB of GDDR5, and we can't continue to 24GB, 36GB, and so on with GDDR5. It's next to impossible without video cards doubling in length.

The Interposer

We've been talking about something called 'The Interposer' more and more with our Radeon R9 390X and R9 490X rumors, but we can go into much more detail now that the NDA on this is up. Integrating as much as you can onto the interposer is something that AMD has been wanting to do for a while, and with HBM it's completely possible.

Where HBM comes in handy is that the DRAM itself can be pulled in much closer to the logic die - as close as technologically possible right now. Shortening the distance between the DRAM and the logic die enables massively wide buses - with the Radeon R9 390X rumored to feature a 4096-bit wide memory bus, up from the 512-bit memory bus on the Radeon R9 290X.

Improving the proximity also greatly simplifies communication and clocking, and improves the bandwidth-per-watt. It also paves the way for integration of "disparate technologies", such as DRAM. AMD has worked closely with industry partners such as ASE, Amkor and UMC to develop the first high-volume manufacturable interposer solution.

HBM Explained

HBM is a Space Saver

HBM is such an exciting technology, but how does it work? Well, think of a GPU and many squares around it - these are the GDDR5 chips - but with HBM, the DRAM can be placed right next to the GPU chip itself, saving a considerable amount of space on the PCB.

This image above shows you just how much space will be saved with an HBM stack, with 1GB of GDDR5 taking up 672mm2 of space, while the same 1GB of HBM will take up just 35mm2 - roughly 95% less area, or about 19 times smaller. Incredible, isn't it?

We also have 9900mm2 PCB footprint for the Hawaii XT-based Radeon R9 290X which uses GDDR5, but for an HBM-based ASIC, we're looking at less than 4900mm2, which is more than 50% smaller. So, just in PCB space alone, we're saving a considerable amount of physical space.
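To put those footprint numbers in perspective, here is a quick back-of-the-envelope check in Python. It is only a sketch using the areas quoted above; the function and variable names are our own, and the 4900mm2 figure is treated as an upper bound, since AMD only says "less than".

```python
def area_reduction(old_mm2: float, new_mm2: float) -> float:
    """Return the fraction of area saved when shrinking from old_mm2 to new_mm2."""
    return (old_mm2 - new_mm2) / old_mm2


# Areas quoted in the article (taken from AMD's slides).
gddr5_1gb_mm2 = 672.0      # 1GB of GDDR5
hbm_1gb_mm2 = 35.0         # 1GB HBM stack
r9_290x_pcb_mm2 = 9900.0   # GDDR5-based Radeon R9 290X
hbm_asic_pcb_mm2 = 4900.0  # HBM-based ASIC (upper bound, AMD says "less than")

memory_saving = area_reduction(gddr5_1gb_mm2, hbm_1gb_mm2)
memory_ratio = gddr5_1gb_mm2 / hbm_1gb_mm2
pcb_saving = area_reduction(r9_290x_pcb_mm2, hbm_asic_pcb_mm2)

print(f"1GB memory footprint: {memory_saving:.1%} smaller ({memory_ratio:.1f}x less area)")
print(f"PCB footprint: at least {pcb_saving:.1%} smaller")
```

Note that a saving can never exceed 100% of the original area, which is why the memory comparison is better expressed as a ratio (about 19x) than as a percentage.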

Increased Speeds & Improved Power Efficiency with HBM

Now that we can see just how much PCB space can be saved by using HBM, which is paramount, what type of performance improvements can we expect over GDDR5? Well, a considerable improvement, that's what.

As you can see from the slide above, GDDR5 has a 32-bit wide memory bus per package, running at up to 1750MHz (7Gbps per pin). HBM goes wide instead, with a 1024-bit memory bus per stack running at up to 500MHz (1Gbps per pin). With four stacks of DRAM used, the 1024-bit bus is multiplied by four, up to 4096-bit, which is what the Radeon R9 390X is expected to pack.

GDDR5 has 28GB/sec of bandwidth per chip, while HBM really starts to stretch its legs in its first generation with over 100GB/sec per stack. As for operating voltage, GDDR5 uses 1.5V and HBM uses 1.3V.
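For anyone wondering where those per-device figures come from, peak bandwidth is simply the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The short Python sketch below runs that arithmetic with the numbers from AMD's slide. The four-stack configuration is the rumored R9 390X layout rather than a confirmed spec, and the function and variable names are our own.

```python
import math


def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/sec: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


gddr5_chip = bandwidth_gb_per_s(32, 7.0)    # one GDDR5 package
hbm_stack = bandwidth_gb_per_s(1024, 1.0)   # one first-generation HBM stack

hbm_stacks = 4                              # assumption: rumored R9 390X configuration
hbm_total = hbm_stacks * hbm_stack

# How many GDDR5 packages it would take to match that total bandwidth.
gddr5_chips_needed = math.ceil(hbm_total / gddr5_chip)

print(f"GDDR5 per chip: {gddr5_chip:.0f}GB/sec")
print(f"HBM per stack:  {hbm_stack:.0f}GB/sec")
print(f"{hbm_stacks} HBM stacks:   {hbm_total:.0f}GB/sec")
print(f"GDDR5 chips needed to match: {gddr5_chips_needed}")
```

The last line is the board real estate problem in a nutshell: matching four small HBM stacks would take 19 GDDR5 packages spread around the GPU.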

Whilst GDDR5 has just 10.66GB/sec of bandwidth per watt of power used, HBM has over 35GB/sec of bandwidth per watt. This means HBM and the interposer provide more than three times the bandwidth per watt of GDDR5, delivering far more bandwidth while using over 50% less power. AMD says that "HBM rebalances DRAM vs. logic power consumption to protect future GPU performance growth".
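To get a feel for what those efficiency figures mean in practice, the rough Python sketch below estimates how much memory power each technology would need to hit the same bandwidth target. The 512GB/sec target is a hypothetical figure we picked for illustration (four stacks at 128GB/sec each), and the calculation assumes the per-watt figures scale linearly, which real hardware will not do exactly. All names are our own.

```python
# Bandwidth-per-watt figures quoted from AMD's slide.
GDDR5_GB_PER_S_PER_WATT = 10.66
HBM_GB_PER_S_PER_WATT = 35.0  # AMD says "over 35", so this is a floor


def memory_power_watts(target_gb_per_s: float, gb_per_s_per_watt: float) -> float:
    """Estimated memory power for a bandwidth target, assuming linear scaling."""
    return target_gb_per_s / gb_per_s_per_watt


target_gb_per_s = 512.0  # assumption: illustrative target, not a confirmed spec

gddr5_watts = memory_power_watts(target_gb_per_s, GDDR5_GB_PER_S_PER_WATT)
hbm_watts = memory_power_watts(target_gb_per_s, HBM_GB_PER_S_PER_WATT)

efficiency_ratio = HBM_GB_PER_S_PER_WATT / GDDR5_GB_PER_S_PER_WATT
power_saving = 1 - hbm_watts / gddr5_watts

print(f"HBM delivers {efficiency_ratio:.2f}x the bandwidth per watt")
print(f"GDDR5: {gddr5_watts:.1f}W, HBM: {hbm_watts:.1f}W for {target_gb_per_s:.0f}GB/sec")
print(f"Power saved at equal bandwidth: {power_saving:.0%}")
```

Under these assumptions, the saving at equal bandwidth lands comfortably above the "over 50% less power" that AMD is quoting.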

AMD's Radeon R9 390X

Now We Know About HBM, Let's Talk the Radeon R9 390X

Now that we've had a quick dip into the HBM pool, we need to remind you: this is the first generation of HBM. HBM2 is expected to double the bandwidth, pushing it past 1TB/sec - the first time in history that GPU memory, or any consumer memory, has offered that much.

We've been posting a considerable number of rumors, reports and pieces of insider information about the Radeon R9 390X, which will be the first video card to use HBM. AMD has been working with SK Hynix and many other companies to get this off the ground, and the fruits of that labor will be the Fiji XT-based Radeon R9 390X.

What we know so far comes only from rumors. The Radeon R9 390X should have between 4GB and 8GB of HBM-based VRAM, and arrive in two versions: one with a normal air cooler, and another with a hybrid water cooling system.

Furthering the rumors is talk of two versions of the Radeon R9 390X, one with GDDR5 and another with HBM. Then we have the rumors of the rebrands, which would see the Hawaii XT-based Radeon R9 290X rebranded as a card in the upcoming Radeon R9 300 series.

Final Thoughts

HBM is going to usher in some of the most exciting, memorable years of technology we've ever had. Up until this point, we've been starved of new technology. We've been stuck on the 28nm node on both sides of the camp, and AMD's Radeon R9 390X GPU will still be built on 28nm.

The super-fast new VRAM is going to allow AMD to make their HBM-based video cards much shorter, which is something we wrote about not too long ago now. A shorter flagship card that is 40-50% faster than its predecessor and has 100% more memory bandwidth? Yeah, I thought so. The Radeon R9 390X could be around 20-30% faster than the Titan X in some applications and games, especially when it comes to high resolutions like 4K and beyond.

As a technology enthusiast who reviews video cards for a living here at TweakTown, I cannot explain in words how excited I am for HBM. HBM is going to change the way video cards are made, and even more so when we shift into HBM2.

HBM2 will drive memory bandwidth to over 1TB/sec, which is just insane to think of when today's cards deliver around 300-350GB/sec and still provide 4K 60FPS gaming. When HBM2 starts becoming available sometime in 2016, we are going to see AMD and NVIDIA both release GPUs on a new node, shifting into 16nm. The move into 16nm is going to be just as exciting as HBM, but HBM2 will be here by then, so the excitement begins all over again.

AMD will be the first to the market with an HBM-based video card, which makes the Radeon R9 390X one of the most exciting GPU launches of all time. We've waited over 18 months for AMD to launch a new GPU architecture, and to do so with HBM attached? Glorious.