Samsung has begun mass production of 20nm second-generation High Bandwidth Memory (HBM2), which features up to 256 gigabytes-per-second (GB/sec) of available bandwidth per memory stack—double that of the first generation HBM used in AMD's Fury graphics cards.

HBM2 will also allow for HBM-equipped graphics cards to be packed with more memory—as much as 16GB—surpassing the 4GB limit that AMD ran into with its early adoption of the technology.

Like HBM, HBM2 is a form of stacked memory, where the individual DRAM chips are placed on top of each other, rather than side by side. Those chips are connected together vertically using through-silicon vias (TSVs)—wires that are threaded through the DRAM stack—while an interposer at the bottom of the stack routes the connections from the memory directly to the GPU. Because the chips are closer together and the interconnects are shorter, throughput is increased and power consumption is reduced.

First-generation HBM was limited to 128GB/s of bandwidth using four 2Gb chips (for 1GB total) per stack, attached to a 1024-bit-wide bus running at 1.2V. AMD's Fury cards used four of these stacks to achieve total available bandwidth of 512GB/sec and 4GB capacity.

By comparison, Samsung's HBM2 solution consists of four 8Gb chips (for 4GB total) with 256GB/s of bandwidth per stack, attached to the same 1024-bit interface at 1.2V. In theory, this would allow for graphics cards with 16GB of memory and a huge 1024GB/s of memory bandwidth—double the Fury X, and three times as much as Nvidia's Titan X.
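Those figures all follow from the bus width and the per-pin data rate. The rates themselves (roughly 1Gb/s per pin for HBM1 and 2Gb/s for HBM2) aren't stated above but are implied by the quoted per-stack bandwidth; a rough sketch of the arithmetic:

```python
# Rough HBM bandwidth arithmetic implied by the figures above.
# Per-pin data rates (1 Gb/s for HBM1, 2 Gb/s for HBM2) are inferred
# from the quoted per-stack bandwidth, not taken from the article.

def stack_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s (bits -> bytes, hence / 8)."""
    return bus_width_bits * pin_rate_gbps / 8

hbm1 = stack_bandwidth_gbs(1024, 1.0)  # 128.0 GB/s per stack
hbm2 = stack_bandwidth_gbs(1024, 2.0)  # 256.0 GB/s per stack

# Four stacks per card, as on AMD's Fury:
print(f"HBM1: {hbm1:.0f}GB/s per stack, {4 * hbm1:.0f}GB/s total")
print(f"HBM2: {hbm2:.0f}GB/s per stack, {4 * hbm2:.0f}GB/s total")
```

Running this reproduces the 512GB/s total for the Fury cards and the theoretical 1024GB/s for a four-stack HBM2 card.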

While it currently has 4GB stacks rolling off the production line, Samsung also plans to produce 8GB stacks later in the year, likely by making each stack eight chips high rather than four. The JEDEC HBM2 standard that Samsung is working from—which was only just ratified earlier this month—specifies that stacks can be two, four, or eight chips high.
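The capacity options fall out of the same kind of arithmetic: stack capacity is simply dies per stack times die density. A quick sketch, assuming every die is the 8Gb (1GB) part Samsung is producing:

```python
# Stack capacity for the die heights the JEDEC HBM2 standard allows
# (2, 4, or 8 dies high), assuming 8Gb dies throughout.

DIE_GBIT = 8  # one Samsung HBM2 die is 8Gb, i.e. 1GB

for dies in (2, 4, 8):
    capacity_gb = dies * DIE_GBIT // 8  # gigabits -> gigabytes
    print(f"{dies}-high stack: {capacity_gb}GB")
```

This prints 2GB, 4GB, and 8GB per stack, matching the 4GB stacks in production now and the 8GB stacks planned for later in the year.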

The potential improvement that HBM2 brings to graphics cards is enormous. In addition to the massive available bandwidth, which should help greatly with the high frame rates demanded by VR, HBM2 also uses much less power. Graphics card makers can plough those power savings back into the GPU, while keeping the whole graphics card within the same power envelope. AMD claimed it reduced memory power consumption in its Fury X by as much as 40-50W thanks to using HBM.

Both AMD and Nvidia are set to use HBM2 in future products: AMD with its Polaris architecture, and Nvidia with Pascal. Reports from the tail end of last year even claimed that Nvidia was sourcing its HBM2 from Samsung.

While HBM2 will have a dramatic effect on graphics card performance this year, especially when we also factor in GPUs moving from 28nm to 14nm, HBM2 could also give AMD's APUs a significant performance boost. Memory is a significant bottleneck for on-board graphics performance. Intel has had some success with the EDRAM used in some of its processors, but integrating HBM2 onto a chip would dramatically increase the bandwidth available to the GPU. While it's unlikely an HBM2-equipped APU will challenge a high-end graphics card, low- and mid-range cards may soon find themselves squeezed out of the market.