Modern computers use many different kinds of memory: DDR4, GDDR5, GDDR6, LPDDR4, HBM, etc. While these are all based on DRAM, there are some key differences between them. DDR4 is used in most PCs as the main memory and is the most popular form of DRAM. GDDR5 and GDDR6 are used in graphics cards as dedicated graphics memory. Although they’re also based on DRAM, they’re somewhat different from DDR4.

Many people confuse the two and use the terms interchangeably. There’s also LPDDR4 memory, used in smartphones and other mobile devices, and HBM, used in servers and exascale computers. In this post, we explore the differences between DDR4 and GDDR5 memory, along with a brief explanation of HBM, LPDDR4 and the newer GDDR6 standard.

Double Data Rate Generation Four (DDR4)

Nearly every kind of memory is based on dynamic random access memory or DRAM. It’s slower than static RAM (SRAM) as it has to be refreshed continuously by the memory controller. At the same time, it is much more affordable, which is the primary reason for its widespread use. SRAM is used as the cache memory in GPUs and CPUs as it’s much faster and more efficient than DRAM.

DDR4 is the latest iteration of DRAM. Released in 2014, it initially focused on reducing the voltage and power consumption rather than increasing the operating frequencies. With the arrival of AMD’s Ryzen processors and their MCM design, high-speed DDR4 memory has suddenly become more relevant. DDR4 memory modules capable of running at 3600MHz out of the box are now widely available, while some can even be pushed as high as 5000MHz.

DDR3 vs DDR4

Aside from the obvious (faster frequencies and lower latency), the primary advantage of DDR4 memory over DDR3 is larger DIMM capacities (up to 64GB, whereas DDR3 is limited to 16GB). It also draws considerably less power and runs at a lower voltage.

Unlike the transition from DDR2 to DDR3, the move to DDR4 didn’t increase the burst length or prefetch. Both DDR3 and DDR4 have a burst length of 8 and an 8n prefetch.

However, there is one key difference in the memory bank groups of DDR3 and DDR4 memory. DDR3 has an 8n prefetch, with four memory arrays forming a bank group connected via a multiplexer to the I/O controller.

Although DDR4 maintains the same 8n prefetch and burst length as DDR3, it has two memory bank groups per channel. The two bank groups are separate and can execute two independent 8n prefetches. This is done by using a multiplexer to time-division multiplex its internal banks. Therefore, the effective prefetch for DDR4 is wider than that of DDR3.

With that out of the way, let’s talk about GDDR5 memory, the predominant memory standard on video cards (now replaced by GDDR6).

DDR4 vs GDDR5

Both DDR4 and DDR3 use a 64-bit memory controller per channel, which results in a 128-bit bus for dual-channel memory and a 256-bit bus for quad-channel. GDDR5 memory, on the other hand, leverages a puny 32-bit controller per channel.

Where CPU memory configurations have wider but fewer channels, GPUs can support any number of 32-bit memory channels. This is the reason many high-end GPUs like the GeForce RTX 2080 Ti and RTX 2080 have a 352-bit and 256-bit bus width, respectively.
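The bus-width arithmetic in this section is simply the number of channels multiplied by the width of each channel. Here is a minimal Python sketch of that calculation; the function name `total_bus_width` is a hypothetical helper made up for illustration, not part of any memory standard or library.

```python
def total_bus_width(channels: int, bits_per_channel: int) -> int:
    """Return the total bus width in bits for identical memory channels."""
    if channels < 1 or bits_per_channel < 1:
        raise ValueError("channels and bits_per_channel must be positive")
    return channels * bits_per_channel


# CPU side: wide 64-bit channels, but only a few of them.
print(total_bus_width(2, 64))  # dual-channel DDR4 -> 128
print(total_bus_width(4, 64))  # quad-channel DDR4 -> 256

# GPU side: narrow 32-bit channels, but many of them.
print(total_bus_width(8, 32))  # eight GDDR channels -> 256
```

Eight 32-bit GPU channels end up exactly as wide as a quad-channel CPU setup, which is why GPU bus widths scale in 32-bit steps.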

Both RTX 20 series cards are connected to 1GB memory chips via 8 (for the 2080) and 11 (for the Ti) 32-bit memory controllers or channels. GDDR5/6 can also operate in what is called clamshell mode, where each channel, instead of being connected to one memory chip, is split between two. This allows manufacturers to double the memory capacity and makes hybrid memory configurations, like the GTX 660 with its 192-bit bus width, possible.
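To make the effect of clamshell mode concrete, here is a rough Python sketch. It assumes every channel is either in normal mode (one chip per channel) or in clamshell mode (two chips per channel), so it does not model hybrid layouts where only some channels are doubled up. The class `GpuMemoryConfig` and its fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuMemoryConfig:
    channels: int                 # number of memory channels
    chip_capacity_gb: float       # capacity of each memory chip
    clamshell: bool = False       # True: two chips share each channel
    channel_width_bits: int = 32  # GDDR5/6 channel width used in this post

    @property
    def chips(self) -> int:
        # Clamshell mode splits each channel between two chips.
        return self.channels * (2 if self.clamshell else 1)

    @property
    def bus_width_bits(self) -> int:
        # Bus width depends on the channel count only, not the chip count.
        return self.channels * self.channel_width_bits

    @property
    def capacity_gb(self) -> float:
        return self.chips * self.chip_capacity_gb


normal = GpuMemoryConfig(channels=8, chip_capacity_gb=1.0)
doubled = GpuMemoryConfig(channels=8, chip_capacity_gb=1.0, clamshell=True)
print(normal.chips, normal.bus_width_bits, normal.capacity_gb)
print(doubled.chips, doubled.bus_width_bits, doubled.capacity_gb)
```

In this model the clamshell configuration doubles the chip count and capacity while the bus width stays the same.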



[Figure: The GTX 670 has four 512 MB chips across eight channels]

[Figure: A GTX 660 Ti has four memory stacks, the ones on top (packing two per stack) in clamshell mode. This reduces the bus width to 192-bit rather than 256-bit]

[Figure: A GTX 660 PCB]

[Figure: Clamshell mode]

Another core difference between DDR4 and GDDR5/6 memory involves the I/O cycles. Just like SATA, DDR4 can only perform one operation (read or write) in one cycle. GDDR5 can handle input (read) as well as output (write) on the same cycle, essentially doubling the bus width.

All this might put DDR4 memory in a bad light, but this configuration actually suits both setups. CPUs are largely sequential in nature while GPUs run thousands of parallel cores. The former benefits from low latency and slimmer channels, while GPUs require a much higher bandwidth with loose timings.

GDDR5 vs GDDR5X vs GDDR6

GDDR6 was preceded by GDDR5X, which was more of a half-generation upgrade. GDDR5X features transfer rates of up to 14Gbit/s per pin, twice as much as GDDR5.

This was achieved by using a higher prefetch. Unlike GDDR5, GDDR5X has a 16n prefetch architecture (vs 8n on GDDR5). This allows it to fetch 64 bytes (512 bits) of data per cycle, while GDDR5 is limited to 32 bytes.

GDDR6, like GDDR5X, has a 16n prefetch but it’s divided into two channels. So GDDR6 fetches 32 bytes per channel for a total of 64 bytes just like GDDR5X and twice that of GDDR5. While this doesn’t improve memory transfer speeds over GDDR5X, it allows for more versatility.
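The byte counts quoted here follow from multiplying the prefetch depth by the interface width. The sketch below reproduces them in Python, assuming a 32-bit interface for GDDR5 and GDDR5X and two 16-bit channels for GDDR6. Those widths are what make the figures in the text add up, and `bytes_per_access` is a hypothetical helper name used only for this example.

```python
def bytes_per_access(prefetch_n: int, channel_width_bits: int, channels: int = 1) -> int:
    """Bytes fetched per access: prefetch depth x channel width x channel count."""
    bits = prefetch_n * channel_width_bits * channels
    return bits // 8


gddr5 = bytes_per_access(prefetch_n=8, channel_width_bits=32)
gddr5x = bytes_per_access(prefetch_n=16, channel_width_bits=32)
gddr6_per_channel = bytes_per_access(prefetch_n=16, channel_width_bits=16)
gddr6_total = bytes_per_access(prefetch_n=16, channel_width_bits=16, channels=2)

print(gddr5)              # 32
print(gddr5x)             # 64
print(gddr6_per_channel)  # 32
print(gddr6_total)        # 64
```

The totals for GDDR5X and GDDR6 match. The difference is that GDDR6 reaches 64 bytes through two independent 32-byte channels.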

GDDR6 can fetch the same amount of data as GDDR5X but in two separate channels, allowing it to function either like two smaller chips or like a single wider one.

Other than that, GDDR6 also increases the density to 16Gb (2x compared to GDDR5X) and significantly improves bandwidth by raising the per-pin data rate from 12Gbps to up to 16Gbps.

High Bandwidth Memory (HBM)

First popularized by AMD’s Fiji graphics cards, high bandwidth memory or HBM is a low-power memory standard with a wide bus. HBM achieves substantially higher bandwidth compared to GDDR5 while drawing much less power in a small form factor.

HBM adopts clocks as low as 500MHz to conform to a low TDP target and makes up for the loss in bandwidth with a massive bus (usually 4096 bits). AMD’s Radeon RX Vega cards are the best example of an HBM2 implementation in consumer hardware. HBM2 solved the 4GB limit of HBM1, but limited yields coupled with memory shortages prevented AMD from capitalizing on the consumer GPU front.
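To see how a low clock and a very wide bus still add up to high bandwidth, here is a back-of-the-envelope calculation in Python. It assumes data is transferred twice per clock (double data rate), and the helper name `peak_bandwidth_gb_s` is made up for this example. The result is a theoretical peak, not a measured figure.

```python
def peak_bandwidth_gb_s(clock_mhz: float, bus_width_bits: int,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    bytes_per_second = transfers_per_second * bus_width_bits / 8
    return bytes_per_second / 1e9


# 500 MHz clock on a 4096-bit bus, assuming two transfers per clock.
print(peak_bandwidth_gb_s(500, 4096))  # 512.0
```

Under these assumptions, a 4096-bit bus at just 500MHz works out to a theoretical 512 GB/s.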

LPDDR4 vs DDR4

LPDDR4 is the mobile equivalent of DDR4 memory. Compared to DDR4, it offers reduced power consumption but does so at the cost of bandwidth. LPDDR4 has dual 16-bit channels resulting in a 32-bit total bus. In comparison, DDR4 has 64-bit channels.

However, at the same time, LPDDR4 has a prefetch of 16n per channel for a total of (16 words x 16 bits) 256 bits or 32 bytes. That results in a total of 512 bits or 64 bytes across both channels.

DDR4, on the other hand, has two 8n prefetch bank groups per channel. The two bank groups are separate and can execute two independent 8n prefetches. This is done by using a multiplexer to time-division multiplex its internal banks.

LPDDR4 also has a more flexible burst length, ranging from 16 to 32 (256 or 512 bits, i.e. 32 or 64 bytes). DDR4, on the other hand, is limited to a burst length of 8 (or 128 bits), although each bank can perform additional transfers.

To understand what burst length means, you need to know how memory is accessed. When the CPU or cache requests new data, the address is sent to the memory module and the needed row is located, followed by the column (if the row is not already open, a new row is loaded). Keep in mind that there’s a delay after every step. After that, the data is sent across the memory bus, not all at once but in bursts. For DDR4, each burst has a length of 8 (or 16B). With DDR5, it has been increased to as much as 32 (up to 64B). Two transfers of a burst take place per clock cycle, and they happen at the effective data rate.
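The burst sizes quoted in this section can be reproduced by multiplying the burst length by the width of the data path. The Python sketch below assumes a 16-bit data path, which is the width that makes the 16B, 32B and 64B figures in the text line up. The function name `burst_bytes` is a hypothetical helper for this example.

```python
def burst_bytes(burst_length: int, data_width_bits: int) -> int:
    """Bytes moved by one burst: number of transfers x bits per transfer."""
    return burst_length * data_width_bits // 8


# Assumption: 16-bit data path, matching the figures quoted in the text.
print(burst_bytes(8, 16))   # burst length 8  -> 16 bytes
print(burst_bytes(16, 16))  # burst length 16 -> 32 bytes
print(burst_bytes(32, 16))  # burst length 32 -> 64 bytes

# For comparison, a burst of 8 across a full 64-bit channel.
print(burst_bytes(8, 64))   # 64 bytes
```

Since two transfers happen per clock, a burst of length 8 occupies four clock cycles on the bus.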

This design makes LPDDR4 much more power-efficient than standard DDR4 memory, making it ideal for use in smartphones with battery standby times of up to 8-10 hours. Micron’s LPDDR4 RAM tops out the standard with a 2133MHz clock for a transfer rate of 4266 MT/s, while Samsung follows closely behind with a clock of 1600MHz and a transfer rate of 3200 MT/s.
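The relationship between the clock and the transfer rate quoted above is the "double data rate" part of the name: two transfers per clock cycle. Here is a small Python sketch of that conversion, along with the theoretical peak bandwidth over the 32-bit LPDDR4 bus described earlier. Both helper names are hypothetical, and the bandwidth values are theoretical peaks rather than measured results.

```python
def transfer_rate_mt_s(clock_mhz: int) -> int:
    """Double data rate: two transfers per clock cycle."""
    return clock_mhz * 2


def peak_bandwidth_mb_s(rate_mt_s: int, bus_width_bits: int) -> int:
    """Theoretical peak bandwidth in MB/s for a bus of the given width."""
    return rate_mt_s * bus_width_bits // 8


for clock_mhz in (2133, 1600):
    rate = transfer_rate_mt_s(clock_mhz)
    bandwidth = peak_bandwidth_mb_s(rate, bus_width_bits=32)
    print(f"{clock_mhz} MHz -> {rate} MT/s -> {bandwidth} MB/s peak")
```

For the 2133MHz part this gives 4266 MT/s and a theoretical 17064 MB/s over a 32-bit bus, and for the 1600MHz part it gives 3200 MT/s and 12800 MB/s.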