AMD began its Zen-aissance with the first-generation Ryzen CPUs in 2017, proving that it was not to be discounted from the high-performance CPU race just yet. While those CPUs weren’t perfect, they offered high-end desktop core counts at formerly unheard-of prices.

Barely a year later, AMD launched the second-generation Ryzen CPUs. Smarter dynamic voltage and frequency scaling, a lower-latency memory controller, and higher peak clock speeds went a long way to make Ryzen more attractive to gamers and other folks who may not necessarily have had much use for a whole pile of processor cores.

Now, AMD is taking a step rarely seen in the history of CPUs: it’s migrating to a next-generation semiconductor fabrication process ahead of arch-rival Intel. With TSMC’s 7-nm foundries at its disposal, AMD has used this genuine, Moore’s Law-compliant advance in transistor density and performance to introduce a family of processors powered by its Zen 2 architecture.

The specific processors that AMD provided for us are the Ryzen 7 3700X and the Ryzen 9 3900X. You can see the most important specifications for these chips in the chart above. As you’re probably aware, these CPUs are not only the first releases with AMD’s new Zen 2 processor core, but they’re also the first CPUs from AMD assembled using multiple heterogeneous “chiplets” in a single package. Get used to that word—chiplet—because we suspect you’re going to be hearing it quite a bit over the next few years.

Chip(let)s’ challenge

So what is a “chiplet”? From the name, you can infer that it’s a little chip. Specifically, AMD refers to its new bits of silicon as chiplets because they are not traditional monolithic processors that function independently. Instead, one of these third-generation Ryzen CPUs is built from two or three chiplets of two different types. One type of chiplet is the CCD, or “Core Chiplet Die,” and the other is the IOD (the “I/O die”). The actual processing happens on one or more CCDs, while the IOD contains the memory controller, high-speed I/O, and other functions.



A diagram of a Socket AM4 third-generation Ryzen CPU. Source: AMD

This change was likely spurred by a number of factors. Notably, it allows AMD to use the same CCD chiplets for every single product across its range. While the company only explicitly names “client products” such as the ones we’re looking at today, all signs point to AMD re-using these same chiplets as one brick in the foundation for its next-generation Epyc CPUs, code-named Rome.

Along similar lines, there’s nothing stopping the company from sticking these same CCDs in everything from game consoles to ultra-mobile processors. In the end, this move allows AMD to improve yields, density, and scalability at the cost of drastically increased design complexity.

Snarky internet commenters have already pointed out that this is not really all that different from the way things used to work when we had both north- and south-bridge chips on our motherboards. The difference between a distant chip on the motherboard and a separate chiplet on the same package is monumental, though. AMD says that “from the perspective of cache and memory access” these new processors “behave monolithically” aside from 1-2 nanoseconds of wire latency on cache accesses. We’ll see if that’s true when we get to our performance testing, but let’s talk a little bit more about Zen 2 first.

Zen, once more with feeling

Don’t be confused: even though these are the third-generation Ryzen processors, the CPU cores inside are based on the “Zen 2” architecture. That design is itself a major revision of the Zen+ architecture inside the second-generation Ryzen processors. Despite the radical shift in processor design to a chiplet-based paradigm, the most pertinent changes in these processors, for our purposes, are those made to the cores and caches.



A diagram of a Zen 2 processor core. Source: AMD

I’m not a CPU architect, so some of the modifications that AMD made go over my head. Still, there are a couple of changes that are easy to understand. Firstly, AMD doubled the width of the core’s AVX units, allowing them to handle 256-bit floating-point data in a single operation. That change alone doubles the speed of Zen 2 cores on crunchy vector math operations compared to their forebears, and brings the core design up to par in that regard with Intel’s desktop chips. More and more applications are starting to use AVX instructions to accelerate vector math operations, so this is a big deal.
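A little back-of-the-envelope math shows why the wider units matter. This is just a sketch of the arithmetic, not anything specific to AMD’s implementation: a 256-bit vector unit simply processes more single-precision floats per instruction than a 128-bit one.

```python
# Illustrative arithmetic: how many single-precision floats one
# vector instruction can touch at a given vector width.

FLOAT32_BITS = 32

def floats_per_op(vector_width_bits: int) -> int:
    """Single-precision lanes handled by one vector instruction."""
    return vector_width_bits // FLOAT32_BITS

# Zen/Zen+ cracked 256-bit AVX ops into two 128-bit halves;
# Zen 2 executes them natively at full width.
zen_plus = floats_per_op(128)
zen_2 = floats_per_op(256)

print(f"Zen+ : {zen_plus} floats per op")                  # 4
print(f"Zen 2: {zen_2} floats per op")                     # 8
print(f"Peak speedup on pure AVX math: {zen_2 // zen_plus}x")  # 2x
```

That 2x figure is a peak-throughput number, of course; real workloads mix vector and scalar code, so the observed gain will be smaller.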

Even more importantly, AMD doubled the L3 cache on Zen 2. As with Zen and Zen+, Zen 2’s CPU cores are further subdivided within an 8-core CCD into two quad-core Core CompleXes (CCX). On previous-generation designs, each CCX had 8MB of L3 cache to call its own, but on Zen 2 each CCX now has 16MB of L3 cache. Doing simple arithmetic, that gives parts with one CCD 32MB of L3 cache, and parts with two CCDs fully 64MB of L3 cache. It’s possible that AMD could disable part of the cache for future CPUs, but all of the parts launching today are fully enabled.
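The simple arithmetic above can be spelled out explicitly. This is just the cache math from the paragraph, nothing more:

```python
# L3 totals for Zen 2 parts: each 8-core CCD holds two 4-core CCXes,
# and on Zen 2 each CCX carries 16MB of L3 (up from 8MB on Zen/Zen+).

L3_PER_CCX_MB = 16
CCX_PER_CCD = 2

def total_l3_mb(ccd_count: int) -> int:
    """Total L3 cache in MB for a part with the given number of CCDs."""
    return ccd_count * CCX_PER_CCD * L3_PER_CCX_MB

print(total_l3_mb(1))  # 32 -- one-CCD parts like the Ryzen 7 3700X
print(total_l3_mb(2))  # 64 -- two-CCD parts like the Ryzen 9 3900X
```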

AMD specifically says this change was made possible by the move to 7-nm fabrication. That’s easy to believe, because 32MB of cache is a huge chunk of the CCD chiplet. Microprocessor makers don’t dedicate enormous swaths of silicon to specific features without a good reason, and we think the big block of cache is there to help mitigate a memory latency penalty brought on by these CPUs’ unique packaging.

AMD is so impressed by the difference the extra cache makes that it has a marketing buzzword for the feature: AMD GameCache. If you see it in the future, know that it doesn’t really mean anything; it’s just branding born from the idea that Zen 2-based CPUs see a significant boost—14%, according to AMD—in game performance as a direct result of the larger cache.

Zen 2 also brings support for the UEFI Collaborative Power and Performance Control (CPPC2) interface. This is not unlike a vendor-neutral version of Intel’s SpeedShift technology. When CPPC2 is enabled, third-generation Ryzen processors take control of their per-core clock rates away from the OS and manage them themselves. Instead of requiring some 30ms to ramp up CPU core clocks in response to incoming work, the CPU can instead hit top speed within 1-2ms. This is a tremendous benefit for brief and bursty workloads such as webpage rendering and application launches. Because it needs OS support, this feature only works on the very latest version of Windows 10.
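To see why faster ramping helps bursty work in particular, consider what fraction of a short task runs before the core reaches top speed. The ramp times below are the ones AMD quotes; the 50ms burst length is our own hypothetical example:

```python
# Sketch: fraction of a short task that executes before the core
# finishes ramping to its peak clock. Ramp times are AMD's quoted
# figures (~30ms OS-managed vs. 1-2ms with CPPC2); the task length
# is a made-up example of a bursty workload.

def ramp_fraction(task_ms: float, ramp_ms: float) -> float:
    """Fraction of the task spent below peak clocks (crude model)."""
    return min(ramp_ms, task_ms) / task_ms

burst = 50.0  # hypothetical 50ms burst, e.g. rendering a webpage

print(f"OS-managed: {ramp_fraction(burst, 30.0):.0%} of the task below peak")  # 60%
print(f"CPPC2     : {ramp_fraction(burst, 2.0):.0%} of the task below peak")   # 4%
```

For long-running workloads the ramp time is noise either way, which is why this change matters mostly for desktop responsiveness rather than sustained throughput.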

Other major changes in Zen 2 include a double-size micro-op cache (up to 4K ops), an all-new “tagged geometric” branch predictor, an extra address generation unit, and a larger 180-entry register file. AMD also says it has improved fetch and pre-fetch capabilities, and generally improved load/store bandwidth throughout the chip. Overall, Zen 2 looks to shore up the weaknesses of Zen+, and AMD claims that the new chips’ instructions-per-clock (IPC) is up by 15% over second-generation Ryzen. If you’re thirsty for more specific details about the changes in Zen 2, hit up Wikichip Fuse’s in-depth article.