Finding The Right Modes

If you crave lots of cores and tons of PCIe connectivity, like most content creators, multitaskers, and software developers, then Threadripper is for you. It might also be a good fit if you're a gamer who simultaneously runs heavily threaded productivity applications in the background.

The Zeppelin die really is a feat of modern engineering. However, its architecture is dissimilar from anything that came before, creating issues in some software written prior to Ryzen's introduction. AMD worked with game developers to iron out the performance wrinkles we identified at launch, and we've seen big speed-ups in a number of titles as a result.

But expanding beyond Ryzen 7, 5, and 3 into a dual-die configuration adds a new set of challenges for Threadripper.

AMD's fix involves two toggles that affect how the processor operates, giving you modes optimized for whatever workload you're running. These switches create a total of four unique configurations to choose from. So, in a bid to condense the number of combinations, AMD created its Creator and Game modes.

Why Do We Need Game Mode?

The Zeppelin die consists of two quad-core CPU complexes (CCXes) woven together with the Infinity Fabric interconnect. Even in the single-die Ryzen 7/5/3 processors, this creates a layer of latency that affects communication between the CCXes. AMD builds upon that design with its Threadripper processors, leveraging two active Zeppelin dies. As you might imagine, this introduces another layer of Infinity Fabric latency.

Each die has its own memory and PCIe controllers. So, if a thread running on one core needs to access data resident in cache on the other die, it has to traverse the fabric between those dies and incur significant latency. Naturally, the latency penalty between dies is higher than it is between CCXes in the single-die configurations. To combat the potential for performance regression as a result of its "go-wide" approach, AMD devised an interesting solution: it introduced a new memory access switch that you can toggle via motherboard BIOS or the Ryzen Master software. The Local and Distributed settings flip between either NUMA (Non-Uniform Memory Access) or UMA (Uniform Memory Access).

Image 1 of 6 Image 2 of 6 Image 3 of 6 Image 4 of 6 Image 5 of 6 Image 6 of 6

UMA (Distributed) is pretty simple; it allows both dies to access all of the attached memory. NUMA mode (Local) attempts to keep all data for the process executing on the die confined to its directly attached memory controller, establishing one NUMA node per die. The goal is to minimize requests to remote memory attached to the other die.

NUMA works best if programs are designed specifically to utilize it. Even though most desktop PC software wasn't written with NUMA in mind, performance gains are still possible in non-NUMA applications.

Breaking Games

Some games simply won't load up when presented with Threadripper's 32 threads. That's right, AMD's flagship broke a few titles. The same thing will happen to Intel when its highest-end Skylake-X chips surface shortly.

Out of necessity, AMD created a Legacy Compatibility mode that executes a "bcdedit /set numproc XX" command in Windows. This command cuts the thread count in half. Fortunately, due to the operating system's default assignments, the command disables all of the cores/threads on the second die. That has a side benefit of eliminating thread-to-thread communication between disparate dies, solving the constant latency-inducing synchronization between threads during gaming workloads. It also prevents thread migration, lessening the chance of cache misses.

But What To Test?

The two new toggles give us a menu of options to mess with. AMD's Creator preset exposes 16C/32T and leaves the operating system in Distributed memory access mode. Those settings together should yield excellent performance in most productivity applications. Game mode cuts half the threads via compatibility mode and reduces memory and die-to-die latency with Local memory access.

As early as our first Ryzen review, we found that disabling SMT had a positive impact on some games. However, throwing in another setting to consider expands our list of viable configurations. Of course, some options make little sense for gaming. But they might be interesting to test with normal applications. We narrowed our list down based on Infinity Fabric measurements and prior experience with AMD processors.

Configurations Local (NUMA) / Distributed (UMA) Legacy Mode (On/Off) SMT (Multi-Threading) Creator Mode Distributed Off On Game Mode Local On On Custom - Local/SMT Off Local Off Off Custom - Local/SMT On Local Off On

Disabling SMT in Game mode, which already halves core/thread count, would further cut into our available execution resources. So, we left SMT on and both dies active. We also had the option to leave all cores/threads active, but try isolating memory access to the local memory controller. We found that combination particularly useful in threaded games during our initial review. Of course, we also decided to test using the default Creator and Game modes.

Unfortunately, you can't just throw a switch and fire up your favorite game. Each change requires a reboot, chewing up precious time as you save open projects, halt conversations, and try to remember which web browser tabs to relaunch. Again, Game mode also halves thread count, which isn't good for running heavily threaded applications at full speed in the background while you play. Luckily, the breadth of available options should allow us to find a better configuration that leaves more processing resources at the ready.

AMD designed its Threadripper processors primary for use in enthusiast-oriented PCs, which tend to ship with high-resolution monitors. Higher resolutions are typically GPU-bound, reducing the variance between CPU modes. To better highlight performance trends and maximize our limited testing time, though, we chose to benchmark at 1920x1080, allowing us to spot the subtle differences between AMD's settings.

Overall, Game mode is designed to improve memory latency and help avoid excessive die-to-die Infinity Fabric traffic. Let's start with examining how our settings affect the Infinity Fabric and memory subsystem.



MORE: Best CPUs



MORE: Intel & AMD Processor Hierarchy



MORE: All CPUs Content