As you may know, a while back, we came to some difficult realizations about the validity of our methods for testing PC gaming performance. In my article Inside the second: A new look at game benchmarking, we explained why the widely used frames-per-second averages tend to obscure some of the most important information about how smoothly a game plays on a given system. In a nutshell, the problem is that FPS averages summarize performance over a relatively long span of time. It’s quite possible to have lots of slowdowns and performance hiccups during the period in question and still end up with an average frame rate that seems quite good. In other words, the FPS averages we (and everyone else) had been dishing out to readers for years weren’t very helpful, and they were potentially misleading.

To sidestep this shortcoming, we proposed a new approach, borrowed from the world of server benchmarking, that focuses on the actual problem at hand: frame latencies. By considering the time required to render each and every frame of a gameplay session and finding ways to quantify the slowdowns, we figured we could provide a more accurate sense of true gaming performance—not just the ability to crank out lots of frames for high averages, but the more crucial ability to deliver frames on time consistently.
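To make the problem with averages concrete, here is a minimal Python sketch. The two frame-time lists are made-up illustrations, not measurements from our test rigs, and the function name is our own invention for this example. Both hypothetical runs last the same 1.6 seconds and produce the same number of frames, so they share an identical FPS average, yet one of them contains five long hitches.

```python
def average_fps(frame_times_ms):
    """Average frames per second over an entire run of per-frame times (ms)."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds

# Hypothetical run A: every frame takes 16 ms.
steady = [16.0] * 100
# Hypothetical run B: mostly quick 12 ms frames, plus five 92 ms hitches.
hitchy = [12.0] * 95 + [92.0] * 5

fps_steady = average_fps(steady)
fps_hitchy = average_fps(hitchy)

print(f"steady: {fps_steady:.1f} FPS average, worst frame {max(steady):.0f} ms")
print(f"hitchy: {fps_hitchy:.1f} FPS average, worst frame {max(hitchy):.0f} ms")
# steady: 62.5 FPS average, worst frame 16 ms
# hitchy: 62.5 FPS average, worst frame 92 ms
```

The average can't tell these two runs apart, while a per-frame view separates them immediately.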

Some good things have happened since we proposed our new methods. We’ve deployed them in a host of graphics card reviews, and they have proved their worth, helping to uncover some performance deficiencies that would have otherwise remained hidden. In response to your feedback, we’ve refined our means of quantifying the latency picture and presenting the info visually. A few other publications have noticed what we’re doing and adjusted their own testing methods; even more have quietly inquired about the possibility behind the scenes.

Most importantly, you, our readers, have responded very positively to the changes, even though we’ve produced some articles that are much more demanding reading than your average scan-and-skip-to-the-conclusion PC hardware review.

We have largely missed one important consequence of our insights, though. Latency-focused game testing doesn’t just apply to graphics cards; it’s just as helpful for considering CPU performance. We made a quick down payment on exploring this matter in our Ivy Bridge review, but we haven’t done enough to pursue it. Happily, that oversight ends today. Over the summer, we’ve tested 18 different PC processors from multiple generations in a range of games, and we can now share the results with you.

Before your eyes glaze over from the prospect of data overload, listen up. The results we’ve compiled confront a popular myth: that PC processors are now so fast that just about any CPU will suffice for today’s games, especially since so many titles are console ports. I’ve said something to that effect myself more than once. But is it true? We now have the tools at our disposal to find out. You may be surprised by what we’ve discovered.

The contenders

Yes, we really have tested 18 different desktop CPUs for this article. They break down into several different classes, delineated mainly by price. We have a full complement of the latest chips on hand, including several members of Intel’s Ivy Bridge lineup and a trio of AMD FX processors. We’ve tested them against their predecessors in the past generation or two, to cover a pretty big swath of the CPUs sold in the past several years. Allow me to make some brief introductions.

Quite a few PC enthusiasts will be interested in the first class of CPUs tested, which is headlined by the Core i5-3470 at $184. This Ivy Bridge-based quad-core replaces the Sandy Bridge-derived Core i5-2400 at the same price. The newer chip has slightly faster clocks and a lower power envelope—77W instead of 95W—versus the model it supplants. Two generations back, this price range was served by the Core i5-655K, a dual-core chip. The closest competing offering from AMD is the FX-6200 at $155, a six-core part based on the Bulldozer architecture. The FX-6200’s precursor was the Phenom II X4 980, which we’ve also invited to the festivities.

For a little more money, the next class of CPUs promises even higher performance. Intel’s Ivy Bridge offering in this range is the Core i5-3570K for $216, with a fully unlocked multiplier to ease overclocking. The 3570K replaces an enthusiast favorite, the Core i5-2500K, again with slightly higher clock speeds and a lower thermal design power (or TDP). This is also the space where AMD’s top Bulldozer chip, the FX-8150, contends. The legacy options here are a couple of 45-nm chips, the Core i5-760 and the Phenom II X6 1100T.

More relevant for many of us mere mortals, perhaps, are the lower-end chips that sell for closer to a hundred bucks. AMD’s FX-4170 at $135 gets top billing here, since our selection of Intel chips skews to the high end. We think the FX-4170 is a somewhat notable entry in the FX lineup because it boasts the highest base and Turbo clock speeds, even though it has fewer cores. The FX-4170 supplants a lineup of chips known for their strong value, the Athlon II X4 series. Our legacy representative from that series actually bears the Phenom name, but under the covers, the Phenom II X4 850 employs the same silicon with slightly higher clocks.

Finally, we have the high-end chips, a segment dominated by Intel in recent years. We’ve already reviewed the Ivy-derived Core i7-3770K, a $332 part that inherits the spot previously occupied by the Core i7-2600K and, before that, by the Core i7-875K. Also kicking around in the same price range is the Core i7-3820, a fairly affordable Sandy Bridge-E-based part that drops into Intel’s pricey X79 platform. The Core i7-3820’s big brother is a thousand-dollar killer, the Core i7-3960X, the fastest desktop CPU ever.

This selection isn’t perfect, but we think it provides a good cross-section of the market. Face it: the CPU makers offer way too many models these days. The sheer volume of parts is difficult to track without an online reference. If you’re having trouble keeping them sorted, fear not. We’ve broken down the results by class in the following pages, and we’ll summarize the overall picture with one of our famous price-performance scatter plots.

Our testing methods

Our test systems were configured to create as equal a playing field as possible for the CPUs. They all shared the same software, graphics cards, storage, and memory types. Here’s a look at one of the test rigs, mounted in a swanky open-air case.

The system configurations we used were:

Processor: Phenom II X4 850, Phenom II X4 980, Phenom II X6 1100T, AMD FX-4170, AMD FX-6200, AMD FX-8150
Motherboard: Asus Crosshair V Formula
North bridge: 990FX
South bridge: SB950
Memory size: 8 GB (2 DIMMs)
Memory type: AMD Entertainment Edition DDR3 SDRAM
Memory speed: 1600 MT/s
Memory timings: 9-9-9-24 1T
Chipset drivers: AMD chipset 12.3
Audio: Integrated SB950/ALC889 with Realtek 6.0.1.6602 drivers

Processor: Core i5-2400, Core i5-2500K, Core i7-2600K, Core i5-3470, Core i5-3570K, Core i7-3770K
Motherboard: MSI Z77A-GD65
North bridge: Z77 Express
Memory size: 8 GB (2 DIMMs)
Memory type: Corsair Vengeance DDR3 SDRAM
Memory speed: 1600 MT/s
Memory timings: 9-9-9-24 1T
Chipset drivers: INF update 9.3.0.1020, iRST 11.1.0.1006
Audio: Integrated Z77/ALC898 with Realtek 6.0.1.6602 drivers

Processor: Core i7-3960X, Core i7-3820
Motherboard: Intel DX79SI
North bridge: X79 Express
Memory size: 16 GB (4 DIMMs)
Memory type: Corsair Vengeance DDR3 SDRAM
Memory speed: 1600 MT/s
Memory timings: 9-9-9-24 1T
Chipset drivers: INF update 9.2.3.1022, RSTe 3.0.0.3020
Audio: Integrated X79/ALC892 with Realtek 6.0.1.6602 drivers

Processor: AMD A8-3850
Motherboard: Gigabyte A75M-UD2H
North bridge: A75 FCH
Memory size: 8 GB (2 DIMMs)
Memory type: Corsair Vengeance DDR3 SDRAM
Memory speed: 1600 MT/s
Memory timings: 9-9-9-24 1T
Chipset drivers: AMD chipset 12.3
Audio: Integrated A75/ALC889 with Realtek 6.0.1.6602 drivers

Processor: Core i5-655K, Core i5-760, Core i7-875K
Motherboard: Asus P7P55D-E Pro
North bridge: P55 PCH
Memory size: 8 GB (2 DIMMs)
Memory type: Corsair Vengeance DDR3 SDRAM
Memory speed: 1333 MT/s
Memory timings: 8-8-8-20 1T
Chipset drivers: INF update 9.3.0.1020, iRST 11.1.0.1006
Audio: Integrated P55/VIA VT1828S with Microsoft drivers

They all shared the following common elements:

Hard drive: Kingston HyperX SH100S3B 120GB SSD
Discrete graphics: XFX Radeon HD 7950 Double Dissipation 3GB with Catalyst 12.3 drivers
OS: Windows 7 Ultimate x64 Edition, Service Pack 1 (AMD systems only: KB2646060, KB2645594 hotfixes)
Power supply: Corsair AX650

Thanks to Corsair, XFX, Kingston, MSI, Asus, Gigabyte, Intel, and AMD for helping to outfit our test rigs with some of the finest hardware available. Thanks to Intel and AMD for providing most of the processors, as well, of course.

We used the following test applications:

Some further notes on our testing methods:

We used the Fraps utility to record frame rates while playing either a 60- or 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per processor in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.

The test systems’ Windows desktops were set at 1920×1080 in 32-bit color. Vertical refresh sync (vsync) was disabled in the graphics driver control panel.

After consulting with our readers, we’ve decided to enable Windows’ “Balanced” power profile for the bulk of our desktop processor tests, which means power-saving features like SpeedStep and Cool’n’Quiet are operating. (In the past, we only enabled these features for power consumption testing.) Our spot checks demonstrated to us that, typically, there’s no performance penalty for enabling these features on today’s CPUs. If there is a real-world penalty to enabling these features, well, we think that’s worthy of inclusion in our measurements, since the vast majority of desktop processors these days will spend their lives with these features enabled.

The tests and methods we employ are usually publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
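For readers who want to try this sort of analysis on their own captures, here is a rough Python sketch of turning a frame-time log into per-frame render times. It rests on an assumption worth stating loudly: that the log is a two-column CSV with a header row and a cumulative timestamp in milliseconds for each frame, which is how we understand Fraps' frame-time logs to be laid out. The sample data and the function name are hypothetical.

```python
import csv
import io

def frame_times_from_log(csv_text):
    """Convert cumulative per-frame timestamps (ms) into individual frame times.

    Assumes a header row followed by "frame number, cumulative time" rows.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    stamps = [float(row[1]) for row in reader if row]
    # Each frame time is the gap between consecutive timestamps.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Hypothetical four-frame log; real captures hold thousands of rows.
sample_log = """Frame, Time (ms)
1, 0.0
2, 16.5
3, 33.0
4, 83.0
"""

frame_times = frame_times_from_log(sample_log)
print(frame_times)  # [16.5, 16.5, 50.0]
```

Note that four timestamps yield three frame times, since each value is the interval between two consecutive frames.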

The Elder Scrolls V: Skyrim

We tested performance using Fraps while taking a stroll around the town of Whiterun in Skyrim. The game was set to the graphical quality settings shown above. Note that we’re using fairly high quality visual settings, basically the “ultra” presets at 1920×1080 but with FXAA instead of MSAA. Our test sessions lasted 90 seconds each, and we repeated them five times per CPU.





Frame time in milliseconds and the equivalent FPS rate:

8.3 ms: 120 FPS
16.7 ms: 60 FPS
20 ms: 50 FPS
25 ms: 40 FPS
33.3 ms: 30 FPS
50 ms: 20 FPS

The plots above show the time required to render each frame from a single test run. You can click on the buttons to switch to results for different brands and classes of processors. Notice that, because we’re reporting frame times, lower numbers are preferable to higher ones. You can even see some spikes representing long frame times in each plot. For the confused, we’ve included the table on the right, which converts some key frame time thresholds into their FPS equivalents. Also note that faster solutions tend to produce more total frames than the slower ones during the test period.
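The conversion between the two units is just a reciprocal: a frame that takes t milliseconds corresponds to a rate of 1000 / t frames per second. A quick Python sketch (the function name is ours, purely for illustration) reproduces the thresholds listed in the table:

```python
def frame_time_to_fps(frame_time_ms):
    """Frame rate that would result if every frame took frame_time_ms."""
    return 1000.0 / frame_time_ms

thresholds_ms = [8.3, 16.7, 20, 25, 33.3, 50]
for ms in thresholds_ms:
    print(f"{ms} ms -> {frame_time_to_fps(ms):.0f} FPS")
# 8.3 ms -> 120 FPS
# 16.7 ms -> 60 FPS
# 20 ms -> 50 FPS
# 25 ms -> 40 FPS
# 33.3 ms -> 30 FPS
# 50 ms -> 20 FPS
```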

To get a sense of the performance range we’re dealing with, flip between the far-left “AMD budget” and far-right “Intel extreme” buttons a few times. The fastest Intel processors produce very few frame times above 16.7 milliseconds; that is, they churn out a nearly steady stream of frames at roughly 60 FPS or better. Meanwhile, the slowest budget processors see regular spikes into 30 or 40 millisecond territory. That’s not a devastating outcome, but there are substantially more slowdowns than we see with the fastest processors.

All of the CPUs achieve FPS averages near or above the supposedly golden 60 FPS mark. Based purely on common FPS expectations, one might argue that any of them should be more than sufficient for playing Skyrim well.

There are some warning signs here, though. The Intel processors line up as one might expect, roughly in order of age and then model number, with only the slowest legacy dual-core trailing anything from AMD. The three Ivy Bridge parts, in lighter blue, fare well. However, the AMD processors don’t quite behave as expected. A prior-gen CPU, the Phenom II X4 980, takes the top spot among them. The three FX processors, in lighter green, don’t finish in the order expected, either. The low-end FX-4170 is the fastest of the three, although by a slim margin.

In the past, we might have dismissed these results as part of the noise and as not terribly relevant given the fairly high FPS averages. Look what happens when we turn the focus to frame latencies, though.

This is a snapshot of the frame latency picture; it’s the point below which 99% of all frames have been rendered. We’re simply excluding the last 1% of frames, many of them potential outliers, to get a sense of overall smoothness.
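Here is one way such a figure could be computed, sketched in Python. This uses the simple nearest-rank definition of a percentile, which is an assumption on our part for the sake of illustration; other percentile definitions interpolate between neighboring values and may give slightly different numbers. The sample data is invented.

```python
import math

def percentile_frame_time(frame_times_ms, pct=99.0):
    """Nearest-rank percentile: the smallest frame time such that at least
    pct percent of all frames were rendered in that time or less."""
    ordered = sorted(frame_times_ms)
    rank = max(math.ceil(pct * len(ordered) / 100.0), 1)
    return ordered[rank - 1]

# Hypothetical run: 980 quick frames, 15 slower ones, and 5 severe outliers.
sample = [15.0] * 980 + [30.0] * 15 + [120.0] * 5

p99 = percentile_frame_time(sample)
print(p99)  # 30.0
```

The five 120 ms outliers fall in the excluded last 1% of frames, so they don't set the result, while the cluster of 30 ms frames does.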

Again, the Intel processors perform well. All but one of them render the great majority of frames in under 23 milliseconds, which translates to a steady-state frame rate of roughly 43 FPS or better. There is some reshuffling in the move from FPS average to a latency-sensitive metric (the “big iron” Core i7-3820 with its large cache and quad memory channels moves up the ranks, for instance), but the changes are what one might expect, given the hardware in question.

Meanwhile, the AMD FX processors suffer in this comparison. The FX-8150, which is ostensibly AMD’s top-of-the-line desktop processor, trails two older Phenom IIs and the FX-4170. The FX-6200 falls behind the A8-3850, a budget APU based on AMD’s prior CPU microarchitecture. The absolute numbers aren’t stellar, either. The FX processors are cranking out 99% of the frames in 33 milliseconds or so, which translates to a steady rate of 30 FPS—much lower than even the slower Intel processors.

What’s the problem? The broader latency curve suggests some answers.





The “tail” of the curve for the AMD processors is telling. Although the FX chips keep pace with the Phenom II X6 1100T in the first 95% or so of frames rendered, their frame times spike upward to meet the slower A8-3850 budget APU and Phenom II X4 850 in the last ~5% of frames. In the most crucial function of gaming performance, latency avoidance, the more expensive FX processors essentially perform like low-end CPUs.
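The latency curve is simply the same percentile calculation repeated across a range of percentiles. The Python sketch below, again using a nearest-rank percentile as an assumed definition and made-up data, shows how two hypothetical CPUs can look identical through the first 95% of frames and then part ways in the tail.

```python
import math

def latency_curve(frame_times_ms, percentiles):
    """Return (percentile, frame time) pairs using nearest-rank percentiles."""
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    curve = []
    for pct in percentiles:
        rank = max(math.ceil(pct * n / 100.0), 1)
        curve.append((pct, ordered[rank - 1]))
    return curve

# Two hypothetical CPUs: identical for 95% of frames, different in the tail.
cpu_a = [14.0] * 95 + [18.0] * 5
cpu_b = [14.0] * 95 + [35.0] * 5

points = [50, 90, 95, 99]
curve_a = latency_curve(cpu_a, points)
curve_b = latency_curve(cpu_b, points)

print(curve_a)  # [(50, 14.0), (90, 14.0), (95, 14.0), (99, 18.0)]
print(curve_b)  # [(50, 14.0), (90, 14.0), (95, 14.0), (99, 35.0)]
```

Plotting those pairs with the percentile on the horizontal axis gives the kind of curve discussed here, with the tail at the far right.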

Why? I think the answer is suggested by the relatively strong performance of the FX-4170 compared to the FX-6200 and FX-8150. As we noted, the FX-4170 actually has the highest base and Turbo clock speeds of the FX lineup. That means it likely has the highest per-thread performance of any FX chip, and that appears to translate into better latency mitigation. (I also suspect the FX-4170 spends more time operating near its peak Turbo speed, since it only has to fit two “modules” and four cores into the same 125W thermal envelope as the higher-end FX chips.)

Looks to me like the FX CPUs have an Amdahl’s Law problem. Even though they have a relatively large number of cores for their given product segments, their per-thread performance is fairly weak. The Bulldozer architecture combines relatively low instruction throughput per cycle with clock speeds that aren’t as high as AMD probably anticipated. That adds up to modest per-thread performance, and even with lots of cores on hand, the execution speed of a single thread can limit an application’s throughput.

Thus, the FX-6200 and FX-8150 processors aren’t as well-suited to our Skyrim test scenario as their predecessors in the Phenom II lineup. Only the FX-4170 outperforms the CPU it replaces, the Phenom II X4 850, whose lack of L3 cache and modest 3.3GHz clock frequency aren’t doing it any favors.
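For those unfamiliar with Amdahl’s Law, it says that if only a fraction p of a workload can be spread across n cores, the best possible speedup is 1 / ((1 - p) + p / n). The Python sketch below plugs in a purely hypothetical parallel fraction of 50%; we have not measured how much of Skyrim’s work actually runs in parallel, so treat the numbers as an illustration of the shape of the problem and nothing more.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical workload in which half of the work is stuck on one thread.
for cores in (4, 6, 8):
    print(f"{cores} cores: {amdahl_speedup(0.5, cores):.2f}x")
# 4 cores: 1.60x
# 6 cores: 1.71x
# 8 cores: 1.78x
```

Under that assumption, doubling the core count from four to eight buys only about 11% more throughput, and no number of cores can ever exceed 2x. Raising the speed of each individual thread, by contrast, helps both the serial and parallel portions.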

Do the results from our new methods mean that some AMD processors are inadequate for Skyrim? Not quite. One of our key metrics for frame latency problems involves adding up all of the time spent working on frames above a certain time threshold. We consider it a measure of “badness,” giving us a sense of how severe the slowdowns are. We typically start at a threshold of 50 milliseconds, which translates to 20 FPS, since taking longer than that to produce a frame is likely to interrupt the illusion of motion. The thing is, none of the CPUs we tested spends any real time above the 50 ms threshold. They’re all adequate enough to deliver relatively decent gameplay. In fact, we’ve omitted the graph for this threshold, since it doesn’t show much.

However, if we crank down the tolerance to 16.7 milliseconds, the equivalent of 60 FPS, then the differences become apparent. The FX processors again fare poorly, relatively speaking. If you covet glassy smoothness, where the system pumps out frames consistently at low latencies close to your display’s refresh rate, then you’ll want a newer Intel processor. In this scenario, no entry in the FX lineup comes as close to delivering that experience as a Phenom II X4 980 or a Core i5-655K.
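A minimal Python sketch of this kind of “badness” calculation follows. One interpretation is assumed here and should be flagged: for each frame that runs past the threshold, we count only the portion of its time beyond the threshold, so a 55 ms frame contributes 5 ms at a 50 ms cutoff. The sample frame times and the function name are hypothetical.

```python
def time_beyond_threshold(frame_times_ms, threshold_ms):
    """Total milliseconds spent past threshold_ms, summed over slow frames.

    Assumption: only the excess over the threshold counts for each frame.
    """
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

# Hypothetical frame times in milliseconds.
sample = [12.0, 18.0, 55.0, 70.0, 16.0]

beyond_50 = time_beyond_threshold(sample, 50.0)
beyond_16_7 = time_beyond_threshold(sample, 16.7)

print(f"beyond 50 ms:   {beyond_50:.1f} ms")    # beyond 50 ms:   25.0 ms
print(f"beyond 16.7 ms: {beyond_16_7:.1f} ms")  # beyond 16.7 ms: 92.9 ms
```

Tightening the threshold from 50 ms to 16.7 ms pulls the 18 ms frame into the tally and greatly increases the contribution from the two long frames, which is why differences that vanish at the looser cutoff show up at the tighter one.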

Batman: Arkham City

Now that we’ve established our evil methods, we can deploy them against Batman. Again, we tested in 90-second sessions, this time while grappling and gliding across the rooftops of Gotham in a bit of Bat-parkour. Again, we’re using pretty decent image quality settings at two megapixels; we’re just avoiding this game’s rather pokey DirectX 11 mode.

In our test session, we’re moving rapidly through a big swath of the city, so the game engine has to stream in more detail periodically. You can see the impact in the frame time plots: every CPU shows occasional spikes throughout the test run.





The severity of the spikes is lessened by having a faster CPU, though. Once more, the contrast between plots exposed by the far-left and far-right buttons is instructive.

Although they are very different ways of counting, the FPS average and the 99th percentile frame time largely appear to agree here. One distinction worth making is that the latency-focused metric is a tougher judge. Although the FPS averages range up to almost 90 FPS, the 99th percentile frame times don’t reach down to 16.7 milliseconds, so none of the processors provide a near-steady stream of frames at 60 FPS.





The broader latency picture in this test scenario is a good one. That is, frame times remain nice and low up until the last few percentage points, and none of the processors show a “tail” that spikes upward suddenly before the others. Yes, the Intel CPUs are generally quicker, but the differences are fairly minor overall.

In this case, our measure of “badness” provides the real distinction between the faster and slower CPUs. None of the Intel CPUs from the Ivy or Sandy Bridge generations spends any substantial amount of time working on long-latency frames. Even the Core i5-760 avoids crossing the 50-ms threshold for long. The AMD processors, however, all spend at least a tenth of a second in that space—not long in the context of a 90-second test run, to be sure, but enough that one might feel a hitch here or there. We’re left to ponder the fact that the flagship FX-8150 doesn’t avoid slowdowns as well as a legacy Intel dual-core, ye olde Core i5-655K.

Crysis 2

Our test session in Crysis 2 was only 60 seconds long, mostly for the sake of ensuring a precisely repeatable sequence.





Notice the spike at the beginning of the test run; it happens on each and every CPU. You can feel the hitch while playing. Apparently the game is loading some data for the area we’re about to enter or something along those lines.

Here’s a closer look at the spike on a subset of the processors. The duration of the pause appears to be at least somewhat CPU dependent. The A8-3850 APU takes nearly a third of a second to complete the longest frame, while the Ivy-based 3770K needs less than half of that time.

The FPS averages and 99th percentile results nearly mirror each other once again, and we appear to be running into a potential GPU bottleneck on the fastest CPUs, which are bunched together pretty closely in both metrics. Fortunately for AMD, the FX processors don’t seem to have any trouble outperforming their predecessors in this test scenario. All of them remain slower than the Intel chips from two generations back, though.





Whoa. Check out the tails in those Intel latency curves. Notice how the Core i5-2400’s tail spikes upward just a little before the 2500K’s, which spikes a little before the 2600K’s. The same pattern is evident for the three Ivy-based CPUs, too. I’m sorry, but that is awesome. Remember, I played through these test sessions manually, five per CPU, attempting, but never quite managing, to play exactly the same way. To see our data line up like this by CPU speed grade is ridiculously gratifying.

What it tells us is that there are measurable differences between the Intel CPUs’ performance in the last 5-7% of frames rendered. The faster processors do a better job of keeping frame latencies low and thus gameplay smooth.

Some proportion of the frames in this scenario present difficulty for each of the CPUs, whether it’s the final ~3% on the fastest processors, the final ~15% on the Core i5-760, or the final ~35% on the FX-6200. The tails for the different chips vary in shape quite a bit, and if you look at the frame time plots above, you can see the intermittent spikes that represent those frames. The spikes are smaller and less frequent on the faster processors. To keep things in perspective, though, even the slowest AMD chips deliver 99% of their frames in under 33 milliseconds, or over 30 FPS.

When we focus directly on the severity of slowdowns, the two top FX processors again fall behind their Phenom II counterparts, although only by the slimmest of margins. Again, the AMD processors are up to the task of running this game, but they perform similarly to Intel’s older, low-end parts.

Meanwhile, we should point out a trend on the Intel side of the aisle, which is the ongoing strong performances in our latency-related metrics for the Ivy Bridge processors. Here, the relatively affordable Core i5-3470 wastes less time on long-latency frames than the $1K Core i7-3960X does. Yeah, we’re splitting eyelashes, but it’s true. The tweaked microarchitecture in Ivy Bridge counts for something, and I suspect the 22-nm chips also spend a little more time resident at their peak Turbo clock frequencies.

Battlefield 3

As with Crysis 2, our BF3 test sessions were 60 seconds long to keep them easily repeatable. We tested at BF3’s high-quality presets, again at 1920×1080.






Yikes. Here’s an example where the commonly held belief about PC games and CPU performance looks to be correct. None of the processors appear to struggle much at all in delivering nice, low frame times throughout the test run.

Wow. Every processor down to the A8-3850 delivers 99% of all frames in 16.7 milliseconds or less. That adds up to a nearly uninterrupted stream of frames at 60 FPS.





Yes, we can still discern fine-grained differences between the CPUs with a really tight threshold, but there’s really very little “badness” to be sifted out. Also, in a ray of light for AMD, the FX-8150 performs relatively well here. This is one of those cases, though, when nearly any modern CPU will do.

Multitasking: Gaming while transcoding video

A number of readers over the years have suggested that some sort of real-time multitasking test would be a nice benchmark for multi-core CPUs. That goal has proven to be rather elusive, but we think our new game testing methods may allow us to pull it off. What we did was play some Skyrim, with a 60-second tour around Whiterun, using the same settings as our earlier gaming test. In the background, we had Windows Live Movie Maker transcoding a video from MPEG2 to H.264. Here’s a look at the quality of our Skyrim experience while encoding.





Several things happen when we add a background video encoding task to the mix. For one, the Core i7-3960X, with its six cores and 12 threads, reasserts its place at the top of the charts. Although Skyrim alone may not need all of its power, the 3960X better maintains low frame latencies when multitasking. The FX-8150’s additional cores come in handy here, as well, as it surpasses the lower-end FX parts. Unfortunately, the 8150 still can’t quite match two of the Phenom IIs that preceded it.





The 3960X’s latency curve is clearly differentiated from the 3770K’s here, while the dual-core i5-655K struggles mightily, falling behind all of the AMD processors, none of which have only two cores.

With most of these CPUs, you can play Skyrim and encode video in the background with relatively little penalty in terms of animation fluidity. We’ve dialed back our threshold to 50 ms, and as you can see, all of the newer Intel processors avoid serious slowdowns entirely. The AMD chips aren’t bad, either, overall. Somewhat surprisingly, the Phenom II X4 980 outperforms the X6 1100T, despite having two fewer cores, presumably thanks to its higher clock speed.