Hmm. Where to begin? Probably early last month, when we discovered some performance problems with the Radeon HD 7950 in recent games using our newfangled testing methods, which focus on frame rendering times rather than simple FPS averages. Eventually, AMD acknowledged the problem and pledged to address the issues of high-latency frames in a series of driver updates.

Happily, we didn’t have to wait long for the first update in that series. Within a day or two, AMD provided us a Catalyst 13.2 beta driver that includes fixes intended to improve frame rendering times in several of the DirectX 9-based games in our test suite: Skyrim, Borderlands 2, and Guild Wars 2. Our report on this driver was delayed by a couple of factors, including our attendance at CES and an apparent incompatibility between this beta driver and our Sapphire 7950 card.

We still haven’t figured out the problem with the Sapphire card, but we ultimately switched to a different 7950, the MSI R7950 OC, which allowed us to test the new driver. The results on the following pages come from the MSI card. As you’ll see, its performance under the Catalyst 12.11 beta drivers is very similar to what we saw from the Sapphire, with the same latency profile and the same intermittent spikes caused by high-latency frames.

We have several interesting developments to discuss, including the nature of the changes AMD has made to the Cat 13.2 beta driver, but first, let’s take a look at our test results, which should help illustrate some of our points.

Since it’s been a while and one of the cards has changed, we’ll do a quick recap of our test configs before moving on.

Our testing methods

As ever, we did our best to deliver clean benchmark numbers. Our test systems were configured like so:

Processor: Core i7-3820
Motherboard: Gigabyte X79-UD3
Chipset: Intel X79 Express
Memory size: 16GB (4 DIMMs)
Memory type: Corsair Vengeance CMZ16GX3M4X1600C9 DDR3 SDRAM at 1600MHz
Memory timings: 9-9-11-24 1T
Chipset drivers: INF update 9.3.0.1021, Rapid Storage Technology Enterprise 3.5.0.1101
Audio: Integrated X79/ALC898 with Realtek 6.0.1.6662 drivers
Hard drive: Corsair F240 240GB SATA
Power supply: Corsair AX850
OS: Windows 8

Graphics cards tested:

Zotac GTX 660 Ti AMP!: driver revision GeForce 310.54 beta; GPU base core clock 1033 MHz; GPU boost clock 1111 MHz; memory clock 1652 MHz; memory size 2048 MB
MSI R7950 OC: driver revision Catalyst 12.11 beta 8; GPU base core clock 880 MHz; GPU boost clock –; memory clock 1250 MHz; memory size 3072 MB
MSI R7950 OC: driver revision Catalyst 13.2 beta; GPU base core clock 880 MHz; GPU boost clock –; memory clock 1250 MHz; memory size 3072 MB

Thanks to Intel, Corsair, and Gigabyte for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

In addition to the games, we used the following test applications:

We used the Fraps utility to record frame rates while playing either a 60- or 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per video card in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

The Elder Scrolls V: Skyrim

We’ll start with Skyrim since the outdoor area we tested proved to be particularly difficult for the Radeon, with a number of hiccups disrupting the flow of the animation. This test scenario was the subject of our slow-motion video comparison illustrating the problem. Below is a video showing the route we took during each test run.





Above are plots of the frame rendering times for each card throughout one of our five test runs. You can click on the buttons to switch between the Radeon HD 7950 with the two driver revisions and the GeForce GTX 660 Ti. (And yes, I have changed the look of our plots a bit. Some folks liked the idea of thicker lines, but I worry that they visually overstate the presence of latency spikes. Squint if you must, but I think this is a better way.)

You can see the difference between Cat 12.11 and 13.2 quite easily in these plots. With Cat 12.11, the Radeon’s frame times look more like a cloud than a line, and there are intermittent spikes to 50 milliseconds or more. Switch over to 13.2, and the line becomes much tighter, with less overall variance and only an occasional spike above 20 ms. The GTX 660 Ti’s line looks tighter still, but it also includes a handful of higher-latency frames.





We can zoom in on a small portion of the test run in order to get a closer look at those frame rendering times. You can see how the 7950’s frame times have grown more consistent—and, notably, the high-latency frames have been squelched—with the driver update.

The improvement here is easily perceptible while play-testing the two driver revs. The motion feels much smoother overall with Catalyst 13.2, and little things, like the plants swaying as you walk, become appreciably more fluid.

Interestingly enough, Cat 13.2’s improvements don’t move the FPS average even a single frame per second. Look at the frame time plots and you can see why: with both drivers, the Radeon HD 7950 produces about 4250 frames over the course of our 60-second test run. Thus, they both average out to the same number of frames produced per second. That fact may tell you all you need to know about the value of FPS averages.
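To make that arithmetic concrete, here is a minimal Python sketch. The two traces are made-up numbers chosen for illustration, not the Fraps data from these tests, and the function name is hypothetical. Both traces contain the same number of frames over the same elapsed time, so they report the same FPS average even though one of them includes two hitches well beyond 50 ms.

```python
def average_fps(frame_times_ms):
    """Average FPS: frames rendered divided by total elapsed seconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds

# Hypothetical one-second traces of 50 frames each (illustrative values only).
smooth = [20] * 50              # every frame takes 20 ms
spiky = [18] * 48 + [68] * 2    # mostly quick frames plus two 68 ms hitches

print(average_fps(smooth), max(smooth))
print(average_fps(spiky), max(spiky))
```

The average cannot tell these two traces apart, while the worst frame time more than triples.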

For what it’s worth, the “minimum FPS” results that some benchmarks report aren’t much help, either, because they average frame times over one-second intervals, and that’s just too long a time window to capture important differences. In this test, for instance, the median FPS minimum from five runs with Cat 13.2 is 59 FPS. The same figure for Cat 12.11 is 58 FPS. Yet the slowdowns with Cat 12.11 are very real and perceptible.
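The sketch below shows one plausible way a benchmark might compute a "minimum FPS" figure, by counting the frames completed in each one-second window and reporting the lowest count. This is an assumption about the general approach, not a description of how Fraps or any particular tool works, and the traces are invented for illustration.

```python
def min_fps_one_second_windows(frame_times_ms):
    """Lowest number of frames completed in any full one-second window.

    Assumes the trace lasts at least one full second.
    """
    counts = {}
    elapsed_ms = 0
    for frame_time in frame_times_ms:
        elapsed_ms += frame_time
        window = int(elapsed_ms // 1000)   # window in which this frame finished
        counts[window] = counts.get(window, 0) + 1
    full_windows = int(elapsed_ms // 1000)
    return min(counts.get(w, 0) for w in range(full_windows))

# Hypothetical two-second traces (illustrative values only).
steady = [16] * 125                             # uniform 16 ms frames
hitchy = ([15] * 30 + [55] + [15] * 33) * 2     # one 55 ms hitch per second

print(min_fps_one_second_windows(steady), max(steady))
print(min_fps_one_second_windows(hitchy), max(hitchy))
```

In this contrived case the trace with the hitches actually posts the slightly higher minimum, because the quick frames around each hitch make up for it within the one-second window.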

Happily, we can capture the impact of the improvements in Cat 13.2 with our latency-focused metrics, including the 99th percentile frame time. This number is just the cutoff point below which 99% of all frames were rendered. The lower the number, the better the overall frame rendering picture for the solution being tested. With the new driver, the Radeon HD 7950 comes very close to matching the GeForce GTX 660 Ti.
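For readers who want to compute a similar figure from their own frame time logs, here is a minimal sketch. It uses the nearest-rank percentile definition, which is an assumption on our part; the article does not say which percentile method was used, and other methods interpolate between samples. The function name and the sample traces are hypothetical.

```python
def percentile_frame_time(frame_times_ms, pct=99):
    """Nearest-rank percentile: the smallest frame time such that at least
    pct percent of all frames were rendered at or below it."""
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    rank = max(1, (n * pct + 99) // 100)   # integer form of ceil(n * pct / 100)
    return ordered[rank - 1]

# Hypothetical traces of 100 frames each (illustrative values only).
one_hitch = [16] * 99 + [60]          # a single slow frame out of 100
two_hitches = [16] * 98 + [60] * 2    # slow frames make up 2% of the run

print(percentile_frame_time(one_hitch))
print(percentile_frame_time(two_hitches))
```

A lone outlier falls outside the 99% cutoff and is ignored, but once slow frames make up more than 1% of the run, they set the result.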

A look at the “tail” of the overall latency curve even better demonstrates the improvement with Catalyst 13.2. The new driver is quicker for the final 25% of the frames rendered, and it’s substantially better for the last 5-7% of frames that prove most time-consuming to render. As a result, the GeForce’s advantage in this test has essentially vanished.
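A latency curve of this kind can be sketched by evaluating the frame time at a range of percentiles rather than at the 99th alone. The code below is a rough illustration under the same nearest-rank assumption; the function name and both traces are invented, and the numbers are not measurements from either driver.

```python
def latency_curve(frame_times_ms, percentiles=range(50, 100)):
    """Return (percentile, frame time) pairs using the nearest-rank method."""
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    curve = []
    for pct in percentiles:
        rank = max(1, (n * pct + 99) // 100)   # integer form of ceil(n * pct / 100)
        curve.append((pct, ordered[rank - 1]))
    return curve

# Hypothetical traces of 200 frames each (illustrative values only).
older = [14] * 150 + [22] * 36 + [45] * 14   # the slowest 7% of frames take 45 ms
newer = [15] * 150 + [18] * 36 + [21] * 14   # slightly slower median, much tighter tail

older_curve = dict(latency_curve(older))
newer_curve = dict(latency_curve(newer))
for pct in (50, 75, 90, 95, 99):
    print(pct, older_curve[pct], newer_curve[pct])
```

In this made-up example the two traces are nearly indistinguishable through the 75th percentile and diverge sharply only in the last few percent of frames, which is the region the tail of the curve is meant to expose.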

Our final latency-sensitive metric tracks frames that take an especially long time to produce. The goal is to get a sense of “badness,” of the severity of any slowdowns encountered during the test session. We add up any time spent rendering beyond a threshold of 50 milliseconds. (Frame times of 50 ms are equivalent to a frame rate of 20 FPS, which is awfully slow.) For instance, if a frame takes 70 milliseconds to render, it will contribute 20 milliseconds to our “badness” index. The higher this index goes, the more time we’ve spent waiting on especially high-latency frames, and the less fluid the game animation has been.
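The calculation described above is simple enough to express in a few lines. This is a minimal sketch with a hypothetical function name and a made-up trace; only the 50 ms threshold and the 70 ms example come from the text.

```python
def time_beyond_threshold(frame_times_ms, threshold_ms=50):
    """Sum the portion of each frame time that exceeds the threshold."""
    return sum(max(0, frame_time - threshold_ms) for frame_time in frame_times_ms)

# A 70 ms frame contributes 20 ms, matching the example in the text.
print(time_beyond_threshold([70]))

# Hypothetical trace: only the 70 ms and 55 ms frames exceed the threshold.
trace = [16, 18, 70, 17, 50, 55]
print(time_beyond_threshold(trace))
```

Frames at or below the threshold contribute nothing, so a run with no frames over 50 ms scores zero regardless of its average frame rate.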

With Cat 13.2, the Radeon HD 7950 delivers fluid animation throughout the course of our test scenario, with only a tiny 10-millisecond blip spent beyond our threshold. That outcome tracks well with our subjective sense that Skyrim smoothness has increased substantially.

Borderlands 2

As you’ll note, this session involves lots of fighting, so it’s not exactly repeatable from one test run to the next. However, we took the same path and fought the same basic contingent of foes each time through. The results were pretty consistent from one run to the next, and the final numbers we’ve reported are the medians from five test runs.

We used the game’s highest image quality settings at the 27″ Korean monitor resolution of 2560×1440.





Again, the improvement from Cat 12.11 to Cat 13.2 is easily discernible in the raw frame time plots. Those quasi-regular frame time spikes with Cat 12.11 don’t mean Borderlands 2 is unplayable at these settings on the Radeon HD 7950. The spikes are generally no larger than 40 ms, so they’re not a huge hindrance to fluidity. Subjectively, however, those spikes contribute an unsettled feeling to the gameplay, a certain strangeness to the movement in this game. The new driver eliminates that pattern of quasi-regular spikes.

Although the change barely registers on the FPS average, Cat 13.2 fares better in our 99th percentile frame time metric.

The 7950 hasn’t quite caught the GTX 660 Ti overall, but it has improved greatly with the new beta driver, particularly in the last 5-7% of frames rendered.

Since the 7950 didn’t spend much time beyond our 50-ms “badness” threshold before, there’s not much improvement in this number with Catalyst 13.2.

Guild Wars 2

Guild Wars 2 has a snazzy new game engine that will stress even the latest graphics cards, and I think we can get reasonably reliable results if we’re careful. My test run consisted of a simple stroll through the countryside, which is fairly repeatable. I didn’t join any parties, fight any bandits, or try anything elaborate like that, as you can see in the video below.





The overall latency picture improves nicely from Cat 12.11 to Cat 13.2 once more. With the new driver, the 7950’s performance becomes incredibly similar to the GTX 660 Ti’s.

Amazingly, the FPS average for Catalyst 13.2 is lower than for 12.11, even though the newer driver’s latency profile has obviously improved. That’s an unusual outcome; we’d generally expect latency-focused improvements to yield slight gains in FPS averages, as well. Given the choice, though, we’d take the more consistent frame times of the new driver over the higher FPS average of the older one.

Our 99th percentile frame time result puts things right: the 13.2 beta is clearly a better performer than 12.11, and the Radeon and the GeForce have become very evenly matched thanks to the updated driver.

The rest of our latency-oriented metrics agree. With Cat 13.2, the 7950 essentially ties the GeForce and provides smoother, more consistent frame rendering times.

What’s changed in Catalyst 13.2 beta—and what hasn’t

AMD appears to be making good on its promise to address frame latency issues via driver updates. Andrew Dodd, AMD’s Catalyst guru, tells us a variant of the Catalyst 13.2 beta driver we tested will be released via AMD’s website next week. That should be a nice first step toward shoring up Radeon performance compared to the competition.

We asked Mr. Dodd whether the changes included in this beta driver would impact performance generally in DirectX 9 applications or only in the three DX9-based games we tested. We also inquired about whether the previously mentioned buffer size tweak for Borderlands 2 was included. Here’s his answer:

Basically the fix was different per application (for the DX9 applications) – each fix involved tweaking various driver parameters. In the case of Borderlands 2, yes it did involve tweaking the buffer size.

So what we have in Cat 13.2 is a series of targeted tweaks that appear to work quite well for the games in question. However, Dodd says additional improvements are coming down the pike, including a rewrite of the software memory manager for GPUs based on the Graphics Core Next architecture that should bring a more general improvement:

The driver does not yet contain the new video memory manager. Our intention is release a new driver in a few weeks, which does include the new Video memory manager, which will help resolve latency issues for DX11/DX10 applications.

We look forward to the updates and to the improved gaming experience that Radeon users should be able to enjoy as a result.

Frame latencies: a new frontier

One of the tougher questions we had for AMD, in the wake of our discovery of these latency issues and their subsequent move to fix them, was simply this: how can we know that we won’t see similar problems in the future? Dodd addressed this question directly in our correspondence, noting that AMD will be changing its testing procedures in the future in order to catch frame latency problems and prevent them:

Up until this point we had mostly assumed that there were occasional flickers in frame rate, but we had thought these were related to the fact that modern games mostly have streaming architectures and limitations of scheduling in the OS. We definitely will start regular measurements to ensure we track improvements, and stop regressions. Long term, we want to work with game developers and Microsoft to ensure these kinds of latency issues don’t keep cropping up.

That’s exactly the sort of answer we want to hear, and we’ll be watching and testing future Radeon drivers and GPUs in order to see how well AMD executes on that plan.

That answer also blows up one of the assumptions that we’ve held since we published our first Inside the second article. We’d assumed that, although we were among the first to conduct a frame-by-frame analysis of game performance in public, such analyses had been happening behind the scenes at the big GPU makers as a matter of course for a long time. Our interactions with AMD, Nvidia, and others in the industry have since changed our view.

In fact, at CES last week, I was discussing the latest developments with Nvidia’s Tom Petersen, and he told me there was one question I failed to ask in my investigation of Radeon-versus-GeForce frame latencies: why did Nvidia do so well? Turns out, he said, Nvidia has started engineering its drivers with an eye toward smooth and consistent frame rendering and delivery. I believe that effort began at some point during the Fermi generation of GPUs, so roughly two years ago, max. Clearly, that focus paid dividends in our comparison last month of the GTX 660 Ti and the Radeon HD 7950.

From what I’ve gathered, in the past, developers have used nifty tools like the one we used to dissect Crysis 2 tessellation weirdness. These tools can show you the time required by each stage of the process of rendering a frame. Reducing those time slices has often been the focus of optimization efforts. Meanwhile, the performance labs at GPU makers and elsewhere have largely focused on FPS-based benchmarks to provide a sense of overall comparative performance. It seems efforts to bridge the gap between these two domains, to look at the overall frame latency picture and to ensure consistency there, have only recently ramped up.

Of course, AMD’s participation is crucial to the success of such efforts. We look forward to seeing what sort of benefits the next round of Catalyst driver updates can provide—and to an ongoing conversation about how best to handle the complex collection of issues this new focus has unearthed.

You can measure my response times on Twitter.