With the release of the Radeon R9 290 and 290X, AMD upended the high-end graphics market by offering performance competitive with Nvidia’s existing products at substantially lower prices. The new Radeons didn’t just improve the value proposition, either. The R9 290X captured the overall GPU performance crown, wresting it away from the GeForce GTX 780 and Titan by the slimmest of margins. Such little differences are magnified in the world of high-end graphics, where the spoils—and sales—often go to the victor. After all, if you’re forking over something north of 500 bucks for a graphics card, bragging rights are probably involved to some extent.

You can imagine, then, how things went a bit pear-shaped when folks started reporting that Radeon R9 290X cards purchased at retail didn't seem to perform as well as the review units AMD supplied to the press.

Whoops. Sounds bad, doesn’t it? How can that be?

Well, from here, things get kind of complicated. Although the retail R9 290-series cards appear to have the same basic hardware and specifications as the review samples, the zillion-dollar question is what happens during everyday operation. You see, like the Turbo Boost mechanism in Intel CPUs, the Radeons’ PowerTune algorithm adjusts clock speed dynamically, from moment to moment, in response to current chip temperatures, the GPU workload, and the video card’s pre-defined power limits. For one reason or another, folks found that at least some retail R9 290-series cards seemed to operate at lower clock speeds than those initial review units.
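To make the dynamic behavior concrete, here's a toy sketch of a PowerTune-style control loop. The step size, the control rule, and the power numbers are illustrative assumptions, not AMD's actual algorithm; the only point is that the operating clock becomes a moment-to-moment function of temperature and power limits rather than a fixed spec.

```python
# Toy PowerTune-style clock controller. All constants here are assumptions
# for illustration, except the 1000MHz peak and the 727MHz baseline we
# observed in testing. This is not AMD's real algorithm.

TEMP_TARGET_C = 95      # assumed thermal target
CLOCK_MAX_MHZ = 1000    # the 290X's advertised "up to" clock
CLOCK_MIN_MHZ = 727     # the apparent baseline clock
STEP_MHZ = 13           # arbitrary adjustment step for this sketch

def next_clock(current_mhz, gpu_temp_c, board_power_w, power_limit_w):
    """Pick the next clock step from current temperature and power draw."""
    if gpu_temp_c >= TEMP_TARGET_C or board_power_w >= power_limit_w:
        # Over a limit: back the clock off one step, but not below baseline.
        return max(CLOCK_MIN_MHZ, current_mhz - STEP_MHZ)
    # Headroom available: creep back toward the peak clock.
    return min(CLOCK_MAX_MHZ, current_mhz + STEP_MHZ)
```

Run in a loop against live sensor readings, a rule like this produces exactly the bouncing clock plots shown later in this article: the steady-state frequency depends on how much heat each individual card generates and how well its cooler removes it.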

AMD identified one apparent cause of the problem pretty quickly: the blowers on some retail cards weren’t spinning as fast as expected, and the reduced cooling capacity resulted in lower clock speeds. This explanation was quite plausible. Heck, we’d already seen how an increase in blower RPM can improve the R9 290X’s performance when we switched it into “uber” fan mode during our initial review. One can imagine that different blowers might not respond to increases in voltage quite the same way. If blower RPM were varying substantially from card to card, that might well explain the clock speed differences.

AMD soon issued a fix in the form of a software update. The Catalyst 13.11 beta 9v2 driver sought to equalize blower speeds from card to card by monitoring RPM directly, thus hopefully improving performance on retail cards that seemed to lag behind.

That change seemed sure to help, but as we discussed on our podcast, we had lingering questions. Had blower speeds increased generally, making the R9 290-series cards even louder? Because, you know, they were awfully darn loud before. More importantly, how much of the card-to-card variance remained, even with the new driver? I really wanted to know.

We had motive to test some R9 290X retail cards against our press samples, but we lacked the means. Although you may have heard stories about the glitzy lifestyles of semi-obscure hardware reviewers, the truth is that we can’t just order up several $549 graphics cards on a whim. Heck, these days, I can’t order lunch on a whim. Loading up a shopping cart at Newegg with 290-series Radeons wasn’t really an option.

Then something funny happened. We got a call from the folks at Nvidia offering to purchase a couple of retail R9 290X cards for us to test. The cards would be ordered from Newegg and shipped directly to Damage Labs for our scrutiny. The sample size wouldn’t be large, only two cards (with boxes still sealed) pulled at random from Newegg’s stock, but apparently the green team was confident enough in the likelihood of differences between our review samples and the retail cards to make the purchase. Since we were interested in exploring the question—and a little amused by the prospect of these fierce competitors buying one another’s products—we accepted the offer.

A couple of days later, we took delivery of two Radeon R9 290X cards: one from HIS and the other from Sapphire. Apart from the stickers on the cooling shrouds, the two look to be identical to one another and to our two R9 290X review samples. Almost immediately, I started some initial testing, to see if I could spot any obvious differences between the cards. Little did I know how much work lay ahead.

Our testing methods

Our test systems were configured like so:

Processor: Core i7-3820
Motherboard: Gigabyte X79-UD3
Chipset: Intel X79 Express
Memory size: 16GB (4 DIMMs)
Memory type: Corsair Vengeance CMZ16GX3M4X1600C9 DDR3 SDRAM at 1600MHz
Memory timings: 9-9-9-24 1T
Chipset drivers: INF update 9.2.3.1023, Rapid Storage Technology Enterprise 3.6.0.1090
Audio: Integrated X79/ALC898 with Realtek 6.0.1.7071 drivers
System drive: Corsair F240 240GB SATA SSD
Power supply: Corsair AX850
OS: Windows 8.1

Card                      Driver revision             Base clock (MHz)  Boost clock (MHz)  Memory clock (MHz)  Memory size (MB)
GeForce GTX 780 Ti        GeForce 331.82 beta         876               928                1750                3072
Radeon R9 290X sample 1   Catalyst 13.11 beta 8/9v2   –                 1000               1250                4096
Radeon R9 290X sample 2   Catalyst 13.11 beta 9v2     –                 1000               1250                4096
HIS Radeon R9 290X        Catalyst 13.11 beta 8/9v2   –                 1000               1250                4096
Sapphire Radeon R9 290X   Catalyst 13.11 beta 9v2     –                 1000               1250                4096

Thanks to Intel, Corsair, and Gigabyte for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

In addition to the games, we used the following test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Our first test case: Skyrim

We knew up front that finding solid answers to our questions about card-to-card variance might be difficult. For one thing, the R9 290X is particularly sensitive to ambient temperatures. A warmer environment can produce lower clock speeds, and cooler ambient temps can lead to higher clocks. Unfortunately, Damage Labs isn’t set up for precise climate control. I’m lucky to have breathable oxygen most of the time. I did try to keep room temperatures from rising too high by cracking open a window when running a load test caused the room to heat up, but maintaining a perfectly steady environment just wasn’t realistically possible.

Our solution for that problem was to take lots of samples, especially for our main test case, in Skyrim. (We chose this game because it’s a reasonably taxing workload, which is also why we’ve used it for testing GPU power consumption in the past.) We monitored each card’s vitals for 30 minutes while running the game with our character standing still in a particular spot. We then tested each card three times, to see how much clock speeds varied from run to run. That’s 10.5 hours of testing at a minimum, just in Skyrim, to accommodate all of the configs we included. The actual testing time was much longer, since when we started, we were kind of clueless about the best way to proceed.

Our primary goal was to compare the performance of the retail and press sample 290X cards with the new Catalyst 13.11 beta 9v2 drivers, which attempt to equalize blower speeds. However, out of curiosity, we also decided to test our initial review unit and the HIS retail card with the older 13.11 beta 8 driver, which doesn’t equalize fan speeds, to see how much of a difference that software change makes.

We’ve plotted a number of variables from our test sessions below. You can click the buttons beneath the plots to see the results from each card. The plots come from one of the three test runs, while the bar charts show the median results from three runs. Also, note that the unit of time on the X axis in the plots is seconds. Somehow, I failed to include the units when making the graphs. Too much sitting around in white fan noise has dulled my wits, apparently.

The card labeled “290X sample 1” is the review sample from AMD that we used in our R9 290X review. Sample 2 came to us a couple of weeks later, also from AMD, after we requested a second card for CrossFire testing. The HIS and Sapphire cards are the retail units. I’ve also included numbers from our GeForce GTX 780 Ti review sample for comparison.





You can see several things in the plots. Each 290X card starts out at a solid 1GHz. Then, its clock frequency drops and bounces around as the GPU reaches its temperature limits and PowerTune starts working to balance temperature, power draw, and operating frequency. The amount of time before the clock throttling begins varies, depending mostly on how warm the GPU was before we fired up Skyrim. To account for this variability, we clipped off the first five minutes (300 seconds) of the test period when calculating the average clock speed (or fan speed or FPS) for each run.
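The reduction from a 30-minute log to a single number can be sketched in a few lines. The log format here (a list of time/clock pairs) is a stand-in for illustration, not the actual format of our logging tool's output; the method is just what's described above: discard the first 300 seconds, then average the rest.

```python
# Minimal sketch of our log reduction: drop the warm-up period, average
# the steady-state samples. Works the same for clock speed, fan RPM, or FPS.

def average_after_warmup(samples, warmup_s=300):
    """samples: list of (time_seconds, value) tuples from a monitoring log."""
    steady = [value for t, value in samples if t >= warmup_s]
    return sum(steady) / len(steady)
```

The bar charts then simply take the median of this average across the three runs for each card.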

Click over to the HIS 290X's results, and the newer Catalyst driver that modified blower speeds has clearly paid off; the HIS card runs measurably slower with the older Catalyst 13.11b8 driver. By contrast, our initial review unit is largely unaffected by the change. You can also see that sample 1's clock speeds appear to be somewhat higher than the other cards'. If we plot the median clocks across three runs, here's how things line up:

That’s a relatively large amount of variability across four copies of the same card, especially since we’re running a popular game. Skyrim isn’t a peak workload in terms of power consumption or thermals. You can see that the initial review unit is the fastest of the bunch, regardless of which driver we use, while the HIS card takes up the rear. The newer driver does help the HIS card make up some ground, but it still trails sample 1 by quite a bit.

The graph above is just a summary, though. This table will give you a sense of the clock speed variability from run to run, as the temperature in Damage Labs fluctuated.

Clock speed (MHz)          Run 1   Run 2   Run 3   Median
Sapphire R9 290X            891     924     902     902
HIS R9 290X                 882     893     889     889
HIS R9 290X – 13.11b8       845     832     835     835
290X sample 1               928     952     930     930
290X sample 1 – 13.11b8     947     958     957     957
290X sample 2               913     912     903     912
GTX 780 Ti                 1005    1004    1005    1005

Clock frequencies for the individual cards varied by as much as 33MHz from run to run during our Skyrim tests. Ideally, we'd impose stricter temperature controls and do even more testing. Heck, in an ideal scenario, we'd have a much larger selection of 290X cards to test, and I'd be conducting those tests from the temperature-controlled lower deck of my enormous luxury yacht anchored off the Yucatan coast, aided by a team of cheerleader interns. Sadly, that's not the case. Still, I think the trend for each card begins to become clear after several runs—and we have more data to review.





The new Catalyst drivers raise the HIS card’s blower speed by over 300 RPM. Obviously, they’re addressing a very real problem. Those drivers also raise the blower RPM slightly on sample 1, our first review unit. That means the 290X will be a little louder overall than our initial review indicated. (All of these tests were conducted in the 290X’s default fan speed profile, not in “uber” mode.)

Notice the slight saw-tooth pattern on the plots for each of the 290X cards when used with the new drivers. The plots are much flatter with the 13.11 beta 8 drivers, which means the sound coming from the blower should be smoother and less variable, not quite as easy for the ears to notice. Presumably, when those little spikes happen with the new drivers, the GPU is reaching a thermal limit and needs additional help. The ensuing ramp up is abrupt, although the ramp down is more linear. Contrast that to the fan speed curve for the GTX 780 Ti, which is smoother than an Nvidia marketing pitch.

Also, in the “stuff you didn’t expect” department, notice that the blower RPM for the GeForce GTX 780 Ti is higher than for any of the 290X cards, even though the 780 Ti is much quieter under load than the R9 290X. Nvidia’s blower appears to have a slightly smaller diameter, but I’m impressed that it runs at substantially higher RPM and produces less noise.





Each plot starts with a flat line for the first couple of minutes, and then the FPS numbers start varying up and down at regular intervals. Why? Because after your character stands still in Skyrim for a while, the game switches to an "attract mode" where the camera pans around him in a circle. The variance you see in the FPS plots is caused by that constantly changing perspective.

The clock speed variance we saw above translates into performance differences fairly predictably. The total gap in terms of FPS is fairly small, since we’re testing at 4K resolutions where Skyrim doesn’t run at really high frame rates. There is a 10% gap, though, between the HIS 290X and sample 1 with the 13.11 beta 8 drivers. With the newer drivers, where blower speeds are more even, the FPS gap narrows to about 5%. We might see larger differences at lower resolutions, where GPU speeds are likely to play a larger role than they do at 3840×2160. (At 4K, memory bandwidth has gotta be a big constraint.)

Thing is, small differences in performance can cost you a lot of money at the very top end of the graphics card market, with bragging rights on the line. The price difference between the Radeon R9 290 and 290X is $150, and the two were 3% apart in our overall performance assessment. That example’s a bit extreme, but we saw a 12% difference between the GeForce GTX 780 and the GTX 780 Ti. The price difference between those cards is $200.

The worst-case scenario: MSI Kombustor

In addition to a real-world game workload, I thought it might be interesting to try something more extreme. MSI's Kombustor utility is based on the infamous Furmark tool that the GPU companies have taken to calling a "power virus." Early versions of PowerTune were created in part to deal with programs like this one, which tend to push GPUs to their power and thermal limits.

Since Kombustor heats up a GPU pretty quickly, I decided to shorten our test sessions to 20 minutes each. As before, we then disregarded the first five minutes as a warm-up period. We also used only two test runs per config here, to keep our total testing time in check.





Interesting. For the most part, the 290X cards settle in at 727MHz, which appears to be the 290X’s undocumented base clock, and range up or down from there only intermittently. There is some variance among them. The Sapphire, for instance, drops as low as 400MHz at one point. By contrast, our first review sample somehow manages to avoid dropping down to the 727MHz baseline. As a result, sample 1 maintains a much higher average frequency than the rest of the bunch.

Oh, I should mention something. I logged GPU temperatures throughout all of these tests. The reason you don’t see them plotted here is simple: they’re boring, flat lines at 94-95°C for each 290X card. PowerTune is quite effective at maintaining its target temperature, as is Nvidia’s GPU Boost. The other variables are the ones that fluctuate.





Hmm. Even in this extreme thermal workload, both of the press cards’ fan curves are relatively flat, with only an occasional bump above the roughly 2200-RPM target. The HIS and Sapphire cards both make sudden forays into higher-RPM territory, only to decay slowly and then repeat the process. The retail cards will surely be louder as a consequence of their higher blower RPM.

This divergent behavior suggests there’s some sort of difference between the review units and the retail cards. Either the fan control algorithms are somehow working differently, or the chips on the retail cards simply require more voltage (and thus generate more heat) in order to operate at similar clock speeds.

The most striking difference in this context, of course, is the performance of our first 290X review sample. That card’s average frequencies are as much as 128MHz faster, yet its RPM curve remains lower and flatter than the retail units’.

Crysis 3

To round out my testing, I decided to try a couple more games, to see if the clock speed differences we’d observed would persist across other workloads. I was only able to conduct a single, 25-minute test session for each config, but I’m hoping the additional data proves enlightening, even if it isn’t as solid as our triple-run Skyrim results.

As in our other game tests, we simply had our character stand still and look at a (mostly) static scene. This game doesn’t go into any sort of attract mode, so the workload doesn’t vary much at all over time.













Our initial review sample continues to outperform the rest of the 290X cards in Crysis 3, as we’ve seen elsewhere. The difference is larger in terms of clock speeds than in frame rates, again probably because we’re testing at a massive 4K display resolution.

Yes, I should have tested at a lower resolution. We might well see more separation between the 290X cards if I had. I fail at media sensationalism yet again.

Battlefield 4

I tested BF4 much like Crysis 3, over a 25-minute period while viewing a static scene with relatively high quality levels at a 4K resolution.













The basic trend holds, with review sample 1 outperforming the other 290X cards. Again, the frame rate differences are pretty small in absolute terms, but nobody wants to play BF4 at 36 FPS, anyhow. The bigger problem is the clock speed difference of just over 5% between sample 1 and the two retail cards. That gap will translate into larger FPS deltas in other scenarios.

A firmware solution?

We were about to wrap up our work on this article when we became aware of another variable that might warrant some attention. In fact, the folks at AMD pointed out this issue, since they’re currently puzzling over it themselves.

Nate over at Legit Reviews has been looking into this same set of problems, and he found that firmware differences between the press and retail cards might be playing a role. Like us, he measured clear differences between the performance of his 290X review sample and some retail cards. He then extracted the firmware from his 290X review unit, flashed it to a retail 290X, and tested again. Turns out the retail 290X performed better when using the press sample’s firmware.

Why is that? We don't know; neither does Nate, and AMD hasn't answered our repeated inquiries about what the cause might be. At AMD's request, though, we captured the firmware from 290X sample 1 and flashed it to our HIS retail card. We then ran this card through our triple-session Skyrim test to see how it fared.

Clock speed (MHz)                  Run 1   Run 2   Run 3   Median
HIS R9 290X – retail firmware       882     893     889     889
HIS R9 290X – sample 1 firmware     908     907     912     908
290X sample 1                       928     952     930     930

Hmm. With the firmware change, the HIS card’s clock speeds look to be up by about 20MHz in this test scenario, but they’re still about 20MHz lower than the clocks of 290X sample 1. Could the change be due to some difference in cooler RPM?

Not that I can tell. Heck, the HIS card could simply be faster because of variances in ambient temperature, although frankly, I doubt it. I let the room heat up a bit during the final test run with the press sample’s firmware, and the HIS card was still faster than in our first round of tests.

For what it’s worth, the alternative firmware didn’t alter the HIS 290X’s performance much at all. The card averaged about 76 FPS with either firmware revision.

The HIS card did seem to be a little unstable with the press firmware, though. Our test rig locked up several times during Skyrim testing. Could it be that the press sample firmware applies a lower GPU voltage over time? Slightly lower GPU voltages would explain both the higher clock speeds—due to added thermal headroom—and the instability, if the Hawaii GPU on the HIS card isn’t quite up to the task of stable operation at those voltage levels.

To find out, I dug into the GPU-Z logs once more.

Average VDDC (mV)                  Run 1   Run 2   Run 3   Median
HIS R9 290X – retail firmware       1097    1101    1105    1101
HIS R9 290X – sample 1 firmware     1092    1092    1096    1092
290X sample 1                       1069    1061    1070    1069

VDDC (mV)                          Median  90th percentile  99th percentile  Peak
HIS R9 290X – retail firmware       1109    1141             1156             1164
HIS R9 290X – sample 1 firmware     1094    1125             1148             1164
290X sample 1                       1070    1125             1148             1172
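For the curious, summary figures like the percentile columns above can be pulled from a raw column of logged voltage samples with Python's standard statistics module. This is a sketch of the method, not our exact tooling, and the sample data in it is made up.

```python
# Sketch: reduce a column of logged VDDC samples to the summary statistics
# reported above. statistics.quantiles(n=100) returns 99 percentile cut
# points; indices 89 and 98 correspond to the 90th and 99th percentiles.
import statistics

def vddc_summary(samples_mv):
    cuts = statistics.quantiles(samples_mv, n=100)
    return {
        "median": statistics.median(samples_mv),
        "p90": cuts[89],
        "p99": cuts[98],
        "peak": max(samples_mv),
    }
```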

Turns out the press sample video BIOS runs the HIS card's GPU at about 10-20 mV less than the retail firmware. The GPU on press sample 1 is obviously a higher-quality piece of silicon; it runs at higher frequencies, with lower average and median voltages, without instability.

What should we make of this seemingly minor voltage delta? Honestly, I don’t know. PowerTune is a dynamic algorithm, and it will supply more voltage in order to reach higher clocks if the thermal headroom allows. This 10-20 mV variance could be caused by ambient temperature differences rather than firmware changes. Still, the fact that the HIS card isn’t quite stable with the firmware from the press sample makes me wonder.

So now what?

We’ve learned a few things definitively in our testing, I think. First of all, AMD’s software fix to equalize blower speeds in the Catalyst 13.11 beta 9v2 driver release definitely improves the worst of the low-clock-speed problems that some R9 290X owners observed. The fix appears to raise fan speeds overall for 290X cards, slightly for our initial review unit and more dramatically for our HIS retail card.

Beyond that, I think we've collected enough data to say with confidence that our initial R9 290X review unit, sample 1, is superior to the two retail cards we tested, regardless of the driver or firmware revision. Even with the blower speed fix in place, our first review unit runs at 5-10% higher clock speeds than the retail cards, depending on the workload. That clock speed advantage translates into a similar edge in frame rates, though usually toward the lower end of that range at 4K resolutions. Sample 1 appears to achieve these clock speeds at lower voltages than the retail cards, too.

Furthermore, the two retail 290X cards exhibit higher fan speeds in our peak thermal workload, MSI Kombustor. They make intermittent, abrupt forays above the 2200-RPM limit imposed by the Catalyst 13.11 beta 9v2 driver, while our two press samples stick much closer to the 2200-RPM cap.

Do any of these findings really matter to current or prospective 290X owners? The short answer is, if you’re concerned about performance, they only matter by 10% or so at most. You can decide what to make of that fact. I’m sure some happy 290X owners won’t really mind as they’re gleefully slicing through opponents in Battlefield 4. Good for them.

Personally, I think our results matter in a few specific ways. I’ve already mentioned that the 290X’s fairly generous card-to-card variance isn’t a good fit for the realm of high-end video cards, where performance differences of less than 10% can command a premium of $150 or more. Bragging rights aren’t cheap, folks.

More notably, our review of the Radeon R9 290X likely overstated the product’s performance—and understated its noise levels—compared to the average card shipping to consumers today. Evidently, AMD chose to include some of its very best Hawaii GPUs aboard the review samples it supplied to the press. We’re not the only publication to notice this fact. A number of other media outlets have looked at this issue and found that their review units outperform retail 290-series cards, as well. Once our findings were clear, we contacted AMD and asked them to comment on this matter; it seemed proper to give them a shot at explaining themselves. Unfortunately, we still don’t have a statement or any convincing explanation for what happened.

The 290X’s relatively broad card-to-card variance stems from the decisions AMD made when defining this product. This new version of PowerTune is the first major dynamic voltage and frequency scaling (DVFS) scheme without an advertised base clock. AMD probably should have given the 290X a guaranteed base frequency somewhere north of 727MHz and chosen a more conservative peak clock speed, as well. Doing so could have resulted in less performance variance from one card to the next. That formula might have made the 290X more difficult to produce at high volumes—and it very likely wouldn’t have allowed hand-picked 290X review units to snatch the overall performance crown away from the GeForce GTX 780 and Titan on launch day. It would have had the great virtue, however, of setting more honest expectations.

I should mention that, between the cooler RPM tweak and the firmware questions it has raised, AMD still appears to be refining its 290-series graphics cards in fundamental ways that affect their operating speeds and noise levels. That’s both good, because some fixes may be forthcoming, and a little odd, since this sort of engineering work is generally supposed to take place before a product finds its way to consumers.

We still have some practical questions to answer about the gap between our experience with the initial review sample and the current reality with retail products. Our testing here has focused on primary variables like clock frequencies and blower RPM, but we need to measure the practical effects. How does moment-to-moment clock speed variance in retail 290X cards affect individual frame rendering times? We’ve only tested FPS so far, which doesn’t tell you what happens, ahem, inside each second. We also haven’t yet measured noise levels with the new drivers. We’ll have to find these things out in future testing, which we intend to conduct with retail cards and firmware whenever possible.

Update 12/5/13: AMD has issued a statement on these matters.
