We estimate that in recent years, GPU prices have fallen at rates that would yield an order of magnitude over roughly:

17 years for single-precision FLOPS

10 years for half-precision FLOPS

5 years for half-precision fused multiply-add FLOPS

Details

GPUs (graphics processing units) are specialized electronic circuits originally used for computer graphics. In recent years, they have been popularly used for machine learning applications. One measure of GPU performance is FLOPS, the number of operations on floating-point numbers a GPU can perform in a second. This page looks at the trends in GPU price / FLOPS of theoretical peak performance over the past 13 years. It does not include the cost of operating the GPUs, and it does not consider GPUs rented through cloud computing.

Theoretical peak performance

‘Theoretical peak performance’ numbers appear to be determined by adding together the theoretical performances of the processing components of the GPU, which are calculated by multiplying the clock speed of the component by the number of instructions it can perform per cycle. These numbers are given by the developer and may not reflect actual performance on a given application.
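As a rough illustration of the calculation described above, the sketch below multiplies unit count, clock speed, and operations per cycle, then sums over component types. The GPU specification is hypothetical and chosen only to make the arithmetic easy to follow; the assumption that each unit completes one fused multiply-add per cycle, counted as two floating-point operations, is ours and is not taken from any vendor's documentation.

```python
def theoretical_peak_flops(num_units: int, clock_hz: float, flops_per_cycle: float) -> float:
    """Peak FLOPS for one class of processing component."""
    return num_units * clock_hz * flops_per_cycle


# Hypothetical GPU (not a real product): 2,560 shader units at 1.6 GHz,
# each assumed to complete one fused multiply-add (2 FLOPs) per cycle.
components = [
    {"num_units": 2560, "clock_hz": 1.6e9, "flops_per_cycle": 2},
]

# Sum the contribution of every component type on the chip.
peak = sum(theoretical_peak_flops(**c) for c in components)
print(f"{peak / 1e12:.2f} TFLOPS")
```

A chip with several kinds of processing component would simply have more entries in `components`.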

Metrics

We collected data on multiple slightly different measures of GPU price and FLOPS performance.

Price metrics

GPU prices are divided into release prices, which reflect the manufacturer's suggested retail prices at which GPUs are originally sold, and active prices, which are the prices at which GPUs are actually sold over time, often by resellers.

We expect that active prices better represent the prices available to hardware users, but we also collect release prices as supporting evidence.

FLOPS performance metrics

Several varieties of ‘FLOPS’ can be distinguished based on the specifics of the operations they involve. Here we are interested in single-precision FLOPS, half-precision FLOPS, and half-precision fused multiply-add FLOPS.

‘Single-precision’ and ‘half-precision’ refer to the number of bits used to specify a floating point number. Using more bits to specify a number achieves greater precision at the cost of more computational steps per calculation. Our data suggests that GPUs have largely been improving in single-precision performance in recent decades, and half-precision performance appears to be increasingly popular because it is adequate for deep learning.
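To make the precision difference concrete, here is a small sketch using Python's standard `struct` module, which can store a number in 16-bit (half) or 32-bit (single) IEEE 754 format and read it back. The value 0.1 is an arbitrary example of ours. The sketch only illustrates storage precision and says nothing about the computational cost of each format.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


half = round_trip(0.1, "<e")    # "e" is the 16-bit half-precision format
single = round_trip(0.1, "<f")  # "f" is the 32-bit single-precision format

# Print the bit width of each format and the value that was actually stored.
print(struct.calcsize("<e") * 8, "bits:", half)
print(struct.calcsize("<f") * 8, "bits:", single)
```

The half-precision value lands further from 0.1 than the single-precision value, which is the loss of precision that deep learning workloads often tolerate.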

Nvidia, the main provider of chips for machine learning applications, recently released a series of GPUs featuring Tensor Cores, which claim to deliver “groundbreaking AI performance”. Tensor Core performance is measured in FLOPS, but they perform exclusively certain kinds of floating-point operations known as fused multiply-adds (FMAs). Performance on these operations is important for certain kinds of deep learning performance, so we track ‘GPU price / FMA FLOPS’ as well as ‘GPU price / FLOPS’.



In addition to purely half-precision computations, Tensor Cores are capable of performing mixed-precision computations, where part of the computation is done in half-precision and part in single-precision. Since explicitly mixed-precision-optimized hardware is quite recent, we don’t look at the trend in mixed-precision price performance, and only look at the trend in half-precision price performance.

Precision tradeoffs

Any GPU that performs multiple kinds of computations (single-precision, half-precision, half-precision fused multiply-add) trades off performance on one kind against performance on the others, because there is limited space on the chip and transistors must be allocated to one type of computation or another. All current GPUs that perform half-precision or Tensor Core fused multiply-add computations also perform single-precision computations, so they are splitting their transistor budget. For this reason, our impression is that half-precision FLOPS could be much cheaper now if entire GPUs were devoted to a single kind of computation, rather than split between them.

We collected data on theoretical peak performance (FLOPS), release date, and price from several sources, including Wikipedia. (Data is available in this spreadsheet.) We found GPUs by looking at Wikipedia’s existing large lists and by Googling “popular GPUs” and “popular deep learning GPUs”. We included any hardware that was labeled as a ‘GPU’. We adjusted prices for inflation based on the consumer price index.
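The inflation adjustment and the price / FLOPS ratio can be sketched as follows. The index values in `CPI` are round placeholders for illustration only, not official consumer price index figures, and the function names are ours.

```python
# Placeholder index values for illustration only; NOT official CPI figures.
CPI = {2011: 225.0, 2015: 237.0, 2019: 255.0}


def to_real_dollars(nominal_price, year, base_year=2019, cpi=CPI):
    """Convert a price observed in `year` into `base_year` dollars."""
    return nominal_price * cpi[base_year] / cpi[year]


def price_per_gflops(real_price, peak_gflops):
    """Real price divided by theoretical peak performance in GFLOPS."""
    return real_price / peak_gflops


# Hypothetical GPU: sold for $450 in 2011, with a 5,100 GFLOPS theoretical peak.
real_price = to_real_dollars(450.0, 2011)
print(real_price, price_per_gflops(real_price, 5100.0))
```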

We were unable to find price and performance data for many popular GPUs and suspect that we are missing many from our list. In our search, we did not find any GPUs that beat our 2017 minimum of $0.03 (release price) / single-precision GFLOPS. We put out a $20 bounty on a popular Facebook group to find a cheaper GPU / FLOPS, and the bounty went unclaimed, so we are reasonably confident in this minimum.

GPU price / single-precision FLOPS

Figure 1 shows our collected dataset for GPU price / single-precision FLOPS over time.

Figure 1: Real GPU price / single-precision FLOPS over time. The vertical axis is log-scale. Price is measured in 2019 dollars.

To find a clear trend for the prices of the cheapest GPUs / FLOPS, we looked at the running minimum prices every 10 days.

Figure 2: Ten-day minimums in real GPU price / single-precision FLOPS over time. The vertical axis is log-scale. Price is measured in 2019 dollars. The blue line shows the trendline ignoring data before late 2007. (We believe the apparent steep decline prior to late 2007 is an artefact of a lack of data for that time period.)
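One possible reading of "running minimum every 10 days" is sketched below: group data points into consecutive 10-day windows and record the cheapest price / FLOPS seen up to and including each window. This interpretation, the function name, and the sample data are our assumptions; the procedure actually used for the figures may differ in detail.

```python
from datetime import date


def running_minimums(points, window_days=10):
    """points: list of (date, price_per_flops).

    Returns (window_start, cheapest value seen so far) for each window
    that contains at least one data point.
    """
    if not points:
        return []
    pts = sorted(points)
    origin = pts[0][0].toordinal()
    # Cheapest value inside each 10-day window.
    per_window = {}
    for day, value in pts:
        idx = (day.toordinal() - origin) // window_days
        per_window[idx] = min(value, per_window.get(idx, float("inf")))
    # Carry the minimum forward so the series never increases.
    out, best = [], float("inf")
    for idx in sorted(per_window):
        best = min(best, per_window[idx])
        out.append((date.fromordinal(origin + idx * window_days), best))
    return out


# Hypothetical data points, for illustration only.
sample = [
    (date(2015, 1, 1), 0.10),
    (date(2015, 1, 5), 0.08),
    (date(2015, 1, 20), 0.12),
    (date(2015, 2, 15), 0.05),
]
result = running_minimums(sample)
for window_start, value in result:
    print(window_start, value)
```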

The cheapest GPU price / FLOPS hardware using release date pricing has not decreased since 2017. However, there was a similar period of stagnation between early 2009 and 2011, so this may not represent a slowing of the trend in the long run.

Based on the figures above, the running minimums seem to follow a roughly exponential trend. If we do not include the initial point in 2007 (which we suspect is not in fact the cheapest hardware at the time), we get that the cheapest GPU price / single-precision FLOPS fell by around 17% per year, for a factor of ten in ~12.5 years.
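The link between an annual rate of decline and the time to fall by a factor of ten can be made explicit. If price / FLOPS falls by a fraction r per year, it reaches one tenth of its starting value after ln(10) / -ln(1 - r) years. The sketch below also shows one common way to estimate the rate, an ordinary least-squares fit to log10(price / FLOPS) against time. We are assuming this kind of fit for illustration; the trendline method behind the figures is not specified here, and the data in the example is synthetic.

```python
import math


def years_per_order_of_magnitude(annual_decline):
    """Years for a tenfold drop, given a fractional decline per year."""
    return math.log(10) / -math.log(1 - annual_decline)


def fit_exponential(years, values):
    """Least-squares fit of log10(value) = a + b * year; returns (a, b)."""
    n = len(years)
    logs = [math.log10(v) for v in values]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# A 17% annual decline gives roughly 12.4 years per order of magnitude.
print(round(years_per_order_of_magnitude(0.17), 1))

# Synthetic series that falls by exactly one order of magnitude every 12.5 years.
years = [2008 + i for i in range(12)]
values = [0.3 * 10 ** (-(y - 2008) / 12.5) for y in years]
a, b = fit_exponential(years, values)
print(-1 / b)       # years per order of magnitude recovered from the slope
print(1 - 10 ** b)  # implied fractional decline per year
```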

GPU price / half-precision FLOPS

Figure 3 shows GPU price / half-precision FLOPS for all the GPUs in our search above for which we could find half-precision theoretical performance.

Figure 3: Real GPU price / half-precision FLOPS over time. The vertical axis is log-scale. Price is measured in 2019 dollars.

Again, we looked at the running minimums of this graph every 10 days, shown in Figure 4 below.

Figure 4: Minimums in real GPU price / half-precision FLOPS over time. The vertical axis is log-scale. Price is measured in 2019 dollars.

If we assume an exponential trend with noise, the cheapest GPU price / half-precision FLOPS fell by around 26% per year, which would yield a factor of ten after ~8 years.

GPU price / half-precision FMA FLOPS

Figure 5 shows GPU price / half-precision FMA FLOPS for all the GPUs in our search above for which we could find half-precision FMA theoretical performance. (Note that this includes all of our half-precision data above, since those FLOPS could be used for fused multiply-adds in particular.) GPUs with Tensor Cores are marked in red.

Figure 5: Real GPU price / half-precision FMA FLOPS over time. Price is measured in 2019 dollars.

Figure 6 shows the running minimums of GPU price / half-precision FMA FLOPS.

Figure 6: Minimums in real GPU price / half-precision FMA FLOPS over time. Price is measured in 2019 dollars.

GPU price / half-precision FMA FLOPS appears to be following an exponential trend over the last four years, falling by around 46% per year, for a factor of ten in ~4 years.

Active Prices

GPU prices often go down from the time of release, and some popular GPUs are older ones that have gone down in price. Given this, it makes sense to look at active price data for the same GPU over time.

Data Sources

We collected data on peak theoretical performance in FLOPS from TechPowerUp and combined it with active GPU price data to get GPU price / FLOPS over time. Our primary source of historical pricing data was Passmark, though we also found a less trustworthy dataset on Kaggle which we used to check our analysis. We adjusted prices for inflation based on the consumer price index.

Passmark

We scraped pricing data on GPUs between 2011 and early 2020 from Passmark. Where necessary, we renamed GPUs from Passmark to be consistent with TechPowerUp. The Passmark data consists of 38,138 price points for 352 GPUs. We guess that these represent most popular GPUs.
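The renaming and joining step might look something like the sketch below. All GPU names, prices, and performance figures are made up for illustration, and the function and variable names are ours; this is not the scraping or matching code actually used.

```python
# Hypothetical mapping from a pricing-site name to the performance-site name.
RENAMES = {"Example GTX-100": "Example GTX 100"}

# Hypothetical theoretical peak performance in GFLOPS, keyed by canonical name.
PEAK_GFLOPS = {"Example GTX 100": 5000.0, "Example RX 200": 8000.0}

# Hypothetical price points: (name as listed, date, price in real dollars).
price_points = [
    ("Example GTX-100", "2015-03-01", 250.0),
    ("Example RX 200", "2016-06-15", 400.0),
    ("Unlisted GPU", "2016-07-01", 100.0),
]


def join_prices_with_flops(price_points, peak_gflops, renames):
    """Attach peak performance to each price point via its canonical name."""
    rows = []
    for name, day, price in price_points:
        canonical = renames.get(name, name)
        gflops = peak_gflops.get(canonical)
        if gflops is None:
            continue  # no performance data for this GPU, so skip it
        rows.append((canonical, day, price / gflops))
    return rows


rows = join_prices_with_flops(price_points, PEAK_GFLOPS, RENAMES)
for row in rows:
    print(row)
```

Price points for GPUs with no matching performance entry are dropped in this sketch, which is one plausible way to handle the mismatch.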



Looking at the ‘current prices’ listed on individual Passmark GPU pages, prices appear to be sourced from Amazon, Newegg, and Ebay. Passmark’s listed pricing data does not correspond to regular intervals. We don’t know if prices were pulled at irregular intervals, or if Passmark pulls prices regularly and then only lists major changes as price points. When we see a price point, we treat it as though the GPU is that price only at that time point, not indefinitely into the future.



The data contains several blips where a GPU is briefly sold unusually cheaply. Spot checks of some of these suggest that they correspond to single GPUs, or small numbers of GPUs, for sale. We are not interested in tracking these, because we are trying to predict AI progress, which presumably isn’t influenced by temporary discounts on tiny batches of GPUs.



Kaggle

This Kaggle dataset contains scraped GPU price data from the price comparison sites PriceSpy.co.uk, PCPartPicker.com, and Geizhals.eu for the years 2013 – 2018. The Kaggle dataset has 319,147 price points for 284 GPUs. Unfortunately, at least some of the data is clearly wrong, potentially because price comparison sites include pricing data from untrustworthy merchants. As such, we don’t use the Kaggle data directly in our analysis, but do use it as a check on our Passmark data. The data that we get from Passmark roughly appears to be a subset of the Kaggle data from 2013 – 2018, which is what we would expect if the price comparison engines picked up prices from the merchants Passmark looks at.

Limitations

There are a number of reasons why we think this analysis may in fact not reflect GPU price trends:

We effectively have just one source of pricing data, Passmark.

Passmark appears to only look at Amazon, Newegg, and Ebay for pricing data.

We are not sure, but we suspect that Passmark only looks at the U.S. versions of Amazon, Newegg, and Ebay, and pricing may be significantly different in other parts of the world (though we guess it wouldn’t be different enough to change the general trend much).

As mentioned above, we are not sure if Passmark pulls price data regularly and only lists major price changes, or pulls price data irregularly. If the former is true, our data may be overrepresenting periods where the price changes dramatically.

None of the price data we found includes quantities of GPUs which were available at that price, which means some prices may be for only a very limited number of GPUs.

We don’t know how much the prices from these datasets reflect the prices that a company pays when buying GPUs in bulk, which we may be more interested in tracking.

A better version of this analysis might start with more complete data from price comparison engines (along the lines of the Kaggle dataset) and then filter out clearly erroneous pricing information in some principled way.

Data

The original scraped datasets with cards renamed to match TechPowerUp can be found here. GPU price / FLOPS data is graphed on a log scale in the figures below. Price points for the same GPU are marked in the same color. We adjusted prices for inflation using the consumer price index. All points below are in 2019 dollars.

To try to filter out noisy prices that didn’t last or were only available in small numbers, we removed the lowest 5% of data points in each multi-day period (10 or 30 days, depending on the metric), leaving what we call the 95th-percentile cheapest hardware. We then fit linear and exponential trendlines through the lowest remaining GPU price / FLOPS in each period.
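A minimal sketch of this filtering step is shown below. It assumes that time is given as an integer day number, that the number of points to drop is 5% of the window's points rounded up, and that at least one point is always kept. These choices, along with the function name and the sample data, are our assumptions for illustration.

```python
import math
from collections import defaultdict


def filtered_minimums(points, window_days=10, drop_percent=5):
    """points: iterable of (day_number, price_per_flops).

    For each window, discard the cheapest `drop_percent` percent of points
    and return {window_index: cheapest remaining value}.
    """
    windows = defaultdict(list)
    for day, value in points:
        windows[day // window_days].append(value)

    result = {}
    for idx in sorted(windows):
        values = sorted(windows[idx])
        n_drop = math.ceil(len(values) * drop_percent / 100)
        n_drop = min(n_drop, len(values) - 1)  # always keep at least one point
        result[idx] = values[n_drop]
    return result


# Hypothetical window with one brief, very cheap "blip" among 20 price points,
# followed by a later window containing a single point.
sample = [(0, 0.01)] + [(i % 10, 1.0 + i) for i in range(19)] + [(15, 0.5)]
print(filtered_minimums(sample))
```

In the sample, the 0.01 blip is discarded from the first window, while the lone point in the second window is kept because removing it would leave nothing.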

GPU price / single-precision FLOPS

Figures 7-10 show the raw data, 95th percentile data, and trendlines for single-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of all our datasets, including the Kaggle dataset and combined Passmark + Kaggle dataset.

Figure 7: GPU price / single-precision FLOPS over time, taken from our Passmark dataset. Price is measured in 2019 dollars. This picture shows that the Kaggle data does appear to be a superset of the Passmark data from 2013 – 2018, giving us some evidence that the Passmark data is correct. The vertical axis is log-scale.



Figure 8: The top 95% of data every 10 days for GPU price / single-precision FLOPS over time, taken from the Passmark dataset we plotted above. (Figure 7 with the cheapest 5% removed.) The vertical axis is log-scale.



Figure 9: The same data as Figure 8, with the vertical axis zoomed-in.

Figure 10: The minimum data points from the top 95% of the Passmark dataset, taken every 10 days. We fit linear and exponential trendlines through the data. The vertical axis is log-scale.

Analysis

The cheapest 95th-percentile data every 10 days appears to fit relatively well to both a linear and an exponential trendline. However, we assume that progress will follow an exponential, because previous progress has followed an exponential.



In the Passmark dataset, the exponential trendline suggests that from 2011 to 2020, 95th-percentile GPU price / single-precision FLOPS fell by around 13% per year, for a factor of ten in ~17 years (bootstrap 95% confidence interval: 16.3 to 18.1 years). We believe the rise in price / FLOPS in 2017 corresponds to a rise in GPU prices due to increased demand from cryptocurrency miners. If we instead look at the trend from 2011 through 2016, before the cryptocurrency rise, we get that 95th-percentile GPU price / single-precision FLOPS fell by around 13% per year, for a factor of ten in ~16 years.
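A bootstrap confidence interval of the kind quoted above can be sketched as follows: resample the data points with replacement, refit the exponential trend each time, and take percentiles of the resulting estimates. We are assuming a simple percentile bootstrap over individual points, since the exact bootstrap variant used is not specified here. The data in the example is synthetic, generated to fall by one order of magnitude every 17 years with a little noise.

```python
import math
import random


def years_per_order_of_magnitude(points):
    """Fit log10(value) = a + b * year by least squares; return -1 / b."""
    n = len(points)
    xs = [x for x, _ in points]
    ys = [math.log10(v) for _, v in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxx / sxy  # equals -1 / slope, since slope = sxy / sxx


def bootstrap_interval(points, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for years per order of magnitude."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.choice(points) for _ in points]
        try:
            estimates.append(years_per_order_of_magnitude(resample))
        except ZeroDivisionError:
            continue  # degenerate resample with no usable slope
    estimates.sort()
    tail = (1 - level) / 2
    low = estimates[int(tail * len(estimates))]
    high = estimates[min(len(estimates) - 1, int((1 - tail) * len(estimates)))]
    return low, high


# Synthetic data: one order of magnitude every 17 years, plus small noise.
noise = random.Random(1)
points = [
    (2011 + i / 10, 10 ** (-(i / 10) / 17 + noise.uniform(-0.02, 0.02)))
    for i in range(90)
]
low, high = bootstrap_interval(points)
print(low, high)
```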

This is slower than the order of magnitude every ~12.5 years we found when looking at release prices. If we restrict the release price data to 2011 – 2019, we get an order of magnitude decrease every ~13.5 years instead, so part of the discrepancy can be explained because of the different start times of the datasets. To get some assurance that our active price data wasn’t erroneous, we spot checked the best active price at the start of 2011, which was somewhat lower than the best release price at the same time, and confirmed that its given price was consistent with surrounding pricing data. We think active prices are likely to be closer to the prices at which people actually bought GPUs, so we guess that ~17 years / order of magnitude decrease is a more accurate estimate of the trend we care about.

GPU price / half-precision FLOPS

Figures 11-14 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.



Figure 11: GPU price / half-precision FLOPS over time, taken from our Passmark dataset. Price is measured in 2019 dollars. This picture shows that the Kaggle data does appear to be a superset of the Passmark data from 2013 – 2018, giving us some evidence that the Passmark data is reasonable. The vertical axis is log-scale.



Figure 12: The top 95% of data every 30 days for GPU price / half-precision FLOPS over time, taken from the Passmark dataset we plotted above. (Figure 11 with the cheapest 5% removed.) The vertical axis is log-scale.



Figure 13: The same data as Figure 12, with the vertical axis zoomed-in.

Figure 14: The minimum data points from the top 95% of the Passmark dataset, taken every 30 days. We fit linear and exponential trendlines through the data. The vertical axis is log-scale.

Analysis

If we assume the trend is exponential, the Passmark trend seems to suggest that from 2015 to 2020, 95th-percentile GPU price / half-precision FLOPS fell by around 21% per year, for a factor of ten over ~10 years (bootstrap 95% confidence interval: 8.8 to 11 years). This is fairly close to the ~8 years / order of magnitude decrease we found when looking at release price data, but we treat active prices as a more accurate estimate of the actual prices at which people bought GPUs. As in our previous dataset, there is a noticeable rise in 2017, which we think is due to GPU prices increasing as a result of demand from cryptocurrency miners. If we look at the trend from 2015 through 2016, before this rise, we get that 95th-percentile GPU price / half-precision FLOPS fell by around 14% per year, which would yield a factor of ten over ~8 years.

GPU price / half-precision FMA FLOPS

Figures 15-18 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FMA FLOPS for the Passmark dataset. GPUs with Tensor Cores are marked in black. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.



Figure 15: GPU price / half-precision FMA FLOPS over time, taken from our Passmark dataset. Price is measured in 2019 dollars. This picture shows that the Kaggle data does appear to be a superset of the Passmark data from 2013 – 2018, giving us some evidence that the Passmark data is correct. The vertical axis is log-scale.



Figure 16: The top 95% of data every 30 days for GPU price / half-precision FMA FLOPS over time, taken from the Passmark dataset we plotted above. (Figure 15 with the cheapest 5% removed.)



Figure 17: The same data as Figure 16, with the vertical axis zoomed-in.

Figure 18: The minimum data points from the top 95% of the Passmark dataset, taken every 30 days. We fit linear and exponential trendlines through the data.

Analysis

If we assume the trend is exponential, the Passmark trend seems to suggest that 95th-percentile GPU price / half-precision FMA FLOPS has fallen by around 40% per year, which would yield a factor of ten in ~4.5 years (bootstrap 95% confidence interval: 4 to 5.2 years). This is fairly close to the ~4 years / order of magnitude decrease we found when looking at release price data, but we think active prices are a more accurate estimate of the actual prices at which people bought GPUs.

The figures above suggest that certain GPUs with Tensor Cores were a significant (~half an order of magnitude) improvement over existing GPU price / half-precision FMA FLOPS.

Conclusion

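We summarize our results in the table below. Each value is the approximate number of years for GPU price / FLOPS to fall by an order of magnitude, and each "Period" row gives the date range of the data used for the rows beneath it.

| | Release Prices | 95th-percentile Active Prices | 95th-percentile Active Prices (pre-crypto price rise) |
| --- | --- | --- | --- |
| Period | 11/2007 – 1/2020 | 3/2011 – 1/2020 | 3/2011 – 12/2016 |
| $ / single-precision FLOPS | 12.5 | 17 | 16 |
| Period | 9/2014 – 1/2020 | 1/2015 – 1/2020 | 1/2015 – 12/2016 |
| $ / half-precision FLOPS | 8 | 10 | 8 |
| $ / half-precision FMA FLOPS | 4 | 4.5 | — |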

Release price data seems to generally support the trends we found in active prices, with the notable exception of trends in GPU price / single-precision FLOPS, which cannot be explained solely by the different start dates. We think the best estimate of the overall trend for prices at which people recently bought GPUs is the 95th-percentile active price data from 2011 – 2020, since release price data does not account for existing GPUs becoming cheaper over time. The pre-crypto trends are similar to the overall trends, suggesting that the trends we are seeing are not anomalous due to cryptocurrency.



Given that, we guess that GPU prices as a whole have fallen at rates that would yield an order of magnitude over roughly:

17 years for single-precision FLOPS

10 years for half-precision FLOPS

5 years for half-precision fused multiply-add FLOPS

Half-precision FLOPS seem to have become cheaper substantially faster than single-precision in recent years. This may be a “catching up” effect as more of the space on GPUs was allocated to half-precision computing, rather than reflecting more fundamental technological progress.

Primary author: Asya Bergal