Abstract Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.

Citation: Nagy B, Farmer JD, Bui QM, Trancik JE (2013) Statistical Basis for Predicting Technological Progress. PLoS ONE 8(2): e52669. https://doi.org/10.1371/journal.pone.0052669 Editor: Luís A. Nunes Amaral, Northwestern University, United States of America Received: June 6, 2012; Accepted: November 19, 2012; Published: February 28, 2013 Copyright: © 2013 Nagy et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: This work was funded under National Science Foundation grant NSF-0738187. http://www.nsf.gov/. Partial funding for this work was provided by The Boeing Company. http://www.boeing.com/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors declare receipt of funding from The Boeing Company. There are no other declarations relating to employment, consultancy, patents, products in development or marketed products. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

Introduction Innovation is by definition new and unexpected, and might therefore seem inherently unpredictable. But if there is a degree of predictability in technological innovation, understanding it could have profound implications. Such knowledge could result in better theories of economic growth, and enable more effective strategies for engineering design, public policy design, and private investment. In the area of climate change mitigation, the estimated cost of achieving a given greenhouse gas concentration stabilization target is highly sensitive to assumptions about future technological progress [1]. There are many hypotheses about technological progress, but are they any good? Which, if any, hypothesis provides good forecasts? In this paper, we present the first statistically rigorous comparison of competing proposals. When we think about progress in technologies, the first product that comes to mind for many is a computer, or more generally, an information technology. The following quote by Bill Gates captures a commonly held view: “Exponential improvement – that is rare – we've all been spoiled and deeply confused by the IT model” [2]. But as we demonstrate here, information technologies are not special in terms of the functional form that describes their improvement over time. Information technologies show rapid rates of improvement, but many technologies show exponential improvement. In fact, all the technologies we study here behave roughly similarly: Information technologies closely follow patterns of improvement originally postulated by Wright for airplanes [3]–[8], and technologies such as beer production or offshore gas pipelines follow Moore's law [9], [10], but with a slower rate of improvement [8], [11]–[15]. It is not possible to quantify the performance of a technology with a single number [16]. A computer, for example, is characterized by speed, storage capacity, size and cost, as well as other intangible characteristics such as aesthetics. 
One automobile may be faster, while another is less expensive. For this study, we focus on one common measure of performance: the inflation-adjusted cost of one “unit”. This metric is suitable in that it can be used to describe many different technologies. However, the nature of a unit may change over time. For example, a transistor in a modern integrated circuit today may have quite different performance characteristics than its discrete counterpart in the past. Furthermore, the degree to which cost is emphasized over other performance measures may change with time [17]. We nonetheless use the changes in the unit cost as our measure of progress, in order to compare competing models using a sizable dataset. The crudeness of this approach only increases the difficulty of forecasting and makes it particularly surprising that we nonetheless observe common trends.

Analysis We test six different hypotheses that have appeared in the literature [3], [9], [18]–[20], corresponding to the following six functional forms:

$$
\begin{aligned}
\text{Moore:} \quad & \log y_t = a t + b + n(t) \\
\text{Wright:} \quad & \log y_t = a \log x_t + b + n(x_t) \\
\text{lagged Wright:} \quad & \log y_t = a \log (x_t - q_t) + b + n(x_t - q_t) \\
\text{Goddard:} \quad & \log y_t = a \log q_t + b + n(q_t) \\
\text{SKC:} \quad & \log y_t = a \log q_t + c \log (x_t - q_t) + b + n(q_t, x_t - q_t) \\
\text{Nordhaus:} \quad & \log y_t = a t + c \log x_t + b + n(t, x_t)
\end{aligned}
\tag{1}
$$

The dependent variable $y_t$ is the unit cost of the technology measured in inflation-adjusted dollars. The independent variables are the time $t$ (measured in years), the annual production $q_t$, and the cumulative production $x_t = \sum_{s=1}^{t} q_s$. The noise term $n$, the constants $a$, $b$ and $c$, and the predictor variables differ for each hypothesis.

Moore's law here refers to the generalized statement that the cost $y$ of a given technology decreases exponentially with time:

$$ y_t = B \exp(-m t) \tag{2} $$

where $m > 0$ and $B > 0$ are constants [9], [12]. (We assume throughout that $t > 0$, and we have renamed $a = -m$ and $b = \log B$ in Eq. (1)). Moore's law postulates that technological progress is inexorable, i.e. it depends on time rather than controllable factors such as research and development.

Wright's law, in contrast, postulates that cost decreases at a rate that depends on cumulative production:

$$ y_t = B x_t^{-w} \tag{3} $$

where $w > 0$ and $B > 0$ are constants, and we have renamed $a = -w$ and $b = \log B$ in Eq. (1). Wright's law is often interpreted to imply “learning by doing” [5], [21]. The basic idea is that cumulative production is a proxy for the level of effort invested, so that the more we make the more we learn, and knowledge accumulates without loss.

Another hypothesis is due to Goddard [18], who argues that progress is driven purely by economies of scale, and postulates that:

$$ y_t = B q_t^{-s} \tag{4} $$

where $s > 0$ and $B > 0$ are constants, and we have renamed $a = -s$ and $b = \log B$ in Eq. (1).

We also consider the three multi-variable hypotheses in Eq. (1): Nordhaus [20] combines Wright's law and Moore's law, and Sinclair, Klepper, and Cohen (SKC) [19] combine Wright's law and Goddard's law. For completeness, we also test Wright's law lagged by one year. Note that these methods forecast different things: Moore's law forecasts the cost at a given time, Wright's law at a given cumulative production, and Goddard's law at a given annual production.
We test these hypotheses on historical data consisting of 62 different technologies that can be broadly grouped into four categories: Chemical, Hardware, Energy, and Other. All data can be found in the online Performance Curve Database at pcdb.santafe.edu. The data are sampled at annual intervals with timespans ranging from 10 to 39 years. The choice of these particular technologies was driven by availability: we included all available data, with minimal constraints applied, to assemble the largest database of its kind. The data were collected from research articles, government reports, market research publications, and other published sources. Data on technological improvement were used in the analysis if they satisfied the following constraints: they retained a functional unit over the time period sampled, and they included both a performance metric (price or cost per unit of production) and production data for a period of at least 10 years, with no missing years in between. This inclusive approach to data gathering was required to construct a large dataset, which was necessary to obtain statistically significant results. The resulting 62 datasets are described in detail in File S1. These datasets almost certainly contain significant measurement and estimation errors, which cannot be directly quantified and are likely to increase the error in forecasts. Including many independent datasets helps to ensure that any biases in the database as a whole are random rather than systematic, minimizing their effects on the results of our analysis of the pooled data.

To compare the performance of each hypothesis we use hindcasting, which is a form of cross-validation. We pretend to be at time $i$ and make a forecast for time $j$ using hypothesis (functional form) $f$ and dataset $d$, where $i < j$. The parameters for each functional form are fitted using ordinary least squares based on all data prior to time $i$, and forecasts are made based on the resulting regression.
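As an illustration of the regression step, the following Python sketch fits four of the functional forms to log cost by ordinary least squares. It is a minimal sketch under stated assumptions, not the procedure used for the results reported here: the function names `design_matrix` and `fit_law` are hypothetical, cumulative production is approximated by the running sum of annual production from the first record, and the two forms that use lagged cumulative production are omitted because that quantity is zero in the first year under this approximation. The data are synthetic.

```python
import numpy as np

def design_matrix(law, t, q):
    """Predictor columns plus an intercept column for one functional form."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    # Assumption: cumulative production starts at the first record in the series.
    x = np.cumsum(q)
    predictors = {
        "moore": [t],                 # log y = a*t + b
        "wright": [np.log(x)],        # log y = a*log(x) + b
        "goddard": [np.log(q)],       # log y = a*log(q) + b
        "nordhaus": [t, np.log(x)],   # log y = a*t + c*log(x) + b
    }
    if law not in predictors:
        raise ValueError(f"unknown law: {law}")
    return np.column_stack(predictors[law] + [np.ones_like(t)])

def fit_law(law, t, q, y):
    """OLS fit of log(y); returns the slope coefficient(s) followed by the intercept."""
    X = design_matrix(law, t, q)
    log_y = np.log(np.asarray(y, dtype=float))
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    return coef

# Synthetic series that follows Wright's law exactly (illustrative values only).
t = np.arange(2000, 2015)
q = 100.0 * 1.1 ** (t - 2000)
y = 5.0 * np.cumsum(q) ** -0.3
a, b = fit_law("wright", t, q, y)
print("fitted slope a:", a, "fitted intercept b:", b)
```

Because the synthetic costs are generated from a power law in cumulative production, the fitted slope and intercept recover the exponent and the logarithm of the prefactor used to generate them.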
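We score the quality of forecasts based on the logarithmic forecasting error:

$$ e_{fdij} = \left| \log \hat{y}^{(i)}_{fd}(j) - \log y_d(j) \right| \tag{5} $$

where $\hat{y}^{(i)}_{fd}(j)$ is the cost forecast for time $j$ made at time $i$ using functional form $f$ on dataset $d$, and $y_d(j)$ is the observed cost at time $j$. The quality of forecasts is examined for all datasets and all hypotheses (and visualized as a three-dimensional error mountain, as shown in File S1). For Wright's law, an illustration of the growth of forecasting errors as a function of the forecasting horizon is given in Fig. 1.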
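The hindcasting procedure and the logarithmic error can be sketched for a single-predictor law as follows. This is a hedged illustration rather than the code behind the reported results: the names `hindcast_errors` and `mean_error_by_horizon` are hypothetical, origins and targets are array positions rather than calendar years, the fit uses only observations strictly before the origin, and the predictor value at the target is assumed to be known. For Wright's law this means that cumulative production at the target is given. The example data are synthetic.

```python
import numpy as np

def hindcast_errors(predictor, log_cost, min_points=2):
    """Absolute log forecast errors keyed by (origin i, target j), with j > i.

    predictor: time for Moore's law, or log cumulative production for Wright's law.
    log_cost:  logarithm of unit cost, aligned with predictor.
    """
    predictor = np.asarray(predictor, dtype=float)
    log_cost = np.asarray(log_cost, dtype=float)
    errors = {}
    for i in range(min_points, len(log_cost)):
        # Fit a straight line by least squares to the observations before origin i.
        slope, intercept = np.polyfit(predictor[:i], log_cost[:i], 1)
        for j in range(i + 1, len(log_cost)):
            forecast = slope * predictor[j] + intercept
            errors[(i, j)] = abs(forecast - log_cost[j])
    return errors

def mean_error_by_horizon(errors):
    """Average the errors over all origins that share the same horizon j - i."""
    by_horizon = {}
    for (i, j), e in errors.items():
        by_horizon.setdefault(j - i, []).append(e)
    return {h: float(np.mean(v)) for h, v in sorted(by_horizon.items())}

# Synthetic Moore-type series with noise (illustrative values only).
rng = np.random.default_rng(0)
years = np.arange(20.0)
log_cost = 2.0 - 0.1 * years + rng.normal(0.0, 0.05, size=years.size)
profile = mean_error_by_horizon(hindcast_errors(years, log_cost))
for horizon, err in profile.items():
    print(horizon, round(err, 3))
```

Averaging over origins at a fixed horizon gives one curve per dataset, which is the kind of quantity plotted against the hindcasting horizon in Fig. 1.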


Figure 1. An illustration of the growth of errors with time using the Wright model. The mean value of the logarithmic hindcasting error for each dataset is plotted against the hindcasting horizon $j - i$, in years. An error of , for example, indicates that the predicted value is three times as big as the actual value. The longest datasets are: PrimaryAluminum (green), PrimaryMagnesium (dark blue), DRAM (grey), and Transistor (red). https://doi.org/10.1371/journal.pone.0052669.g001

An alternative to our approach is to adjust the intercepts to match the last point. For example, for Moore's law this corresponds to using a log random walk of the form $\log y_{t+1} = \log y_t - m + n_t$, where $n_t$ is an IID noise term (see File S1). We have not done this here, in order to be consistent with the way these hypotheses have been presented historically. The method we have used also results in more stable errors.

Developing a statistical model to compare the competing hypotheses is complicated by the fact that errors observed at longer horizons tend to be larger than those at shorter horizons, and errors are correlated across time and across functional forms. After comparing many different possibilities (as discussed in detail in File S1), we settled on the following approach. Based on a search of the family of power transformations, which is known for its ability to accommodate a range of variance structures, we take as a response the square root transformation of the logarithmic error. This response was chosen to maximize likelihood when modeled as a linear function of the hindcasting horizon (target minus origin, $j - i$), using a linear mixed effects model. Specifically, we use the following functional form to model the response:

$$ r_{fdij} = (\alpha_f + a_d) + (\beta_f + b_d)(j - i) \tag{6} $$

where $r_{fdij}$ is the expected root error, i.e. the expected value of $\sqrt{e_{fdij}}$. The parameters $\alpha_f$ and $\beta_f$ depend on the functional form and are called fixed effects because they are the same for all datasets. $\alpha_f$ is the intercept and $\beta_f$ is the slope parameter.
The parameters $a_d$ and $b_d$ depend on the dataset, and are called random effects because they are not fitted independently but are instead treated as dataset-specific random fluctuations from the pooled data. The quantities $a_d$ and $b_d$ are additive adjustments to the average intercept and slope parameters $\alpha_f$ and $\beta_f$, respectively, to take into account the peculiarities of each dataset $d$. In order to avoid adding 62 $a_d$ parameters plus 62 $b_d$ parameters, we treated the pair $(a_d, b_d)$ as a two-dimensional random vector having a bivariate normal distribution with mean $(0, 0)$ and variance-covariance matrix $\Psi$. We thus parameterize the dataset-specific adjustments as random deviations from the average at a cost of only 3 additional parameters instead of $2 \times 62 = 124$. This parsimonious approach makes maximum likelihood estimation possible by keeping the number of parameters in check.

Finally, we add a random field term $\varepsilon_{fdij}$ to take into account the deviations from the trend. This is assumed to be a Gaussian stochastic process independent of the $(a_d, b_d)$ random vector, having mean $0$, and given $a_d$ and $b_d$, having variance equal to a positive $\sigma^2$ times the fitted values:

$$ \mathrm{Var}\left( \varepsilon_{fdij} \mid a_d, b_d \right) = \sigma^2 \, r_{fdij} \tag{7} $$

We also define an exponential correlation structure within each error mountain (corresponding to each combination of dataset and hypothesis, see File S1), as a function of the differences of the two time coordinates, with a positive range parameter $\rho$ and another small positive nugget parameter $\eta$ quantifying the extent of these correlations:

$$ \mathrm{Corr}\left( \varepsilon_{fdij}, \varepsilon_{f'd'i'j'} \right) = \delta_{ff'} \, \delta_{dd'} \, (1 - \eta) \exp\left\{ -\frac{|i - i'| + |j - j'|}{\rho} \right\} \tag{8} $$

where the two Kronecker $\delta$ functions ensure that each error mountain is treated as a separate entity. Equations (7) and (8) were chosen to deal with the observed heteroscedasticity (increasing variance with increasing logarithmic forecasting error) and the serial correlations along the time coordinates $i$ (hindcasting origin) and $j$ (hindcasting target). Based on the likelihood, an exponential correlation function provided the best fit.
Note that instead of a Euclidean distance (root sum of the squares of differences), the Manhattan measure was used (the sum of the absolute differences), because it provided a better fit in terms of the likelihood.

Using this statistical model, we compared five different hypotheses. (We removed the Nordhaus model from the sample because of poor forecasting performance [20]. This model gave good in-sample fits but generated large and inconsistent errors when predicting out-of-sample, a signature of over-fitting. This points to the difficulty in separating learning from exogenous sources of change [20].) Rather than the $62 \times 5 \times 2 = 620$ parameters needed to fit each of the 62 datasets separately for each of the five functional forms, there are only 16 free parameters: $2 \times 5 = 10$ parameters $\alpha_f$ and $\beta_f$, three parameters for the covariance matrix $\Psi$ of the bivariate random vector $(a_d, b_d)$, and three parameters for the variance and autocorrelation of the residuals $\varepsilon_{fdij}$.
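The residual structure described above, with a variance proportional to the fitted value and an exponential correlation in the Manhattan distance between (origin, target) pairs, can be sketched numerically as follows. This is an illustrative sketch under assumptions, not the estimation code used for the results: the function names and the parameter names `rho` (range), `eta` (nugget) and `sigma2` are hypothetical, and the nugget is assumed to enter as a factor $(1 - \eta)$ on the off-diagonal correlations, with each residual having correlation 1 with itself.

```python
import numpy as np

def mountain_correlation(origins, targets, rho, eta):
    """Correlation matrix of residuals within a single error mountain.

    Off-diagonal entries are (1 - eta) * exp(-d / rho), where d is the
    Manhattan distance |i - i'| + |j - j'| between two (origin, target)
    pairs. Diagonal entries are set to 1.
    """
    i = np.asarray(origins, dtype=float)
    j = np.asarray(targets, dtype=float)
    d = np.abs(i[:, None] - i[None, :]) + np.abs(j[:, None] - j[None, :])
    corr = (1.0 - eta) * np.exp(-d / rho)
    np.fill_diagonal(corr, 1.0)
    return corr

def residual_variance(fitted, sigma2):
    """Residual variance proportional to the fitted (expected) root error."""
    return sigma2 * np.asarray(fitted, dtype=float)

# Three (origin, target) pairs from one error mountain (illustrative values only).
origins = [0, 0, 1]
targets = [1, 2, 2]
corr = mountain_correlation(origins, targets, rho=2.0, eta=0.1)
var = residual_variance([0.3, 0.4, 0.3], sigma2=0.05)
cov = corr * np.sqrt(np.outer(var, var))  # covariance implied by the two pieces
print(np.round(cov, 4))
```

Residuals from different error mountains are uncorrelated in this structure, so the full covariance matrix is block diagonal with one such block per combination of dataset and hypothesis.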

Acknowledgments We thank all contributors to the Performance Curve Database (pcdb.santafe.edu).

Author Contributions Developed the concept of the study: JET JDF. Conceived and designed the experiments: JET JDF BN. Performed the experiments: BN. Analyzed the data: QMB. Wrote the paper: JET JDF BN.