Hypothesis I: hiatus in temperature trend during 1998–2013

A basic assertion regarding the hiatus is that the steady increase in global surface temperature around a linear positive trend has stopped, or “paused” (Guemas et al. 2013). This sentiment is reflected in statements that “Despite a sustained production of anthropogenic greenhouse gases, the Earth’s mean near-surface temperature paused its rise during the 2000–2010 period” (Guemas et al. 2013), and that “climate skeptics have seized on the temperature trends as evidence that global warming has ground to a halt” (Tollefson 2014). These scientific claims can be turned into a precise statistical null hypothesis: the slope in the regression line of global temperature on time is zero during the hiatus period.

We use three methods with increasing levels of generality to test the above hypothesis. Specific details of the methodology are provided in Supplementary Section 3.1. First, we fit a standard regression of global temperature on time during 1998–2013, with errors assumed to be independently and identically distributed (see Fig. 1 for the fit). A two-sided hypothesis test yields a p-value of 0.102 (a one-sided test yields a p-value of 0.051). Thus, the claim of a zero warming trend during the hiatus period cannot be rejected at the 5 % significance level. The second method fits a linear regression with autocorrelated errors that follow a parametric autoregressive model with lag 1, AR(1). This model aims to directly address the year-to-year temporal dependency present in the global temperature record. Estimating the autoregression and regression parameters using the method of Cochrane and Orcutt (1949), a p-value of 0.075 is obtained for the regression slope coefficient by the bootstrap method (with one-sided p-value less than 5 %). Taking temporal dependence into account, there is now more evidence against the null hypothesis of a climate hiatus. The third method is completely nonparametric: instead of using the parametric AR(1) approach to model the temporal dependency, it uses a block bootstrap, which allows for quite general forms of temporal dependence, and yields a two-sided p-value of 0.019. There is now compelling evidence to reject the claim of no warming trend during the 1998–2013 period at the 5 % significance level (and even at the 1 % level for a one-sided test). Moreover, the p-values corresponding to starting years 1999 and 2000 are 0.005 and 0.017, respectively. Both are lower than the p-value obtained with a starting year of 1998, and thus provide stronger evidence against a hiatus. This sensitivity analysis highlights the fact that choosing 1998 as the starting year a priori favors the hiatus claim. In addition, taking the hiatus as the null hypothesis makes it harder to conclude otherwise. The assertion of a climate hiatus is nevertheless rejected at the 5 % level. We therefore conclude that there is “overwhelming evidence” against the claim that there has been no trend in global surface temperature over the past ≈ 15 years.
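The third, nonparametric method can be illustrated with a minimal sketch of a residual-based circular block bootstrap test of zero slope. This is one common variant and is not necessarily the procedure of Supplementary Section 3.1. The block length of 3, the number of resamples, and the synthetic anomaly series are all assumptions made for illustration.

```python
import numpy as np

def ols_slope(t, y):
    """Least-squares slope of y regressed on t (with intercept)."""
    tc = t - t.mean()
    return float(tc @ (y - y.mean()) / (tc @ tc))

def circular_block_resample(x, block_len, rng):
    """Concatenate randomly chosen blocks of x, wrapping past the end."""
    n = len(x)
    n_blocks = -(-n // block_len)                   # ceil(n / block_len)
    starts = rng.integers(0, n, size=n_blocks)      # random block start indices
    idx = (starts[:, None] + np.arange(block_len)) % n
    return x[idx.ravel()[:n]]                       # trim to the original length

def slope_pvalue(t, y, block_len=3, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: slope = 0."""
    rng = np.random.default_rng(seed)
    slope_obs = ols_slope(t, y)
    resid = (y - y.mean()) - slope_obs * (t - t.mean())
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Under H0 the series is a constant mean plus dependent noise.
        y_star = y.mean() + circular_block_resample(resid, block_len, rng)
        boot[b] = ols_slope(t, y_star)
    return slope_obs, float(np.mean(np.abs(boot) >= abs(slope_obs)))

# Synthetic stand-in for the 1998-2013 anomalies (NOT the real record).
years = np.arange(1998, 2014, dtype=float)
noise = np.random.default_rng(1).normal(0.0, 0.05, years.size)
anomaly = 0.4 + 0.01 * (years - 1998) + noise
slope, p_two_sided = slope_pvalue(years, anomaly)
```

Resampling whole blocks of residuals, rather than individual residuals, keeps short-range dependence intact within each block, which is why the approach does not need a specific model such as AR(1).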

Note also that, in applying progressively more general statistical techniques, the scientific conclusions have progressively strengthened from “not significant,” to “significant at the 10 % level,” and then to “significant at the 5 % level.” It is therefore clear that naive statistical approaches can possibly lead to erroneous scientific conclusions. Methods that rely upon a strong modeling assumption of no temporal dependence, or that of a specific form, are less reliable than methods that capture dependence without assuming structural knowledge of the type of dependence.
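For comparison, the parametric AR(1) approach of the second method can be sketched with the classical Cochrane and Orcutt iteration. The sketch covers only the estimation step and omits the bootstrap used to obtain the p-value. The function name, tolerance, and synthetic AR(1) series are hypothetical choices for illustration.

```python
import numpy as np

def cochrane_orcutt(t, y, max_iter=100, tol=1e-10):
    """Iteratively estimate intercept, slope and AR(1) coefficient rho."""
    X = np.column_stack([np.ones_like(t), t])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # start from plain OLS
    rho = 0.0
    for _ in range(max_iter):
        u = y - X @ beta                              # residuals on the original scale
        rho_new = float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))
        # Quasi-difference both sides so the transformed errors are
        # approximately uncorrelated, then refit by least squares.
        y_star = y[1:] - rho_new * y[:-1]
        X_star = X[1:] - rho_new * X[:-1]
        beta = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return float(beta[0]), float(beta[1]), rho

# Synthetic series with a known slope (0.01) and AR(1) coefficient (0.6).
rng = np.random.default_rng(0)
n = 2000
t = np.arange(n, dtype=float)
eps = rng.normal(0.0, 1.0, n)
err = np.empty(n)
err[0] = eps[0]
for i in range(1, n):
    err[i] = 0.6 * err[i - 1] + eps[i]
y = 1.0 + 0.01 * t + err
intercept, slope, rho = cochrane_orcutt(t, y)
```

Because the transformed intercept column equals 1 - rho, the first fitted coefficient is already the intercept on the original scale. The method is only as good as its assumption that the dependence is exactly AR(1), which is the limitation the paragraph above points to.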

Hypothesis II: difference in temperature trends

Otto et al. (2013) state that: “the rate of mean global warming has been lower over the past decade than previously.” This statement encompasses a second interpretation of the purported hiatus: that the hiatus represents a “slowdown” of global warming (Chen and Tung 2014), in which the rate of warming is less during the hiatus compared with the warming prior to the hiatus (Chen and Tung 2014; Otto et al. 2013; Smith 2013). This claim can be formulated as a testable statistical hypothesis, where the null hypothesis is that the regression slope before the hiatus period minus the regression slope during the hiatus period is zero or negative, versus the alternative hypothesis that this difference is positive.

We employ three different methods with increasing levels of statistical sophistication to test this hypothesis. Specific details of the methodology are provided in Supplementary Section 3.2. First, a standard regression of global temperature on time is fitted to both the 1998–2013 hiatus period and the period 1950–1997, with errors assumed to be independently and identically distributed (see Fig. 2 top left panel). The first method yields a p-value of 0.210. Thus, there is no evidence of a difference in warming trends even at the 10 % significance level. The second method accounts for the temporal dependency in the global temperature record by using a block bootstrap approach, yielding a p-value of 0.323. The evidence for a difference in trends is further weakened when temporal dependency is accounted for. The third approach uses the method of subsampling (Politis et al. 1999; Rajaratnam et al. 2014) to determine how the current 16-year trend during 1998–2013 compares against all the previous 16-year trends observed between 1950 and 1997. A p-value of 0.3939 is obtained and evidence for the hiatus is further weakened. From the plots in Fig. 2 (bottom panel), observe that during the 1950–1997 period, there are several 16-year periods with both higher and lower linear trends. Therefore the observed trend during 1998–2013 does not appear to be anomalous in a historical context.
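The subsampling comparison can be sketched as follows: compute the trend in every overlapping window of the earlier record that has the same length as the recent period, then ask what share of those trends is no larger than the recent one. This is a simplified reading of the idea, and the exact construction is in Supplementary Section 3.2. The data below are synthetic, and the one-sided direction of the comparison is an assumption. With 48 earlier years and 16-year windows there are 33 windows, and the reported value 0.3939 would be consistent with 13 of 33.

```python
import numpy as np

def ols_slope(t, y):
    """Least-squares slope of y regressed on t (with intercept)."""
    tc = t - t.mean()
    return float(tc @ (y - y.mean()) / (tc @ tc))

def subsampling_pvalue(years, temps, split_year=1998):
    """Share of earlier windows whose trend is no larger than the recent trend."""
    recent = years >= split_year
    width = int(recent.sum())                       # 16 for 1998-2013
    slope_recent = ols_slope(years[recent], temps[recent])
    t_old, y_old = years[~recent], temps[~recent]
    hist = np.array([
        ols_slope(t_old[i:i + width], y_old[i:i + width])
        for i in range(len(t_old) - width + 1)      # all overlapping windows
    ])
    return slope_recent, hist, float(np.mean(hist <= slope_recent))

# Synthetic stand-in for the 1950-2013 record (NOT the real data).
years = np.arange(1950, 2014, dtype=float)
temps = 0.012 * (years - 1950) + np.random.default_rng(3).normal(0.0, 0.1, years.size)
slope_recent, hist, p = subsampling_pvalue(years, temps)
```

A small value of `p` would indicate that the recent trend is unusually low relative to earlier windows of the same length; a value near 0.4 indicates that it is unremarkable.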

Fig. 2 Top panel (left): plot of the global mean land-ocean temperature index, from 1950 to 2013, with the base period of 1951–1980. The regression fits for the two time periods (1950–1997 and 1998–2013) are superimposed. Top panel (right): summary table of results for Hypothesis II. Bottom panel (left): time series plot of 16-year observed trends. Bottom panel (right): histogram of 16-year observed trends.

See Fig. 2 (top right panel) for a summary of results of hypothesis II. Varying the cut-off year from 1998 to either 1999 or 2000 yields p-values of 0.214 and 0.348, respectively, for the bootstrap method. Even after properly accounting for temporal dependence, and undertaking a sensitivity analysis, there is no compelling evidence to suggest that the slopes are significantly different. We therefore conclude that the rate of warming over the past ≈ 15 years is not appreciably different from the rate of warming prior to the recent period.

Hypothesis III: hiatus in the mean global temperature

Some claims have simply asserted that the annual mean global temperature has remained constant since 1998 (versus slowing of the trend in global warming). For example, Kosaka and Xie (2013) state that “Despite the continued increase in atmospheric greenhouse gas concentrations, the annual-mean global temperature has not risen in the twenty-first century”, while Tollefson (2014) states that “Average global temperatures hit a record high in 1998 – and then the warming stalled.” This claim can also be precisely formulated as a testable statistical hypothesis. The statistical model can be written as \(x_{t} = \mu_{t} + \varepsilon_{t}\), where \(t\) denotes time (in years), \(x_{t}\) is the 1998–2013 global mean temperature anomalies series, \(\mu_{t}\) is the mean parameter and \(\varepsilon_{t}\) is the random noise component (with \(\mathbb{E}(\varepsilon_{t}) = 0, \mathbb{V}\text{ar}(\varepsilon_{t}) = \sigma^{2}\)). The corresponding null hypothesis and alternative are given as \(H_{0}: \mathbb{E}(x_{1998}) = \mathbb{E}(x_{1998 + t}) \hspace{0.1in} \text{for} \;\; t=1,2,\cdots,15 \hspace{0.1in} \text{versus} \hspace{0.1in} H_{A}: \mathbb{E}(x_{1998}) \neq \mathbb{E}(x_{1998 + t})\).

Specific details of the methodology are provided in Supplementary Section 3.3. Hypothesis III is tested in four different ways. There are two options for determining the value of \(\mathbb{E}[x_{1998}]=\mu_{1998}\): to directly use the observed 1998 temperature record \(x_{1998}\) as a substitute for \(\mu_{1998}\), or to alternatively estimate \(\mu_{1998}\) from the regression line from the period 1950–1997. Figure 3 (top panel) illustrates this concept. As the two approaches for specifying \(\mu_{1998}\) yield fixed values, the inherent variability therein can be explicitly accounted for by using the bootstrap. Doing so propagates the variability in a rigorous manner. The table in Fig. 3 (bottom panel) summarizes the results of testing hypothesis III.
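The two ways of specifying the reference mean can be illustrated with a small sketch. It returns the single observed value for the reference year and the value of the earlier regression line extended to that year. The function name and the synthetic series, a linear trend with an artificial spike of 0.2 in 1998, are hypothetical and chosen only to show how far apart the two candidates can be.

```python
import numpy as np

def mu_ref_candidates(years, temps, ref_year=1998):
    """Return (observed value at ref_year, earlier trend line evaluated at ref_year)."""
    observed = float(temps[years == ref_year][0])
    earlier = years < ref_year
    t_mean = years[earlier].mean()
    y_mean = temps[earlier].mean()
    tc = years[earlier] - t_mean
    slope = tc @ (temps[earlier] - y_mean) / (tc @ tc)
    fitted = y_mean + slope * (ref_year - t_mean)    # extend the line to ref_year
    return observed, float(fitted)

# Synthetic series (NOT the real record): exact linear trend plus a spike in 1998.
years = np.arange(1950, 2014, dtype=float)
temps = 0.01 * (years - 1950)
temps[years == 1998] += 0.2
observed, fitted = mu_ref_candidates(years, temps)
```

In this toy series the observed value sits 0.2 above the trend-line value, so a test that treats the observed value as the true mean starts from an inflated baseline.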

Fig. 3 Top panel: figure illustrating how the mean \(\mu_{1998}\) can be estimated. Bottom panel: summary table of results for Hypothesis III with 1998 as start of hiatus period.

For Method A, when \(x_{1998}\) is used as a substitute for \(\mu_{1998}\), the statistical test concludes that the mean has decreased during the hiatus, and thus strongly favors the hiatus claim. However, since this single observed value is not a consistent estimate of \(\mu_{1998}\), the conclusion is not reliable. In Method B, when \(\mu_{1998}\) is estimated from the 1950–1997 regression line, the null hypothesis is rejected in the opposite direction, suggesting that the mean temperature has actually increased during the hiatus period. Thus, the selection effect from choosing 1998 as the reference cut-off year has a tremendous impact on the statistical conclusion. Method C, which specifically incorporates the variability inherent in estimating \(\mu_{1998}\) by \(x_{1998}\), leads to a different conclusion than Method A. In particular, as soon as this variability is incorporated, one can no longer reject the null hypothesis that the mean has remained constant, even when the high value \(x_{1998}\) is used. Method D uses a value for \(\mu_{1998}\) that is estimated from the 1950–1997 regression and also incorporates the variability of this estimate. Here the assertion that the mean has either remained constant or decreased is rejected.

Given the results of this nuanced analysis, we conclude that claims that the global mean temperature has not changed in recent decades are not supported by evidence. In addition, our analysis gives much needed rigor to the claim that using 1998 as a reference year amounts to “cherry picking” (Leber 2014; Stover 2014; see also Supplemental Section for detailed discussions). The results are further validated when the analysis is repeated with 1999 and 2000 as the starts of the hiatus period (see Supplemental Section 3.3). Note furthermore that since 2014 was the warmest year on record (Karl et al. 2015), ignoring 2014 in our analysis can be viewed as being even more conservative, similar to using 1998 as the starting point.

Hypothesis IV: difference in year-to-year temperature changes

It is also instructive to extend the analysis above without relying on a linear model to understand trends or means. One such approach is to assess whether the distribution of year-to-year temperature changes is markedly different between the hiatus period and the prior periods. Such analysis is inherently less reliant on a statistical model of temperature on time, and hence makes fewer assumptions. The scientific assertion here is that year-to-year changes in global mean temperature during 1998–2013 are different from those during 1950–1997. Under the null hypothesis, these year-to-year changes are assumed to come from a common underlying distribution, though we do not assume that the observed differences are independent. This framework also allows for testing of specific features of the distribution, including changes in the mean, median and variance. The empirical distribution of annual changes in the global temperature can be constructed by taking first differences: the global mean temperature in the previous year is subtracted from the global mean temperature in the given year. The first differences during 1998–2013 give rise to a 15-year time series of temperature changes. Differences in distribution (using the Kolmogorov-Smirnov (K-S) statistic), in means, medians and variances are tested using the block bootstrap and subsampling, thus taking temporal dependency fully into account. Specific details of the methodology are provided in Supplementary Section 3.4.
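The quantities compared under Hypothesis IV can be sketched as follows: first differences are taken separately within each period, and the two samples are then compared through the two-sample K-S statistic and through differences in mean, median and variance. Only the test statistics are computed here; the block bootstrap and subsampling steps that turn them into p-values are omitted. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])                    # the gap is attained at a data point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def yearly_changes(years, temps, split_year=1998):
    """First differences x_t - x_{t-1}, computed separately within each period."""
    recent = years >= split_year
    return np.diff(temps[~recent]), np.diff(temps[recent])

# Synthetic stand-in for the 1950-2013 record (NOT the real data).
years = np.arange(1950, 2014, dtype=float)
temps = 0.012 * (years - 1950) + np.random.default_rng(4).normal(0.0, 0.1, years.size)
prior, recent = yearly_changes(years, temps)
stats = {
    "ks": ks_statistic(prior, recent),
    "mean_diff": float(recent.mean() - prior.mean()),
    "median_diff": float(np.median(recent) - np.median(prior)),
    "var_diff": float(recent.var(ddof=1) - prior.var(ddof=1)),
}
```

With annual data for 1950–2013 this yields 47 changes for the earlier period and 15 for 1998–2013, matching the 15-year series of changes described above.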

The results of this analysis are given in Fig. 4. Using either bootstrap or subsampling there is no evidence at the 5 % significance level to suggest that the distribution of changes during the hiatus period is different from the previous period 1950–1997. The same applies to the mean and variance of the distributions. The difference in medians is not statistically significant at the 5 % level using the block bootstrap approach, but is significant when using subsampling. However, this difference in medians completely disappears when the starting year of the hiatus is changed to either 1999 or 2000, and hence the result is not robust (see Table S8 in Supplemental Section 3.4). Given these results, we conclude that the distribution of annual changes in global temperature has not been different in the past 15 years than earlier in the global temperature record.

Fig. 4 Top panel: time series plot of 15-year observed KS differences. Bottom panel: summary table of results for Hypothesis IV using bootstrap and subsampling.

Re-analyzing recently-updated global temperature observations

We have also implemented our methodology on the recently released ERSSTv4 dataset to compare our results to the results obtained in a recent paper by Karl et al. (2015). Unlike the study by Karl et al. (2015), we do not indirectly impose Gaussianity on the temperature data (in the most general approach that we propose for each hypothesis). We also do not impose an autoregressive structure for modeling the temporal dependence. Instead we account for the temporal dependency more flexibly and non-parametrically using the circular block bootstrap and related methods. Because our approach makes fewer assumptions, one can have more confidence in the general validity of the results. The end result is also compelling. First, the results in Karl et al. (2015) show a positive slope during the hiatus period (Hypothesis I) only at the 10 % significance level. Our analysis shows however that removing the arbitrary and parametric autoregressive structure on the residuals and using the block bootstrap yields significance at the 0.1 % level. The p-value stemming from our approach is less than 0.0005. The implication of the much stronger conclusion is that the probability of the warming trend observed during 1998–2014 (see Footnote 1) arising from a model of no warming is less than 1 in 2000 (as compared to less than 1 in 20 from Karl et al. (2015)). Thus the conclusion is made stronger by a factor of 100 using the methodology we have developed.

Now consider hypothesis II which compares the warming trend during the hiatus period to that in the previous period (1950–1997). Karl et al. (2015) assert that the analysis on the corrected NOAA global temperature shows that the 90 % confidence interval for the trend in the hiatus period encompasses that of the previous period. Note that this confidence interval is based on the period 1998–2012 and is thus calculated on only 15 years of data. Since the theoretical justification of such confidence intervals is valid for large sample sizes, it is not clear how reliable the conclusion really is. On the other hand, our subsampling methodology for comparing the trends in the two periods is applicable even when the sample size in the hiatus period is small. In particular, the validity of the subsampling approach here does not rely on asymptotic arguments (i.e., increasing sample sizes) during the hiatus period. Details of the analysis are given in Tables S11 and S12 in Supplementary Section 6.

Recall that the analysis by Karl et al. (2015) requires the use of the corrected NOAA dataset to reject the claim of a hiatus. We note that our analysis rejects the hiatus claim even when using the older NOAA temperature dataset (that is, even without correcting for the data biases). The use of methodology with far fewer restrictive assumptions appears to be more robust to errors in the data. This may not be unexpected since biases in the data tend to violate basic parametric assumptions, whereas the less restrictive techniques, such as the ones we develop, can handle a variety of data generating mechanisms simply by their very non-parametric nature.

Note that, by and large, the conclusions reached by Karl et al. (2015) and our conclusions agree. However, it is important to mention that an approach based on stringent or unrealistic assumptions which agrees with our conclusions for this dataset may fail to do so on another dataset.

Summary

We summarize the overall results from all four hypothesis tests I, II, III and IV in Tables 5 and 6 in Supplementary Section 4. These two tables also analyze the sensitivity of the results to two important factors: first when the cut-off year is changed from 1998 to either 1999 or 2000; and second when the NOAA or HadCRUT4 datasets are used instead of the NASA-GISS dataset. As there are four hypotheses being tested, each using a battery of rigorous test procedures, the number of hypothesis tests is large. Hence the issue of multiple hypothesis testing arises. In particular, a certain number of these hypotheses are expected to be falsely rejected by chance alone, casting further doubt on any of the hiatus claims.
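The multiple-testing remark can be made concrete with a short calculation. Let \(V\) denote the number of true null hypotheses that are falsely rejected when \(m\) true nulls are each tested at level \(\alpha\). By linearity of expectation, and regardless of any dependence between the tests, the expected number of false rejections is at most \(m\alpha\). The value \(m = 20\) below is purely illustrative and is not the number of tests reported in the supplementary tables.

```latex
\[
\mathbb{E}[V] \;=\; \sum_{i=1}^{m} \Pr\bigl(\text{reject } H_{0,i} \,\big|\, H_{0,i} \text{ true}\bigr) \;\le\; m\alpha,
\qquad \text{e.g. } m = 20,\ \alpha = 0.05 \ \Rightarrow\ \mathbb{E}[V] \le 1 .
\]
```

In this illustration, up to one rejection in a battery of twenty tests could be expected by chance alone, so an isolated rejection in favor of a hiatus carries little weight on its own.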

Our rigorous statistical framework yields strong evidence against the presence of a global warming hiatus. Accounting for temporal dependence and selection effects rejects - with overwhelming evidence - the hypothesis that there has been no trend in global surface temperature over the past ≈15 years. This analysis also highlights the potential for improper statistical assumptions to yield improper scientific conclusions. Our statistical framework also clearly rejects the hypothesis that the trend in global surface temperature has been smaller over the recent ≈ 15 year period than over the prior period. Further, our framework also rejects the hypothesis that there has been no change in global mean surface temperature over the recent ≈15 years, and the hypothesis that the distribution of annual changes in global surface temperature has been different in the past ≈15 years than earlier in the record. Taken together, these results clearly reject the presence of a hiatus, pause, or slowdown in global warming. In rejecting all four hiatus hypotheses, our results instead demonstrate that the evolution of global surface temperature over the past 1–2 decades is not abnormal or unexpected within the context of the long-term record of variability and change.

Without empirical evidence in support of the hiatus claims, the assumption that there has been a hiatus/pause/slow-down in global warming should be called into question. That being said, recent work investigating the geophysical causes of the recent temperature time series has provided valuable insights into the processes that create decadal-scale variability in global temperature within a long-term trend of global warming. Moreover, it is also useful that errors in data aggregation have been corrected in the recent work of Karl et al. (2015).