Data compilation

Our compilation comprises 194 temperature changes and associated timespans taken from 93 publications in the peer reviewed scientific literature (see Supplementary Data 1 for full data set). These changes are quoted in the text of the published works, with timespan data either also quoted or derived from figures or additional references (Supplementary Data 1). The data in the compilation are assumed to be representative of the most relevant climatic changes in a given record and of sufficient significance to the scope of the published work to be worthy of quoting. The temperature changes that our compilation provides are not necessarily representative of global changes. Rather, the data are representative of local palaeoclimatic records, which serve as the only empirical archive of Earth’s climatic evolution. SST data dominate the literature and the compilation (132 data points). We did not include polar temperature records owing to the demonstrable higher magnitudes of polar atmospheric temperature change19,22 (see also Fig. 2). However, 31 terrestrial temperature changes were included to evaluate consistency in the scaling of oceanic and continental data (Fig. 1). Nine data points are of recent temperature changes that have been directly observed over the last <150 years, and are global averages.

Inescapable biases in our compilation include an age-dependence on timespan, because timespans cannot exceed twice the median age of a succession. Equally, timespan is dependent on dating method. Milankovitch dating offers the highest resolving power in records lacking radiocarbon control (that is, any record older than ∼50 kyr) but is likely incapable of providing precision of <1,000 years12. Similarly, palaeoproxies have errors typically >1 °C (refs 28, 29, 30). The broad match between the scaling of maximum SST changes in our compilation and the ensembled pattern of SST changes of the past ∼13 Myr (which have age models based primarily on Milankovitch dating) suggest that any biases related to age or dating control do not affect the veracity of our inferences, which is also supported by our Monte Carlo error analysis.

Ensemble data

Low and mid latitude, multiproxy SST data spanning the past ∼13 Myr were compiled from 27 individual records from 20 separate, globally distributed ocean drilling sites. SST data were compiled using the published proxy temperature calibrations and age models as indicated in Supplementary Table 1.

Data analysis

Scaling statistics (slopes) were conducted using least squares regression of log-transformed data. Standard error of the slopes in the raw data were calculated using a bootstrap approach (random sampling with replacement) with 10,000 iterations. Hypothesis testing was based on non-parametric correlation tests (Spearman’s ρ), because all of the timespan data and most of the magnitude data are significantly not normally distributed even after log transformation (Shapiro–Wilk test). The quoted P values represent the probability of correlation coefficients at least as high as calculated arising by chance if no correlation existed. As outlined in the main text, our statistical analyses specifically tested the scaling of magnitude with timespan owing to the spurious correlation that arises when investigating the relationship between timespan and rate15. Taking the actual timespans but randomly shuffling the associated magnitude data produces a significant correlation (P value<0.05) between timespan and calculated rate in 100% of 10,000 trials (mean ρ of −0.98). Conversely, 4.84% of trials yielded significant P values in the regression analysis of timespan and magnitude (mean ρ of 0).

The sensitivity of the scaling relationship between timespan and magnitude of temperature change to errors was tested using a Monte Carlo approach. 10,000 versions of our compilation were created with randomly applied normally distributed timespan errors of ±50% (2σ), and normally distributed magnitude errors of ±4 °C (2σ). Our compilation is dominated by O-isotope, TEX 86 , alkenone and Mg/Ca proxies (153 data points), which have typical errors of ≤4 °C (refs 1, 28, 29, 30). Our modelled errors likely exceed the magnitude uncertainties in the records we studied because we are interested only in relative changes in temperature within individual records rather than the absolute temperatures. In our approach, magnitude trends can change sign after addition of errors (that is, warming becomes cooling). Tests using a routine that ignored trend reversals caused by the addition of errors (‘rejection sampling’) demonstrated that the combined oceanic and continental data have P values of <0.01 in a fractionally lower proportion of simulations (99.92% versus 99.99%). Issues of stratigraphic incompleteness are unlikely to impact timespan estimates unless the timespan is defined through a cumulative dating method such as astronomical cycle calibration, where missing cycles lead to an underestimation of true timespan. Errors of the size we use here for Monte Carlo analysis are nevertheless unlikely to result from incompleteness. Indeed, tests using much larger timespan errors of ±70% (2σ) still yielded significant P values (P value<0.01) in >99% of simulations (simulations with timespans that become negative after error addition are rejected). The mean Spearman’s ρ coefficient describing the correlation between timespan and magnitude in our 10,000 simulations for the combined ocean and continental data is 0.37±0.09 (2σ). Analysis of separate continental and ocean data demonstrates that the scaling in the continental data is unlikely to be robust in the presence of significant errors owing to the small size of this data set (N=31), with P values of <0.05 yielded in ∼76% of simulations, with a mean ρ of 0.43±0.22. In contrast, analysis of the oceanic data demonstrates P values <0.05 in 99.99% of simulations, with mean ρ=0.35±0.10. The mean of the calculated slopes in our simulations was 0.09, with an uncertainty of±0.03 (2σ). Thus, the uncertainty in the slope from the Monte Carlo error analysis is the same as that derived from the regression and bootstrapping of the raw data (0.10±0.03), albeit with a fractionally lower slope value. We employed bootstrapping (N=1,000) to quantify the 95% uncertainty interval for the calculated slope of each individual Monte Carlo simulation. Our results indicate that slopes were significantly >0 in >99% of the simulations.

To extract timespan-dependent temperature changes from the ensemble and polar temperature data presented in Fig. 2, each temperature value in each individual record was subtracted from every other value in the record to define magnitudes. The published age for each value in the records provided timespan information. The routine worked forwards in time (that is, starting with the oldest value) to define trends (warming or cooling) and avoid duplication. To define maximum attained magnitudes of temperature changes (that is, black and grey lines and red dashed lines in Fig. 2), the magnitude data were binned in 0.5 log bins with the largest value in each bin defining the maximum magnitude for that timespan range.