Global mean surface temperatures are rising in response to anthropogenic greenhouse gas emissions. The magnitude of this warming at equilibrium for a given radiative forcing—referred to as specific equilibrium climate sensitivity (S)—is still subject to uncertainties. We estimate global mean temperature variations and S using a 784,000-year-long field reconstruction of sea surface temperatures and a transient paleoclimate model simulation. Our results reveal that S is strongly dependent on the climate background state, with significantly larger values attained during warm phases. Using the Representative Concentration Pathway 8.5 for future greenhouse radiative forcing, we find that the range of paleo-based estimates of Earth’s future warming by 2100 CE overlaps with the upper range of climate simulations conducted as part of the Coupled Model Intercomparison Project Phase 5 (CMIP5). Furthermore, we find that within the 21st century, global mean temperatures will very likely exceed maximum levels reconstructed for the last 784,000 years. On the basis of temperature data from eight glacial cycles, our results provide an independent validation of the magnitude of current CMIP5 warming projections.

Keywords

Specific equilibrium climate sensitivity (S) is a key parameter to characterize the response of our climate system to external perturbations of the radiative balance, such as the recent increase in greenhouse gas concentrations. Paleoclimate data have been previously used to determine the possible range of S ( 1 – 5 ). The 5 to 95% range for paleovalues of S is estimated to be 1° to 6°C per CO 2 doubling ( 6 ). Numerous studies have demonstrated that S may have very different values for cold (for example, glacial) and warm (for example, interglacial) periods ( 7 – 10 ). To further constrain S and determine its potential background state dependence from paleoclimate data, we estimate global mean surface air temperature (SAT) variability from a network of paleo–sea surface temperature (SST) proxies extending back to 784 ka (thousand years ago) and a transient Earth system model simulation ( 11 ) conducted with the LOVECLIM (LOch-Vecode-Ecbilt-CLio-agIsm Model) coupled atmosphere–ocean–sea ice–vegetation model ( 12 ). Appropriate scaling factors need to be applied to capture the fact that the SST proxy network spatially undersamples SAT variability and that the climate model simulation underestimates the magnitude of the reconstructed SST variability. The relationship between reconstructed past SAT variability and estimates of the radiative forcing will provide key insights into the nonlinearity of S over the past 784 ka and allow us to compute the magnitude of future greenhouse gas–induced warming from paleoclimate data.

Another source of uncertainty is introduced through different warming levels during previous interglacial periods. For example, our proxy-based temperature reconstruction exhibits an overall stronger interglacial warming than does our model-based one ( Fig. 2A ). As a result, S warm derived by using only the proxy-based global mean SAT reconstruction is slightly larger than the S warm calculated from the model-based reconstruction (table S3), even though the PI-LGM temperature amplitude is larger for the model-based SAT change. This highlights the need for a combined model-proxy approach for deriving SAT reconstructions.

The radiative forcing and global mean SAT reconstruction are both subject to considerable uncertainties that affect the estimate of S and thus the resulting global warming projection. Dust changes are the largest contributor to the uncertainty in global radiative forcing. Reported values for the glacial-interglacial amplitude of the global mean dust forcing range from ~0.33 W/m 2 ( 29 ) to more than 3 W/m 2 [see discussion by Köhler et al. ( 1 )]. Here, we assume a published value of 1.9 W/m 2 that is prone to large uncertainties. Regarding estimates of global mean SAT change, there appears to be a reasonable agreement between different studies with respect to the temporal evolution (fig. S5). However, the correct PI-LGM amplitude of global mean SAT change remains less constrained. Published amplitudes range from relatively moderate values of 3.00 K (90% probability range of 1.7 to 3.7 K) ( 3 ) and 4.00 ± 0.80 K ( 21 ) to values of 5.80 ± 1.4 K ( 18 ) and 6.20 K (90% probability range of 4.60 to 8.30 K) ( 19 ). To calculate the effect of uncertainties in temperature reconstructions on our estimate of S warm and the resulting global warming projection, we scale our global mean SAT reconstruction to other reported values. Furthermore, we use an Antarctic temperature reconstruction ( 30 ) scaled by a high and a low polar amplification factor ( 31 ). Using a PI-LGM amplitude of only 3 K ( 3 ) for our global mean SAT reconstruction results in an S warm of only 0.68 K W −1 m 2 (or 2.52 K per CO 2 doubling) (table S3), which is approximately half the value found for our original temperature reconstruction. The associated greenhouse warming for the year 2100 amounts to 3.84 K. An amplitude of 4.00 K, as derived from a multimodel ensemble in combination with global proxy data ( 21 ), results in a warming of 4.68 K by 2100 CE. On the basis of the estimates of S using scaled Antarctic temperature reconstruction, the global warming at the end of the 21st century amounts to 6.32 and 4.87 K for the low (1.2) and the high (1.9) polar amplification, respectively.

Ensemble mean simulated global mean surface temperature (K) evolution using all models of the CMIP5 multimodel ensemble in their historical and RCP8.5 simulations (thick red line) and corresponding uncertainty range (orange shading), along with estimates (blue) based on S warm , an estimate of ocean heat uptake efficiency and the RCP8.5 radiative forcing time series. The corresponding uncertainty range is depicted as cyan shading (see Materials and Methods).

To use paleoclimate estimates of S to calculate future warming, we have to take into account the fact that the anthropogenic SAT increase until the end of the 21st century cannot be regarded as an equilibrium response to the perturbation in Earth’s radiation balance but as a transient process in which the oceanic heat uptake is critical. On the basis of a recent approximation for the slowdown effect of the ocean’s thermal inertia on the warming ( 28 ), we can convert the specific equilibrium climate sensitivity into a transient climate response (see Materials and Methods). With an ensemble mean estimate of the ocean heat uptake efficiency ( 28 ) and the above S warm of 4.88 ± 0.57 K per CO 2 doubling, the transient climate response amounts to 2.74 K per CO 2 doubling, with a likely range (indicating a 66 to 100% likelihood) of 2.23 to 3.43 K per CO 2 doubling. Moreover, using our paleodata-based estimate of the transient climate response and the radiative forcing of the RCP8.5 scenario ( Fig. 2E ), we then calculate the global SAT evolution over the next 85 years. The paleodata-based projection of future warming results in larger global mean SAT anomalies compared to the CMIP5 ensemble mean SAT projection ( Fig. 4 ). In response to the RCP8.5 greenhouse gas emission scenario, the global SAT increase between 2100 CE and 1880 CE (PI conditions) amounts to 5.86 K (with a likely range of 4.78 to 7.36 K) when estimated based on eight glacial cycles of SST data and to 4.86 K (with an ensemble range of 3.42 to 6.40 K) when simulated by the CMIP5 ensemble mean.

Scatter diagram (circles) of reconstructed global mean SAT anomalies (K) ( Fig. 2B ) versus net radiative forcing anomalies (W/m 2 ) ( Fig. 2D ) for the last 784,000 years. Anomalies are calculated with respect to PI values. Two-dimensional kernel density estimate of paleo-SAT/radiative forcing data (blue shading). The thick dashed yellow curve represents nonlinear regression of paleo-SAT/radiative forcing data, along with uncertainty ranges (dashed black curves; see Materials and Methods). The thick cyan line represents linear regression for cold phases. The slope represents S cold . The thick red line represents linear regression for warm phases. The slope represents S warm . Dashed horizontal lines denote warm (orange) and cold (blue) phases using 1 SD of the reconstructed global mean SAT anomalies as a separator. Cold (warm) phases are defined by SAT anomalies of <−5.12 K (>−1.66 K). The CMIP5 transient model projections using the RCP8.5 forcing scenario are presented by purple circles. Using S warm (orange shading) and taking into account the ocean heat uptake efficiency, we can calculate the transient response to the RCP8.5 radiative forcing. The resulting paleo-based projection with the corresponding uncertainty ranges is represented by cyan shading (see Materials and Methods).

The specific equilibrium climate sensitivity is calculated as the change in global mean SAT for a given change in radiative forcing. To allow for a comparison with the CMIP5 projections, our estimate of S needs to be consistent with the CMIP5 simulations regarding applied forcings and resolved feedback processes. Thus, we calculate S as a climate sensitivity, for which albedo changes due to sea ice, land snow, and vegetation coverage are treated as feedbacks rather than as forcings. Explicitly considered radiative forcings for our calculation of S include changes in greenhouse gases (GHG), land ice (LI), and aerosols (AE) [referred to as S GHG,LI,AE in the study by Rohling et al. ( 27 ); see Materials and Methods for details and for an additional calculation using vegetation as forcing]. A scatter diagram ( Fig. 3 ) between reconstructed global mean SAT anomalies ( Fig. 2B ) and radiative forcing anomalies ( Fig. 2D ) (relative to their PI states) reveals a considerable background state dependence of the climate sensitivity. The local slope of the scatter plot increases notably with increasing global mean SAT, thus indicating an underlying nonlinearity in S. This result is qualitatively in good agreement with another recent study ( 7 ). When calculating S for specific periods of time, dating uncertainties can result in large biases of S. Thus, to obtain and compare average specific equilibrium climate sensitivities for cold and warm climates during the last 784 ka, we apply a polynomial fit to the entire data set (see Materials and Methods and Fig. 3 , yellow curve) and then calculate the mean slope of this fit for cold and warm phases ( Fig. 3 , cyan and red lines). Here, we use 1 SD in SAT as a separator so that warm (cold) phases and the corresponding S are defined by SATs of at least 1 SD above (below) the mean SAT ( Fig. 3 , dashed horizontal lines). The resulting mean of S for cold climates (S cold ) amounts to 0.48 K W −1 m 2 , which corresponds to 1.78 K per CO 2 doubling. For warm climates, the value (S warm ) is more than two times larger, attaining 1.32 K W −1 m 2 or 4.88 K per CO 2 doubling. The average of S over the entire 784-ka range can be calculated from a linear regression of the SAT/radiative forcing data set. It amounts to 3.22 K per CO 2 doubling. Comparing the mean of S to S warm , it becomes apparent that this long-term mean value substantially underestimates S warm and thus should not be used to assess future anthropogenic warming.

Earlier estimates of S mainly relied on deep ocean temperature conversion of oxygen isotope data ( 22 ), assumptions regarding polar amplification ( 7 ), shorter paleorecords ( 2 , 26 ), or paleo–time slices ( 3 , 5 ). The nature of our globally averaged SAT reconstructions allows us to independently estimate S and its background state dependence by using an extended data period and a network of surface temperature proxies. Estimates of greenhouse and dust radiative forcings for the last 784 ka are obtained following Rohling et al. ( 2 ) and Köhler et al. ( 1 ), respectively (see Materials and Methods). Upper atmosphere net shortwave radiative forcing from ice sheets is directly calculated from the 784-ka transient Earth system model simulation ( Fig. 2D and Materials and Methods). The resulting combined radiative forcing anomaly ( Fig. 2D ) reaches minimum values of −6.5 W/m 2 (relative to the PI mean state) during extreme glacials, such as the LGM.

Carbon emissions from human activities have pushed the atmospheric carbon dioxide concentration substantially above the maximum values reached during interglacials in the last 800,000 years ( 25 ). Reaching atmospheric CO 2 concentrations of up to ~900 parts per million (ppm) by 2100 CE, the ensemble of Representative Concentration Pathway 8.5 (RCP8.5) scenario Coupled Model Intercomparison Project Phase 5 (CMIP5) simulations ( 6 ) projects a global mean SAT rise (relative to PI conditions) of 4.84 K with an ensemble range of 3.42 to 6.40 K. Using the average for the most recent millennium of our 784-ka global mean SAT reconstruction as a preindustrial reference state allows us to contextualize the projected warming with respect to the range of past glacial-interglacial climate variability. It becomes evident ( Fig. 2 , B and C) that the maximum global mean paleo-SAT reconstructed for MIS5e will very likely be exceeded by the ongoing greenhouse warming within the 21st century. The RCP8.5 ensemble mean CMIP5 projections for 2100 CE correspond to a tripling of the maximum long-term warming (relative to PI conditions), as reconstructed for the last eight glacial cycles. However, because our temperature reconstruction is based on 1000-year averages, we cannot rule out the possibility that global mean SAT in the past has been warmer on time scales shorter than the one considered by our reconstruction.

In a last step and to highlight the common variability, the two independent global SAT estimates are averaged. For the time period 10 to 0 ka B.P., which is not covered by the proxy-based reconstruction, only the model-based estimate is used. Extending back to 784 ka, the new global SAT reconstruction ( Fig. 2B ) illustrates that the Holocene has been significantly colder than the interglacial states MIS5e and MIS11 by ~1.5 K. This result is supported by another recent SAT estimate ( 22 ) that uses deep ocean temperatures extracted from an oxygen isotope compilation ( 23 ). Furthermore, the warmer conditions shown for MIS5e and MIS11 are in good qualitative agreement with a 430,000-year sea-level reconstruction ( 24 ) that exhibits global sea-level high stands of 4.7 to 6.4 m above present-day values for both periods.

The SAT SDs amount to 1.86 K for the model-based reconstruction and 1.76 K for the proxy-based one. The global SAT difference between the LGM and the Holocene (represented here by the 10 ka B.P. estimate) amounts to ~5.0°C when derived from the paleoproxies and ~6.5°C for the model-based estimate. Both values are consistent with other combined model-proxy estimates ( 18 , 19 ) but slightly larger than in a recent multiproxy study ( 20 ) or as derived from a multimodel ensemble in combination with global proxy data ( 21 ).

Globally averaged SAT (K) and radiative forcing anomalies (W/m 2 ) for the last 784 ka and for the RCP8.5 scenario. ( A ) Global mean SAT anomalies (K) reconstructed from 14 long-term paleoproxies of SST (blue line) and from the transient model simulation (red line). Anomalies were calculated with respect to PI times. ( B ) Averaged, reconstructed global mean SAT anomaly (K, black line). Shading denotes uncertainty of ±2.12 K (see Materials and Methods). ( C ) CMIP5 ensemble mean projection for globally averaged SAT increase (K) with respect to PI mean state using RCP8.5 (black line) ( 6 ). Shading denotes ensemble SD. Dashed horizontal lines in (B) and (C) denote reconstructed maximum global mean SAT during the last 784,000 years (B) and its exceedance (C). ( D ) Radiative forcing anomalies (W/m 2 ) with respect to PI mean state. Cyan, dust forcing; red, greenhouse gas forcing; black, ice sheet forcing; magenta, orbital forcing; brown, sea-level forcing; blue, sum of radiative forcings (see also fig. S7). ( E ) Radiative forcing anomalies (W/m 2 ) with respect to PI mean state used for the CMIP5 RCP8.5 simulations ( 6 ).

On the basis of these methods, we can now derive two independent reconstructions of glacial-interglacial global mean SAT variations by using the first empirical orthogonal function mode (fig. S4) of the 14 SST reconstructions and the SST model output (at every ocean grid point), multiplied by the respective factors that account for the shortcomings and uncertainties discussed above (see Materials and Methods). The resulting paleo-SAT reconstructions, along with error bars that take into account systematic uncertainties (see Materials and Methods), show that the overall timing and amplitude of the two independent global SAT reconstructions are in good agreement ( Fig. 2A ) (correlation of r = 0.79), thus supporting the robustness of our methodology. Noticeable differences between the two reconstructions are visible for the glacial stages marine isotope stage 2 (MIS2), MIS6, and MIS12 and for the interglacial stages MIS5e and MIS11. For the warm phases MIS5e and MIS11, the proxy-based SAT reconstruction exceeds the warming obtained from the model-based one. This mismatch is a well-known feature ( 15 – 17 ) that can be partially explained by age model uncertainties of the SST proxies and potential biases of recorded SSTs toward summer ( 17 ). However, difficulties in simulating the response of vegetation cover, sea ice extent, and continental ice sheets to interglacial warming are also likely to contribute to this model-proxy mismatch.

An independent reconstruction for glacial-interglacial global mean temperature variability is based on the transient Earth system model simulation covering the past 784 ka (see Materials and Methods). Here, a different issue arises from the fact that the model simulation tends to underestimate the magnitude of temporal SST variability ( 14 ) (figs. S2 and S3) as a result of too low climate sensitivity. To quantify the magnitude of underestimation of SST, we compiled 63 globally distributed paleo-SST records that cover the period 140 to 10 ka B.P. ( Fig. 1 and table S2). All core data are interpolated to a regular time axis with a 1000-year resolution, and the SST anomaly of the splined data is calculated. In a second step, the simulated SST data from the transient model simulation for the same time period are subsampled for the 63 core locations. Finally, after averaging the SST anomalies over all 63 locations for proxy data and model, we calculate the model amplitude correction factor as the SD ratio of reconstructed and simulated anomalies (see Materials and Methods). Simulated SST variability can then be translated into global mean SAT by accounting for the underestimation with respect to the proxy data, followed by a conversion of global mean SST to global mean SAT (see Materials and Methods).

For a proxy-based reconstruction of global SAT variations, we selected a network of 14 long-term paleoproxy SST records ( Fig. 1 and table S1) based on the requirement that the SST reconstructions correspond to the 784-ka time period covered by our transient Earth system model simulation. After the data on the original age models were interpolated to a common time axis, a conversion factor was estimated, which accounts for the spatial subsampling and scales the dominant variability represented by the stack of 14 SST records to global mean surface temperature variability (see Materials and Methods). This factor is determined from an ensemble of eight models from the Paleoclimate Modeling Intercomparison Project Phase 3 (PMIP3) ( 13 ). By spatially subsampling the eight-member multimodel SST data for the Last Glacial Maximum (LGM) and preindustrial (PI) control simulations at the 14 core locations ( Fig. 1 ), we calculate the simulated average LGM-PI SST difference (see Materials and Methods). Subsequently, this value is compared to the simulated globally averaged LGM-PI SAT difference (fig. S1A). For all eight PMIP3 models, simulated SST anomalies averaged over the 14 locations are lower than the global SAT change by a factor of Ψ = 1.95 ± 0.22 (uncertainty ranges in this article refer to ±1 SD) (fig. S1B). Ψ will serve as our conversion factor to translate the paleo-SST data into a global SAT anomaly estimate.

Constraining the magnitude of future greenhouse warming is critical for risk assessment and adaptation strategies. Using our combined proxy/modeling approach based on the 784-ka SST data and applying it to the projected atmospheric CO 2 concentrations and radiative forcings results in SAT changes that overlap with the upper range of current CMIP5 RCP8.5 projections. The resulting paleodata-based estimate of surface warming by 2100 CE is ~16% higher than the CMIP5 ensemble mean projection. Our results suggest that a global surface temperature increase of 4 K by 2100 CE (compared to the PI reference value) is likely. Furthermore, our analysis demonstrates that in the case of unabated anthropogenic CO 2 emissions, the maximum long-term global mean SATs ever obtained during the past eight glacial cycles will very likely be surpassed within the 21st century. According to our results, the Earth’s specific equilibrium climate sensitivity is a function of the background climate with a substantially higher sensitivity during warm phases. It remains unclear whether this relationship will hold in climates substantially warmer than during the last eight glacial cycles. Therefore, we restrict our warming estimate to the 21st century and refrain from applying our method to potential greenhouse warming in a more distant future. Uncertainties associated with past temperature and radiative forcing data are still relatively large. These uncertainties directly affect the estimate of S, limiting a more accurate paleo-based projection of future greenhouse warming. However, our independent future warming estimates and their associated uncertainty ranges overlap with the CMIP5 RCP8.5 projections, thus providing further evidence for the climate model–based warming projections.

MATERIALS AND METHODS

Paleorecords of SST We used a total of 63 globally distributed paleoproxy records of SST. The SST reconstructions used to derive a proxy-based global SAT estimate were required to cover the period 784 to 10 ka B.P. to correspond to the time range of the model simulation (784 to 0 ka B.P.). The 14 long-term records are listed in table S1. To determine the underestimation of simulated glacial-interglacial SST variability, 49 records were used in addition to the long-term records. These 49 records were required to cover the previous glacial cycle (140 to 10 ka B.P.). They are listed in table S2. All 63 record locations are shown in Fig. 1.

Transient model simulation We used the Earth system model of intermediate complexity LOVECLIM (12). LOVECLIM is a coupled atmosphere–ocean–sea ice–vegetation marine carbon cycle model. However, the marine carbon cycle component was switched off for the present simulations. Because of its computational efficiency, the LOVECLIM model has been extensively used for transient paleoclimate studies of varying complexity and duration (11, 32–34). Our simulations were forced by transient values of atmospheric greenhouse gas concentrations, orbital parameters, and Northern Hemispheric ice sheet extent and height. Greenhouse gas concentrations were taken from ice core data following the methodology described by Timmermann et al. (11). Orbital parameters of precession, obliquity, and eccentricity follow those of Berger (35). Northern Hemispheric ice sheet extent and height were obtained from a recent simulation conducted with the CLIMBER Earth system model of intermediate complexity (36). The transient forcings for LOVECLIM were applied with an acceleration factor (37) of 5 so that 784,000 forcing years correspond to 156,800 model years. Orbital parameters and greenhouse gas concentrations were updated every model year. Ice sheet extent and height were updated every 200 model years (every 1000 forcing years). Transient changes in bathymetry and Antarctic ice sheets as well as meltwater pulses from retreating ice sheets were not taken into account. The simulation was conducted using an LGM bathymetry (38) to avoid internally generated, self-sustained oscillations in the Atlantic Meridional Overturning Circulation (39). The transient simulation was split into eight segments, each starting at an interglacial period. Every individual segment was spun-up for 5000 model years under constant climate boundary conditions and then run in parallel to save integration time. The individual segments overlapped by 10,000 forcing years (2000 model years). A complete temporal evolution of the model output was derived by concatenating the individual segments using a sliding linear interpolation for the overlapping periods.

Model-data comparison To document the performance of our model simulation, we validated the simulated SST using available paleoproxy-based SST reconstructions that cover our modeling period of the last 784 ka (see also table S1). Figure S2 compares all 14 reconstructed paleo-SST time series to the SST simulated in the model grid cell closest to the respective core location. For the sake of simplicity of handling the large amount of model data, only averages over 1000 forcing years were considered for simulated SST. Almost all SST records exhibited pronounced glacial-interglacial variability that was well captured by the simulated SST. The timing of temperature change on orbital time scales is in excellent agreement for most reconstructed and simulated SST data sets. However, the amplitude of glacial-interglacial SST change was underestimated by the model simulation for almost all core locations. This underestimation relative to the reconstructions may result from different sources, including the uncertainty in the climate sensitivity of LOVECLIM. Comparing the LGM-PI SST difference from a recent compilation (40) to our model result (fig. S6), it can be seen that areas that appear to exhibit warmer SSTs during the LGM are not reproduced by our simulation. A strong cooling in the Southern Ocean and the subpolar North Atlantic were shown in both the modeled and reconstructed SST. However, the amplitude of the cooling was underestimated by our model simulation, in agreement with the time series shown in fig. S2. To identify the leading spatiotemporal pattern of SST variations, we computed the first empirical orthogonal function (EOF1) for simulated and reconstructed SSTs. The 14 SST records were splined to a regular time axis of 1-ka resolution to match the temporal resolution of the time-averaged, simulated SST. For the simulated LOVECLIM SST, the entire model domain was taken. In the case of the reconstructed SST data, the EOF1 explained 54% of the joint variance. For the simulated SST, the explained variance of the first EOF amounted to 87%. In both cases, the EOF1 mode was characterized by a monopole pattern of different spatial magnitudes (fig. S4C). From the simulated data, it can be seen that the magnitude of the EOF1 loading is a function of latitude, reaching its maximum values in areas of large glacial-interglacial variability of sea ice extent, such as the subpolar North Atlantic and the Southern Ocean. Tropical regions exhibited a substantially smaller EOF1 loading. This result was supported by the EOF1 of the 14 cores. The principal components of the first EOFs (PC1s; fig. S4A) exhibited excellent agreement with regard to the timing of the joint variability that represents the glacial-interglacial SST change. Notable differences in the amplitudes of the PC1s occurred during the last interglacial period (around 125 ka B.P.) and during MIS12 (424 to 478 ka B.P.). The pronounced warming during the last interglacial seen in the PC1 of the reconstructed SST data was not reproduced by the simulated SST. The PC1 of the 14 paleorecords suggests that SSTs during MIS12 were lower than those during the LGM. The latter was not evident in the model data.

SAT reconstructions The main goal of our study was to obtain two independent reconstructions of SAT variations for the last 784 ka: one mainly based on paleorecords and the other one mainly stemming from the transient modeling results. Both sources of data are subject to uncertainties and shortcomings that need to be alleviated. One of the major caveats on the side of paleorecords is the spatial scarcity of SST data that cover this long time period. Being restricted to only 14 SST records, it needs to be quantified to what degree this limited number of records is representative of the amplitude and the timing of globally averaged SAT variations. For the climate model simulation, we need to address the fact that the low climate sensitivity of LOVECLIM [~2.30 K per CO 2 doubling (14)] leads to a systematic underestimation of SST variability with respect to the reconstructions. Method 1: Global SAT reconstruction based on paleoproxy SST. To test how representative the sample of 14 core locations is to capture the global mean SST and global mean SAT variability, we first addressed this question using pseudoproxies from the LOVECLIM simulation and from LGM and PI simulations conducted with an ensemble of coupled general circulation models. We spatially subsampled the simulated LOVECLIM SST at the 14 long-term record locations and computed the leading EOF. Comparing the PC1 derived from the subsampled pseudoproxy SST data and the global model SST data, it can be seen that they are virtually identical (fig. S4A), providing us with confidence that the 14 record locations are capable of capturing the timing of the global temperature evolution. Next, to evaluate to what degree the glacial-interglacial amplitude of global mean SAT change is captured by the sparse proxy network of the 14 locations, we took advantage of the output from eight models from the PMIP3 (13). In the context of PMIP3, numerous climate models were used to simulate the LGM and PI periods in coordinated time slice experiments. To determine the underestimation SST/SAT factor that would apply to an average derived from our specific 14 core locations, we calculated the mean LGM-PI SST difference in the PMIP model simulations by subsampling and then averaging the model LGM-PI SST difference (ΔSST′Model = SST LGM − SST PI ) at the 14 record locations (1)Then, the ratio is calculated for each model between the global mean LGM-PI SAT difference and the LGM-PI SST difference averaged over the 14 record locations (2)Figure S1A shows that the true simulated globally averaged LGM-PI SAT amplitude for the PMIP3 models is in the range of 2.70 to 5.41 K. With the average of the 14 core locations, the averaged SST range amounts to only 1.22 to 2.97 K. The underestimation factor Ψ Model varies between the models from 1.75 to 2.35 (fig. S1B). The eight-model average of Ψ amounts to 1.95 with an SD of 0.22 and will serve as our best estimate to take the effect of spatial undersampling and the difference between ocean and land temperatures into account. A reconstruction of the globally averaged SAT variation can then be derived from the first EOF of the 14 SST paleoproxy records (fig. S4B) and the PMIP3 ensemble mean Ψ factor (3)where represents the ith component of the first proxy EOF vector and PC1(t) is the corresponding principal component. The SD of Ψ corresponds to 11% of the total value. Together with half the maximum amplitude of the proxy-based SAT reconstruction of 4.90 K, the resulting uncertainty estimate amounts to ±0.54 K. An additional source of error is associated with the representativeness of our SST paleoproxy network. To test for the effect of this network uncertainty on the PC1 and thus our SAT estimate, we randomly chose 8 of the 14 records to derive a new PC1 (4)where x SST represents the vector of the 8 randomly chosen SST records out of 14. Subsequently, this new PC1 is projected on the EOF1 pattern and scaled by the underestimation factor Ψ, as described above (Eq. 3), and a new globally averaged SAT anomaly along with its deviation from the original SAT anomaly is derived. The SD between the permutation-derived and the original SAT reconstruction amounts to 0.96 K and represents our network uncertainty. Taking into account both sources of uncertainty, the overall error estimate for our proxy-based SAT reconstruction amounts to ±1.50 K. The proxy-based reconstruction of the globally averaged SAT variation is shown in Fig. 2A (blue line). Note that the paleo-SST records are subject to potential error sources that cannot be directly included in our calculation. These errors include seasonal biases of the respective SST proxy, analytical and calibration errors, and individual age model uncertainties. Method 2: Global SAT reconstruction based on paleomodel simulation. To reconstruct the globally averaged SAT variations using our transient model LOVECLIM simulation, two different factors have to be considered. First, a conversion from a globally averaged SST anomaly to a globally averaged SAT anomaly for our particular model needs to be derived. This can be done by calculating the ratio between the SDs (σ) of the two time series of the respective anomalies (5)Second, a potential underestimation of glacial-interglacial SST variations due to biases in the climate sensitivity needs to be taken into account. As mentioned above and shown in fig. S2, even though the temporal temperature evolution at the core locations is well captured by the transient model run, the glacial-interglacial amplitude seen in the paleo-SST records is underestimated by the simulated data. To determine this underestimation on a global scale, we used 63 globally distributed SST records covering the last glacial-interglacial cycle (namely, 140 to 10 ka B.P.; Fig. 1 and table S2). The SST records are again splined to a regular time axis of 1-ka resolution to match the temporal resolution of the time-averaged, simulated SST. Then, the simulated SST data were subsampled for the time period 140 to 10 ka B.P. at the model grid cells corresponding with the proxy record locations. Subsequently, for the average over all core locations (i = 1,63), the ratio of SDs was calculated from the splined SST proxy records and the anomalies of the simulated SST subsampled at the 63 core locations. Figure S3 shows a map of the local ratio Γ of the SDs (6) The ratio of the two SDs amounts to 1.97 and represents the best estimate for the underestimation of globally averaged glacial-interglacial SST variability by the model relative to the SST proxy data. We can then derive a model-based reconstruction of globally averaged SAT variations by multiplying the SST anomaly, reconstructed from the first EOF, with the two factors determined above (7)where N represents the number of model grid points. We assume a similar ±1.50 K error for the model-based SAT reconstruction as for the proxy-based one. The model-based reconstruction is shown in Fig. 2A. In the final step, the two independent estimates (proxy- and model-based) were averaged. For the time period 10 to 0 ka B.P. that is not covered by the proxy-based reconstruction, only the model-based estimate was taken into account. Given the uncertainty estimates of the individual reconstructions, the overall uncertainty of our combined SAT reconstruction amounts to ±2.12 K. The combined SAT reconstruction for the last 784 ka is shown in Fig. 2B. Figure S5 compares our averaged SAT reconstruction to other recent estimates of temperature anomalies. With regard to the temporal evolution of temperature anomalies, there is a good agreement between all estimates for the period 500 to 0 ka B.P. Notable mismatches in the phase relationships are visible for earlier periods, particularly for the recent SST stack (41). Glacial-interglacial temperature amplitudes for Antarctica data (30, 42) are about 1.4 times larger than for our global mean reconstruction, which points to a magnitude of Antarctic polar amplification that is in agreement with other estimates (31).

Calculation of radiative forcing Radiative greenhouse gas forcing is calculated following Hansen et al. (43) and using (8)using = 279 ppmv (ppm by volume), = 633 ppb (parts per billion), and N 2 Oref = 270 ppb. The time series of greenhouse gas concentrations are taken from ice core data over the past 784 ka (25, 44). The dust radiative forcing uses the logarithm of the dust flux recorded in the EPICA Dome C ice core (45) and assumes a PI/LGM radiative forcing difference of 1.9 W/m2 (1). The annual mean shortwave forcing over the past 784 ka was calculated directly from the transient LOVECLIM climate model simulation by focusing on shortwave changes occurring (i) as a result of planetary albedo changes over the ice sheets (9)where α p (t) represents the planetary albedo changes and 〈…〉 represents the annual mean. Note that LOVECLIM does not include a shortwave cloud feedback. Therefore, we can directly use the planetary albedo changes over ice-covered regions for the calculation of the forcing. The ice mask has a value of 1 if an ice sheet is present, and 0 otherwise. Q inc represents the incoming shortwave radiation. The typical LGM-to-PI difference for F SW,ICE (t) is −1.45 W/m2 (fig. S7). Note that the choice of albedo is critical when calculating radiative forcing anomalies. The glacial-interglacial change in surface albedo of ice sheet areas simulated by our model amounts to ~0.4. This value is in good agreement with a surface cover change from ice sheets (albedo of ~0.7) to a seasonal mix of vegetation and snow (annual mean albedo of ~0.3 to 0.4). However, the effect of changes in ice sheet cover on planetary albedo is substantially smaller compared to surface albedo. For planetary albedo, the simulated glacial-interglacial amplitude reaches only ~0.2, resulting in a smaller radiative forcing. Thus, using surface albedo to calculate the ice sheet effect on shortwave forcing results in an erroneously large forcing estimate. (ii) as a result of planetary albedo changes caused by sea level–induced inundation/exposure of non–ice-covered continental shelves (10)where land mask is initially calculated using high-resolution bathymetric data (etopo20) and then interpolated onto the T21 LOVECLIM grid, allowing for fractional land points. It is important to take the planetary albedo here and not the surface albedo changes. Note that the LOVECLIM simulation itself does not use time-varying bathymetry so that the applied sea-level forcing is not explicitly part of the LOVECLIM simulation. The typical LGM-to-preindustrial difference for F SW,SL (t) is −0.24 W/m2. (iii) as a result of vegetation changes in non–ice-covered and non–snow-covered regions outside the continental shelves (11) The snow mask has values of 1 if the annual mean snow depth is greater than 0.1 m, and 0 otherwise. “Land mask(t = 0)” represents the PI land-sea distribution obtained from the high-resolution bathymetry data set and then interpolated onto the T21 LOVECLIM grid. The typical LGM-to-PI difference for F SW,VG (t) is −1 W/m2. Mid-Holocene (10 ka) vegetation forcing values go up to 0.5 W/m2. This forcing is used to calculate the Charney sensitivity, but it is not included as a forcing in the comparison with the CMIP5 RCP8.5 scenario simulations because, in these simulations, vegetation is a feedback rather than a forcing. (iv) as a result of orbital forcing (12) Using the PI planetary albedo of LOVECLIM, we obtained a typical range of −0.2 to 0.5 W/m2 and fluctuations on an obliquity time scale, as a result of the hemispheric difference in PI planetary albedo. Incorrectly using only incoming shortwave radiation without consideration of the planetary albedo would lead to eccentricity scale variability. The total forcing F(t), expressed as the sum of the greenhouse and shortwave forcings (fig. S7), attains minimum glacial values of −6.5 W/m2 (assuming that vegetation changes are a feedback) and −7.6 W/m2 (assuming that vegetation changes are a forcing), relatively to the present, which is similar to the values derived from previous studies (46, 47) and about 1 W/m2 smaller than the estimates from previous reports (1, 2). With regard to providing an uncertainty estimate for our radiative forcing, we adopted the method and numbers presented by Köhler et al. (1). Here, a systematic 1σ error of ~12% of the maximum amplitude was used. Applying this estimate to our maximum amplitude of the radiative forcing results in uncertainty estimates of ±0.78 and ±0.91 W/m2. To calculate the Charney climate sensitivity [feedbacks include fast processes and forcings include slow Earth system processes, such as greenhouse gas changes, ice sheet changes, aerosol changes, and vegetation changes, or S GHG,LI,AE,VG , by Rohling et al. (27)], we treated sea ice and snow albedo changes as feedbacks rather than as forcings. To compare the paleo-based estimates of climate forcing and response with the CMIP5 model simulation, which simulate vegetation changes, we instead used a different approach: Changes in greenhouse gas concentrations, ice sheet albedo, and aerosol concentrations were treated as forcings, whereas vegetation changes were treated as feedbacks. Following the nomenclature by Rohling et al. (27), we determined S GHG,LI,AE . Our LOVECLIM simulation uses climatologically prescribed clouds, which means that the model does not have explicit cloud feedbacks. Given the large uncertainty of cloud feedbacks for LGM time slice experiments (48), this approach may be justified.

Specific equilibrium climate sensitivity The specific equilibrium climate sensitivity is defined as the equilibrated change in global mean SAT for a given change in radiative forcing. However, when calculating S for specific periods, dating uncertainties can result in large biases of S. To choose an appropriate fit for S, we first tested for the state dependence of our radiative forcing–SAT relationship. We calculated linear regressions between radiative forcing anomalies and global mean SAT anomalies for climate states colder and warmer than the long-term average, respectively. Using the mean value of the global mean SAT anomaly to distinguish between these states, we found 409 (375) data points for colder-than-average (warmer-than-average) periods. The linear regression slope for the colder-than-average period amounts to b cold = 0.53 ± 0.02 K W−1 m2. Warmer-than-average phases are characterized by a slope of b warm = 0.93 ± 0.04 K W−1 m2. To test our hypothesis that the two slopes are significantly different, we calculated a z score (49) based on the slope difference and the SD of the regression slopes (σ) (13) On the basis of this z score, we can conclude with very high confidence (>99.99%) that the difference between the two regression coefficients is statistically significant, pointing to a state dependence of the radiative forcing–SAT relationship with larger values of S during warmer-than-average phases. Accounting for this nonlinearity, we then applied a simple quadratic fit to the data. The advantage of a polynomial fit over the bilinear decomposition above is found in the fact that this fit takes into account all data points at the same time and warrants differentiability of the radiative forcing–SAT relationship in the entire domain of forcing values. Furthermore, it will allow us to calculate S for climate states that are separated from the average by at least 1 SD of the global mean SAT anomaly. The latter is normally prohibited by the scarcity of data and the large scatter in the reconstructions of radiative forcing and global mean SAT. The polynomial fit is shown in Fig. 3. The slope of the fit is subject to substantial change, steadily increasing for warmer background climate. Interpreting the (local) slope of the fit in terms of a specific equilibrium climate sensitivity is not straightforward. We referred to the detailed discussion by Köhler et al. (50). Here, we are mainly interested in deriving an average of S that is representative of warm climates and can be used for a global warming projection for the 21st century. Thus, we can use a slope average over a larger domain, avoiding uncertainties in the method of calculation (50) while still allowing for a state dependence. To distinguish cold and warm climates, we then used the SD of our SAT reconstruction. Warm (cold) phases are required to exhibit a global mean SAT above (below) 1 SD. Note that, according to our temperature reconstruction, the PI state of the current interglacial exhibits a temperature of almost 2 SDs above the average of global mean SAT. Thus, our constraint is justified to derive a value of S that is representative of the PI climate state. The different phases are indicated by the dashed horizontal lines in Fig. 3. In the second step, we averaged the slope of the quadratic fit for cold and warm phases, following Eq. 11 of Köhler et al. (50). Taking into account the systematic errors in the SAT reconstruction and the estimate of radiative forcing described above, the specific equilibrium climate sensitivity for cold climate (Scold) amounts to 0.41 to 0.55 K W−1 m2 (likely range, mean: 0.48 K W−1 m2). For warm climates, Swarm attains values of 1.16 to 1.47 K W−1 m2 (likely range, mean: 1.32 K W−1 m2). To calculate the equilibrium global mean SAT response to a change in atmospheric CO 2 , we used a radiative forcing anomaly of 3.7 W/m2 per CO 2 doubling (51). The resulting SAT response to a CO 2 doubling amounts to 1.78 K (likely range, 1.52 to 2.04 K) per CO 2 doubling for cold phases and 4.88 K (likely range, 4.29 to 5.44 K) per CO 2 doubling for warm phases.

Transient climate response until year 2100 One of the major goals of our study was to apply the derived Swarm to the radiative forcing anomalies estimated to occur as a consequence of human-induced CO 2 emissions. However, the anthropogenic warming of the climate system until the end of the 21st century is a transient process because a time period of <250 years is not sufficient to reach a new thermal equilibrium. In particular, ocean heat uptake and storage at deeper water layers slow down the SAT response to the anthropogenic perturbation of the radiative forcing (28). The effect of the ocean’s thermal inertia (N) can be approximated as N = κΔT, where ΔT describes the globally averaged SAT change and κ is the ocean heat uptake efficiency (28). For a given radiative forcing perturbation (F) and a given S(1/α), the transient global mean SAT response (ΔT) can then be estimated as (14) Here, we chose a multimodel mean estimate of the ocean’s thermal inertia of κ = 0.6 ± 0.2 W m−2 K−1 (28). Applying the mean value of κ = 0.6 W m−2 K−1 to our mean paleodata-based Swarm of 1.32 K W−1 m2 results in a transient climate response parameter (TCRP) of ~0.74 K W−1 m2 (Fig. 3). This amounts to about 56% of the value of Swarm and is in good agreement with a recent estimate for the transient climate response (27). The uncertainty in the ocean’s thermal inertia, together with the uncertainty in Swarm, results in a likely TCRP range of 0.60 to 0.93 K W−1 m2. This paleodata-based TCRP is now applied to the RCP8.5 forcing scenario until year 2100 CE, following the equation above for deriving an estimated global mean SAT response. The anthropogenic forcing results in a global mean SAT anomaly of 5.86 K by year 2100 with respect to PI values. The uncertainties in S and the ocean’s heat uptake efficiency as discussed above result in a likely range of 4.78 to 7.36 K for the global mean SAT anomaly. Comparing our paleo-based estimate of future warming to the multimodel ensemble mean projections of the CMIP5 (52), we found that our projection results in a slightly higher global mean SAT anomaly. The current Intergovernmental Panel on Climate Change (IPCC)/CMIP5 projection under the RCP8.5 scenario results in a global mean SAT increase of 4.84 K for the year 2100 (with respect to PI values). The corresponding multimodel ensemble values range from 3.42 to 6.40 K.