2.1 Traditional Energy‐Balance Framework

Per equation 2, ECS requires estimates of F 2xCO2 and λ. We use estimates of F 2xCO2 from fixed sea surface temperature and sea ice experiments from 10 global climate models that submitted output to the Precipitation Driver Response Model Intercomparison Project (Myhre, Forster, et al., 2017). They estimate F 2xCO2 to be normally distributed with a mean of 3.69 W/m2 and a standard deviation of 0.13 W/m2.

We estimate λ from observations of R and T S . Observations of R come from the Clouds and the Earth's Radiant Energy System (CERES) Energy Balanced and Filled product (ed. 4, Loeb et al., 2018) and cover the period March 2000 to July 2017. Estimates of T S come from the European Centre for Medium Range Weather Forecasts Interim Re‐Analysis (Dee et al., 2011). In these calculations, monthly and globally averaged anomalies are used, where anomalies are deviations from the mean annual cycle of the data.

Given these data, we calculate λ in two ways, both based on equation 1. First, we use estimates of effective radiative forcing F over the CERES period and calculate λ as the slope of the regression of R‐F versus T S . We use standard regressions in this paper—an ordinary least squares fit, with R‐F as the dependent variable and T S as the independent variable (Murphy et al., 2009). The forcing is based on the IPCC AR5 forcing time series, revised and extended in the following ways. Forcing from CO 2 , N 2 O, and CH 4 has been replaced by calculating new forcing time series using concentrations from the National Oceanic and Atmospheric Administration/Earth System Research Laboratory (www.esrl.noaa.gov/gmd/ccgg/trends/) with updated formulae to convert mixing ratios to forcing (Etminan et al., 2016). Other forcing components match IPCC AR5 through 2011 and have been extended to July 2017. For aerosols and ozone, the multimodel mean forcing from Myhre, Aas, et al. (2017) is used. For volcanoes, the forcing from Andersson et al. (2015) is taken from their Figure 4, beginning in 2008. Solar forcing after 2011 is derived from Solar Radiation and Climate Experiment data (Lean et al., 2005). Other minor forcing terms are estimated using the relative change in forcing from 2011 to 2017 from the RCP4.5 scenario (Meinshausen et al., 2011).

Uncertainty in forcing is estimated using radiative forcing uncertainties from 2015. We take the 5–95% range for each of the 14 different forcing terms in 2015 and turn this into a fractional range by dividing by the median 1750–2015 forcing estimates. This fractional uncertainty is Monte Carlo sampled for each forcing term independently. These fractions are then multiplied by the relevant forcing time series and summed to create 10,000 different realizations of the time series of total radiative forcing. The average forcing time series during the CERES period is plotted in Figure S1 in the supporting information.

We then estimate a distribution of λ using Monte Carlo sampling. We start by subtracting the 10,000 forcing time series from the observed R time series to generate 10,000 estimates of R‐F. Then we repeat the following process 500,000 times: (1) randomly select an R‐F time series, (2) randomly subsample it and the observed T S time series, with replacement, and (3) regress the sampled R‐F and T S data sets to obtain an estimate of λ. The number of samples taken is set by the number of independent pieces of information in the time series, as estimated by equation (6) of Santer et al. (2000). The original data set contains 209 months; we estimate that there are ~100–120 independent samples due to autocorrelation in the time series.

In the second approach, we assume forcing changes linearly over the CERES time period and account for it by detrending R and T S time series. We do this by subtracting off the linear trend of each time series estimated using a least squares regression. We then assume that R detrended = λ T S,detrended , and we calculate λ by regression. The distribution of λ is estimated by randomly sampling 500,000 times (with replacement) the detrended R and T S time series, with each resampled data set providing one estimate λ. As with the previous estimate, we account for autocorrelation by reducing the number of samples taken, using equation (6) of Santer et al. (2000). Plots of R, T S , and F can be found in section S1 of the supporting information.

Distributions of λ for the two approaches are both quite wide (Figure 1a), with values of −0.51 ± 0.64 and −0.81 ± 0.65 W/m2/K for the R‐F and detrended calculations, respectively (uncertainties are 5–95% confidence intervals). The two estimates of λ reflect different ways of handing forcing, and they show that different approaches yield similar distributions for λ. These distributions are similar to those estimated as the uncertainty of ordinary least squares regressions of R‐F versus T S (−0.52 ± 0.56 W/m2/K) and detrended R versus detrended T S (−0.82 ± 0.64 W/m2/K). Our sign convention is that fluxes are downward positive, so a negative λ means that a warmer planet radiates more energy to space, a necessary requirement for a stable climate.

Figure 1 Open in figure viewer PowerPoint λ iv,obs and Θ iv,obs (W/m2/K); (b) scatterplot of R‐F (W/m2) versus T S (K), the dashed line is a least squares fit; (c) same as Figure T A (K). (a) Distribution ofand Θ(W/m/K); (b) scatterplot of(W/m) versus(K), the dashed line is a least squares fit; (c) same as Figure 1 b, but the regression is against(K).

The extreme width of the λ distributions is a consequence of scatter in the relationship between R‐F and T S (Figure 1b, also shown by Spencer & Braswell, 2010; Xie et al., 2016), which is due to both weak coupling between the surface and ΔR (Dessler et al., 2018) and weather noise. This means that our observational estimate of λ is quite uncertain, with almost all of the uncertainty coming from month‐to‐month variability in the R time series. Switching to another temperature data set, such as Modern‐Era Retrospective Analysis for Research and Applications, Version 2 (Gelaro et al., 2017), or using only the median forcing, yields very similar distributions. Systematic errors in the CERES time series are small; the data are stable to be better than 0.5 W/m2/decade (the stability of the shortwave is 0.3 W/m2/decade [Loeb et al., 2007], and the stability of longwave is 0.15 W/m2/decade [Susskind et al., 2012]). Because we are regressing R versus temperature, spurious trends in the data have little impact on our analysis (Dessler, 2010).

The distributions of λ plotted in Figure 1a are derived mainly from the response to interannual variability (Figure S3), so we will refer to them hereafter as λ iv . The λ in equation 2, however, is the climate system's response to forcing from doubled CO 2 (hereafter λ 2xCO2 ), so we cannot simply plug λ iv into equation 2 to derive ECS. In fact, this disconnect between what we can measure (λ iv ) and what is required to calculate ECS (λ 2xCO2 ) is one reason scientists have largely avoided using interannual variability to infer ECS.

(3) λ iv,obs is the observed value (from Figure λ iv /λ 2xCO2 is a transfer function that converts λ iv,obs into the required value λ 2xCO2 . We estimate this transfer function using models that submitted required output to the fifth phase of the Coupled Model Intercomparison Project (CMIP5) (Taylor et al., 2012 λ iv is derived from the models' control runs, in which climate variations arise naturally from internal variability. To facilitate comparison with the observations, as well as avoid any issues with long‐term drift, we first break each control run into 16‐year segments and calculate monthly anomalies of ΔR and ΔT S during each segment, where anomalies are deviations from the average annual cycle of each 16‐year period. We expect these model segments to contain the same types of climate variations that are in the observations (e.g., weather noise and ENSO). Then, we calculate λ iv for each segment as the slope of the regression of ΔR versus ΔT S for that segment. Finally, we average the segments' values of λ iv to come up with a single value of λ iv for each model (Table We therefore modify equation 2 to account for this:whereis the observed value (from Figure 1 a), mainly the response to interannual variability, while the ratiois a transfer function that convertsinto the required value. We estimate this transfer function using models that submitted required output to the fifth phase of the Coupled Model Intercomparison Project (CMIP5) (Taylor et al.,). The numeratoris derived from the models' control runs, in which climate variations arise naturally from internal variability. To facilitate comparison with the observations, as well as avoid any issues with long‐term drift, we first break each control run into 16‐year segments and calculate monthly anomalies of Δand Δduring each segment, where anomalies are deviations from the average annual cycle of each 16‐year period. We expect these model segments to contain the same types of climate variations that are in the observations (e.g., weather noise and ENSO). Then, we calculatefor each segment as the slope of the regression of Δversus Δfor that segment. Finally, we average the segments' values ofto come up with a single value offor each model (Table S1 ).

The CMIP5 archive does not include doubled CO 2 runs, but it does have abrupt 4xCO 2 runs from which we can estimate λ 4xCO2 . λ 4xCO2 is calculated from these runs using the Gregory et al. (2004) method: we regress all 150 years of annual average R versus annual average T S and take the resulting slope as an estimate of λ 4xCO2 , where R and T S are deviations from the preindustrial control run.

λ 2xCO2 ≈ λ 4xCO2 , then we can rewrite equation (4) If we assume that, then we can rewrite equation 3 as

Recent work suggests that λ 4xCO2 is less negative (i.e., implying a higher ECS) than λ 2xCO2 (Armour, 2017; Proistosescu & Huybers, 2017). On the other hand, we use all 150 years of the 4xCO 2 runs to estimate λ 4xCO2 , which tends to produce values that are too negative (Andrews et al., 2015; Armour, 2017; Rose & Rayborn, 2016; Rugenstein et al., 2016). These two errors tend to cancel, but how much of a bias is left—and in which direction—remains an uncertainty in this analysis. The CMIP5 ensemble's distribution of λ iv /λ 4xCO2 is plotted in Figure 2; it has an average of 0.81 and a standard deviation of 0.34.

Figure 2 Open in figure viewer PowerPoint Distribution of λ iv /λ 4xCO2 from 25 CMIP5 models; the black dashed line is the mean of the distribution. See methods for description of how the value is calculated in each model. CMIP5 = fifth phase of the Coupled Model Intercomparison Project.

We then use a Monte Carlo approach to estimate ECS. We produce 500,000 estimates of ECS by randomly sampling the distributions of F 2xCO2 , λ iv,obs (Figure 1a), and λ iv /λ 4xCO2 (Figure 2) and plugging them into equation 3; negative ECS values or values greater than 10 K are viewed as physically implausible and thrown out (sensitivity to the 10‐K threshold is shown in Table 1). We produce two ECS distributions—one using λ iv,obs from the R‐F calculation and one using λ iv,obs from the detrended calculation. The ECS distributions (Figure 3) have 17–83% confidence intervals (corresponding to the IPCC's likely range) of 2.5–7.0 K and 2.0–5.7 K for the R‐F and detrended calculations, respectively. The modes are 3.0 and 2.4 K, while the medians are 4.2 and 3.3 K.

Table 1. ECS Values From the λ Runs Run Mean Mode Median 5–95% 17–83% % < 2 % > 4.5 all‐Lambda‐1 4.63 2.98 4.22 1.7–8.8 2.5–7.0 6 32 all‐Lambda‐1‐f 4.43 2.85 3.99 1.6–8.7 2.3–6.8 8 30 all‐Lambda‐1‐f_20‐150 4.59 2.98 4.19 1.6–8.9 2.4–7.0 7 31 all‐Lambda‐1_8K 4.17 2.98 3.95 1.7–7.3 2.4–6.1 6 25 all‐Lambda‐1_12K 5.00 2.98 4.41 1.8–10.2 2.6–7.7 6 36 all‐Lambda‐2 3.78 2.44 3.29 1.4–8.0 2.0–5.7 15 26 all‐Lambda‐2_8K 3.52 2.44 3.19 1.4–6.8 2.0–5.2 15 21 all‐Lambda‐2_12K 3.97 2.44 3.35 1.4–8.8 2.0–6.0 15 28 good‐Lambda‐1 4.20 2.71 3.73 1.6–8.4 2.3–6.3 9 28 good‐Lambda‐2 3.66 2.31 3.19 1.4–7.7 1.9–5.4 16 24 good‐Lambda‐1‐f_20‐150 4.18 2.71 3.72 1.4–8.5 2.2–6.4 10 28 good‐Lambda‐1‐f 3.98 2.44 3.48 1.4–8.3 2.1–6.0 12 25

Figure 3 Open in figure viewer PowerPoint λ iv,obs from the R‐F regression, (b) calculated using λ iv,obs from the detrended regression. 17th %ile and 83rd %ile are 17th and 83rd percentiles, corresponding to the IPCC's likely range. ECS = equilibrium climate sensitivity; IPCC = Intergovernmental Panel on Climate Change. Distributions of ECS (K) using the traditional energy balance framework (equation 4 ). (a) Calculated usingfrom theregression, (b) calculated usingfrom the detrended regression.andare 17th and 83rd percentiles, corresponding to the IPCC'srange. ECS = equilibrium climate sensitivity; IPCC = Intergovernmental Panel on Climate Change.

Overall, our calculated ECS distributions overlap substantially with the IPCC's range, although our distributions are shifted to higher values: we see a ~30% chance that ECS exceeds 4.5 K, while the IPCC assigns a 17% chance. And we see less support for low values of ECS: the chance of an ECS below 2 K is 6–15%, while the IPCC assigns a 17% chance that is below 1.5 K.