This post is related to the substantive results of the new Marvel et al (2015) study. There is a separate post on the media/blog response.

The recent paper by Kate Marvel and others (including me) in Nature Climate Change looks at the different forcings and their climate responses over the historical period in more detail than any previous modeling study. The point of the paper was to apply those results to improve calculations of climate sensitivity from the historical record and see if they can be reconciled with other estimates. But there are some broader issues as well – how scientific anomalies are dealt with and how simulation can be used to improve inferences about the real world. It also shines a spotlight on a particular feature of the IPCC process…

One of the most intriguing differences between IPCC AR5 (section 10.8) and previous reports was the bottom-line conclusion on equilibrium climate sensitivity (ECS). Compared to AR4, the authors moved the lower limit of the likely range from 2ºC to 1.5ºC and, instead of suggesting a ‘best estimate’ of ~3ºC, felt they could not give any best estimate at all, leaving an impression of a wide (perhaps uniform) distribution of likelihoods from 1.5 to 4.5ºC. (NB. If you want good background on climate sensitivity, David Biello’s article at Scientific American is useful, or read our many previous posts on the topic.)

The reason for this change was a series of new papers (particularly Otto et al, 2013 and Aldrin et al, 2012) which focused on sensitivity constraints from the historical period (roughly 1850 to the present). For a long time, this method had such large uncertainties that the resulting constraints were too broad to be of much use. Two things have changed in recent years – first, the temperature changes over the historical period are now more persistent, and so the trend in relation to the year-to-year variability has become more significant (this is still true even if you think there has been a ‘hiatus’). Secondly, recent papers and the AR5 assessment have made a case that the uncertainty in net aerosol forcing (a cooling) can be reduced from previous estimates. An increase in signal combined with a decrease in uncertainties should be expected to lead to sharper constraints – and indeed that is the case.

However, these papers make a number of simplifying assumptions. Most often, they approximate the energy balance of the Earth as a simple one-dimensional linear equation connecting forcing and global mean temperature response. This implies that: a) the approach to equilibrium in forcing/temperature space is linear – that sensitivity doesn’t change in time or in response to patterns of change, and b) that all forcings are, in a basic sense, equivalent. Given those assumptions, comparing the forcing change over a long-enough multi-decadal period with the temperature response gives an estimate of the transient climate response (TCR). If an estimate of the ocean heat content change (a measure of the unrealised radiative imbalance) is also incorporated, the ECS can be estimated too.
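
The linear energy balance described above can be written down in a few lines. This is a minimal sketch, not the code used in any of the papers: the function name is hypothetical, and the input values (temperature change, forcing change, ocean heat uptake and the forcing for doubled CO2) are illustrative round numbers assumed for the example rather than taken from this post.

```python
def energy_balance_sensitivity(delta_t, delta_f, delta_q, f_2x):
    """Return (TCR, ECS) in degrees C from a linear global energy balance.

    delta_t: change in global mean temperature (degrees C)
    delta_f: change in radiative forcing (W/m^2)
    delta_q: change in ocean heat uptake, i.e. the unrealised imbalance (W/m^2)
    f_2x:    forcing from a doubling of CO2 (W/m^2)
    """
    if delta_f <= delta_q:
        raise ValueError("forcing change must exceed heat uptake")
    # Transient response: scale the observed warming per unit forcing to 2xCO2.
    tcr = f_2x * delta_t / delta_f
    # Equilibrium response: remove the part of the forcing still going into the ocean.
    ecs = f_2x * delta_t / (delta_f - delta_q)
    return tcr, ecs


# Illustrative inputs only (assumed for this sketch, not quoted from the post).
tcr, ecs = energy_balance_sensitivity(delta_t=0.75, delta_f=1.95,
                                      delta_q=0.65, f_2x=3.44)
print(f"TCR ~ {tcr:.2f} C, ECS ~ {ecs:.2f} C")
```

Note that both assumptions flagged in the text are baked in: a single constant ratio links forcing to temperature, and `delta_f` is one number that treats every forcing agent identically.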

[A quick note on terminology: All constraints have to be based on observations of some sort (historical trends, specific processes, paleoclimate etc.) and all constraints involve models of varying degrees of complexity to connect the observation to the sensitivity metric. People who describe only the constraints based on historical changes as ‘observational’, while everything else is supposedly ‘model-based’, are just playing rhetorical games.]

Rather than simply assume that one subset of constraints is superior, it is important to investigate why there might be a systematic difference between methodologies, and the coupled GCMs are excellent tools for that. It’s important to be clear how the GCMs are being used here – explicitly, they are being used as ‘analogs’ for the real system. It is a setup where we can calculate everything and know how all the diagnostics relate to each other across multiple scenarios. Thus – in this system – we can assess how well simple assumptions for the energy balance approach work out in a more complex system. The claim is not being made that the GCM is exactly a match to the real world, but rather that if a simplifying assumption made about the real world doesn’t hold up in a GCM, it probably won’t hold up in the real world either.

Results in a perfect model world

Back to the paper. Marvel et al use a series of simulations that reran the historical period, but with only one forcing at a time. For instance, simulations were run that only used the changes in volcanic forcing, or in land use or in tropospheric aerosols. These counterfactual simulations give a wide variety of possible ‘histories’ where we can apply the methodology of Otto et al (for instance) to see whether it gives the ‘right’ answer. Remember, since this is a model, we know exactly what the right answer is for the TCR and ECS.

The interesting thing is that the answers for the different ‘histories’ aren’t right. Indeed, for each of the forcings, there is a systematic error in the answer they give. You can see this in the picture below. If all the answers were correct, the colored lines (each of which corresponds to a specific forcing) would all have the same slope. They obviously don’t.





Figure 1: The relationship between forcing and global mean temperature response for seven historical sets of experiments. Each color corresponds to a different experiment (AA is aerosols only, LU is land use only, GHG is greenhouse gases only, Oz is ozone forcing only, Sl is solar only, Vl is volcanic only, and Historical is all forcings together). Each dot is for an ensemble mean decadal-average and the filled vs open circles show a technical distinction that depends on how the forcings are specifically defined. The slope of each line is the estimated transient climate response (TCR).
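
To make the ‘same slope’ test concrete, here is a small sketch of how a slope could be estimated for each single-forcing experiment and converted to an implied TCR. Everything in it is an assumption for illustration: the decadal values are synthetic (not model output), the fit is forced through the origin, and the slope is in ºC per W/m² so it is multiplied by an assumed doubled-CO2 forcing.

```python
def slope_through_origin(forcing, temperature):
    """Least-squares slope of temperature against forcing, with zero intercept."""
    numerator = sum(f * t for f, t in zip(forcing, temperature))
    denominator = sum(f * f for f in forcing)
    return numerator / denominator


F_2X = 3.44  # assumed forcing for doubled CO2 (W/m^2)

# Synthetic decadal means: (forcing in W/m^2, temperature change in degrees C).
# These numbers are made up to show the method, not taken from any model.
experiments = {
    "GHG-only": ([0.5, 1.0, 1.5, 2.0], [0.25, 0.50, 0.75, 1.00]),
    "aerosol-only": ([-0.2, -0.4, -0.6, -0.8], [-0.14, -0.28, -0.42, -0.56]),
}

implied_tcr = {
    name: F_2X * slope_through_origin(forcing, temperature)
    for name, (forcing, temperature) in experiments.items()
}

for name, value in implied_tcr.items():
    print(f"{name}: implied TCR ~ {value:.2f} C")
```

If every forcing were equivalent, the two implied values would agree. In this made-up example they differ, which is the kind of mismatch the figure shows for the real simulations.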



This result is the same whether you look at the TCR or the ECS (using the ocean heat content changes in the same simulations), or whether you use instantaneous forcings or effective radiative forcings. In each case, estimates of the (known) TCR and ECS using the Otto et al methodology significantly underestimate the actual values from the model.

The reasons seem to be related to the spatial distribution of the forcings relative to the forcing from CO2, which is relatively homogeneous globally. Aerosols and ozone are mostly important in the northern hemisphere, land use forcing is confined to land, etc. That affects how quickly the land and ocean temperatures respond and makes a difference to the projection of the forcing onto the ocean, and hence to the ocean heat content change.

Back to the real world

We can go further. If we characterise the error in the slopes in Fig. 1 by an ‘efficacy’ for each forcing (the ratio of the slope that it ought to have to be accurate to the slope it actually has), we can take the forcing estimates used in Otto et al (and other papers), adjust them for this bias, and redo the calculations for the real world. Of course, the adjustment we are making has its own uncertainty (this is derived from a single model, after all) and that needs to be taken into account too.

Since the sensitivity estimates using the Otto et al method in the model world are biased low, using the estimated efficacies in the real world means that the sensitivities from the adjusted methodologies are going to be increased, and indeed that’s exactly what happens.
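
A rough sketch of why the adjustment raises the estimate follows. It assumes one particular convention: each forcing is multiplied by a weight before the forcings are summed, with CO2-like forcings given a weight of 1. Whether that matches the exact efficacy definition used in the paper is an assumption, and the forcing values, weights and temperature change are all hypothetical numbers chosen only to show the direction of the effect.

```python
def effective_forcing(forcings, weights):
    """Sum the forcings after scaling each by its weight (default weight 1)."""
    return sum(weights.get(name, 1.0) * value for name, value in forcings.items())


def tcr_estimate(delta_t, delta_f, f_2x=3.44):
    """Linear energy balance estimate of TCR (f_2x is an assumed value)."""
    return f_2x * delta_t / delta_f


# Hypothetical historical forcing changes in W/m^2 (not from any dataset).
forcings = {"ghg": 2.8, "aerosol": -0.9, "other": 0.05}
# Hypothetical weights: here the negative aerosol forcing counts for more
# per W/m^2 than greenhouse gas forcing does.
weights = {"ghg": 1.0, "aerosol": 1.5, "other": 1.0}

delta_t = 0.75  # hypothetical temperature change (degrees C)

unadjusted = tcr_estimate(delta_t, effective_forcing(forcings, {}))
adjusted = tcr_estimate(delta_t, effective_forcing(forcings, weights))
print(f"unadjusted TCR ~ {unadjusted:.2f} C, adjusted TCR ~ {adjusted:.2f} C")
```

With these inputs the weighted net forcing is smaller than the simple sum, so the same temperature change implies a larger sensitivity. A fuller treatment would also sample the weights from a distribution to carry their uncertainty through to the result.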





Figure 2. Joint probability density functions for the climate sensitivities (TCR and ECS) and how they shift when you take the efficacies into account. In each case (the cases differ in the estimated forcings used), the mean of the distribution shifts to significantly higher numbers (follow the arrows!).



How big an effect is this? Well, for Otto et al (2013), the estimated TCR without taking account of the efficacies is 1.3ºC (which is on the low side of other estimates), but with this effect accounted for, it is closer to 1.8ºC. The results are similar for the other studies we looked at and for the ECS values too (which go from 2ºC to 2.9ºC). The net effect is to bring these studies directly into line with constraints from other methodologies.
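
The numbers quoted above can be turned into relative changes with some back-of-envelope arithmetic. This is a sketch, not a calculation from the paper: it assumes the linear energy balance picture, in which the estimate scales inversely with the net forcing term for a fixed temperature change, so the ratio of the old to the new estimate indicates roughly how much that term has to shrink. The function names are hypothetical.

```python
def relative_increase(before, after):
    """Fractional increase going from the unadjusted to the adjusted estimate."""
    return (after - before) / before


def implied_denominator_ratio(before, after):
    """Ratio by which the forcing term must shrink if the estimate scales as 1/forcing."""
    return before / after


# Central estimates quoted in the text (degrees C): (unadjusted, adjusted).
estimates = {"TCR": (1.3, 1.8), "ECS": (2.0, 2.9)}

summary = {
    name: (relative_increase(before, after), implied_denominator_ratio(before, after))
    for name, (before, after) in estimates.items()
}

for name, (increase, ratio) in summary.items():
    print(f"{name}: +{100 * increase:.0f}% estimate, forcing term x{ratio:.2f}")
```

On these figures the central estimates rise by roughly 40 to 45 percent, which under the stated assumption corresponds to the effective forcing term being about 70 percent of its unadjusted value.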

And what about IPCC?

One of the key principles of the IPCC process is that it can only assess the published literature. This leads to a bit of an odd scramble at deadline time (which is mostly pointless IMO, but YMMV). But of course it also means IPCC can’t really adjudicate on ‘emerging’ issues that haven’t really been resolved in the literature. For things that are just beginning to be talked about and where conflicting results exist, they are left with only one option, which is to simply describe the range of results with some of the caveats. This is (in my view) what occurred in AR5 with respect to the transient constraints on sensitivity and also perhaps on discussions of the ‘hiatus’. Both of these issues have been discussed in the literature since AR5, and I doubt very much that the relevant texts would be anything like as ambiguous if they were to be rewritten now.

In particular, with the publication of Marvel et al (2015) (and also Shindell (2014)), the reason for the outlier results in Otto et al and similar papers has become much clearer. And once those reasons are taken into account, those results no longer look like such outliers – reaffirming the previous consensus and reinforcing the idea that there really is a best estimate for the sensitivity around 3ºC.

One final point. This is not an example of ‘groupthink’ suppressing legitimate debate in estimating sensitivity – rather it’s the result of deeper explorations to examine why a certain type of estimate was an outlier, which had a conclusion that ended up reinforcing the existing, wider, body of knowledge. Surprising as it may seem to some people, this is actually the normal course of affairs in science. Mainstream conclusions are hard to shift because there is a lot of prior work that supports them, and single studies (or as here, a group of similar studies) have their work cut out to change things on their own. As with supposedly faster-than-light neutrinos, most anomalies eventually get resolved. But not necessarily on the IPCC timetable.

References