The latest incarnations of the CRUTEM land surface temperatures and the HadCRUT global temperatures are out this week. This is the 4th version of these products, which have undergone a number of significant changes over the years, so this is a good opportunity to discuss how and why data products evolve and what that means in the bigger scheme of things.

The paper describing the new CRUTEM4 product is in press at JGR (Jones et al, 2012), and makes a number of important observations. The first concerns the evolution of the CRU temperature data set, from CRUTEM1 back in the mid 1980s, which used a limited selection of station data and ‘in-house’ homogenization, to CRUTEM2 around 2003, CRUTEM3 in 2006, and now CRUTEM4, which draws on a wider range of data sources and relies much more on homogenization efforts from the National Met Services themselves.

Second, the paper goes into some detail about access to data, the reasons for the above changes, the history of homogenization efforts, and the current status of those efforts. Much of this is excellent background information that deserves to be more widely known. For instance, the timing of the CLIMAT reports (monthly averages, available almost immediately), MCDW (Monthly Climate Data for the World, available after a few months), and the World Weather Records (WWR, issued once a decade, with the last one covering the 1990s and the next one due soon) has a much larger influence on spatial coverage and station density than would be ideal.

The third point is how this product differs from similar efforts at GISTEMP, NCDC, and (though not mentioned) the Berkeley Earth project. The basis for GISTEMP and NCDC is GHCN for the majority of their records, so there is a lot of overlap (95% or so), but there are big differences in how they deal with grid boxes with missing data. GISTEMP interpolates anomalies from nearby grid points, Berkeley uses kriging, while NCDC and CRUTEM estimate global means using only grid boxes with data. Since many missing data points in CRUTEM3 were in the high latitudes, which have been warming substantially faster than the global mean, this was a source of a low bias in CRUTEM3 (and HadCRUT3) when these data products were used to estimate global mean temperature trends. The increase in source data in CRUTEM4 goes some way to remove this bias, but it will likely still remain (to some extent) in HadCRUT4, because the sea-ice-covered areas of the Arctic are still not included. Another improvement is in how the error bars are estimated – these arise from data sparsity, autocorrelation, structural uncertainty, and assumptions in the synthesis.
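To make the coverage bias concrete, here is a minimal sketch in Python. It is not the CRUTEM code: the anomaly values are made up, the 5-degree grid and the 65N cut-off are assumptions chosen for illustration, and `area_weighted_mean` is a hypothetical helper. It simply compares an area-weighted mean over a complete grid with the mean over the same grid when the fast-warming Arctic boxes are missing and skipped.

```python
import numpy as np

def area_weighted_mean(anom, lats):
    """Mean of a (lat, lon) anomaly grid, weighted by cos(latitude).

    Grid boxes holding NaN are skipped entirely (no interpolation).
    """
    weights = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(anom)
    valid = ~np.isnan(anom)
    return float(np.sum(anom[valid] * weights[valid]) / np.sum(weights[valid]))

# Hypothetical 5-degree grid, box centres from 87.5S to 87.5N
lats = np.arange(-87.5, 90.0, 5.0)
lons = np.arange(-177.5, 180.0, 5.0)

# Made-up anomaly field: 0.5 C everywhere, 2.0 C poleward of 65N
full = np.full((lats.size, lons.size), 0.5)
full[lats > 65, :] = 2.0

# The same field with the Arctic boxes missing
sparse = full.copy()
sparse[lats > 65, :] = np.nan

true_mean = area_weighted_mean(full, lats)       # all boxes observed
sampled_mean = area_weighted_mean(sparse, lats)  # Arctic boxes skipped

print(f"complete coverage: {true_mean:.3f} C")
print(f"Arctic missing:    {sampled_mean:.3f} C")
```

Skipping the empty boxes is equivalent to assuming they warmed at the average rate of the observed boxes, so the sampled mean comes out lower than the complete-coverage mean whenever the missing region is warmer than average.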

The CRUTEM4 data is available here and here, along with links to the full underlying raw data (minus Poland) and the code used to process that data (this is not quite finished as of when this post went live). This is a big step forward (but like the release of the code for GISTEMP a few years ago, it is unlikely to satisfy the critics).

So what does the CRUTEM4 data look like?

Overall, changes are small (see figure to the right, showing the trend (°C/60 years) for each of CRUTEM3 (top) and CRUTEM4 (middle), and the difference in their trends (bottom)). There is no change to the big picture of global warming in recent decades, nor in its regional expression. Where there are noticeable changes, they are in the coverage of high latitude regions – particularly Canada and Russia, where additional data sources have been used to augment relatively sparse coverage. Given the extreme warmth of these regions (and the Arctic more generally) in recent years, combined with the CRUTEM procedure of only averaging grid boxes where there is data (i.e. no interpolation or extrapolation), this extra coverage makes a difference in the trends.
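For readers who want to see what a trend in °C/60 years means for a single grid box, the following is a rough sketch, assuming an ordinary least-squares fit to annual anomalies (the papers should be consulted for the exact method used in the figure). The function name, the 1951-2010 period and the synthetic series are all assumptions made for illustration.

```python
import numpy as np

def trend_per_60_years(years, anomalies):
    """Least-squares linear trend of an anomaly series, in degrees C per 60 years.

    Years with missing data (NaN) are dropped before fitting.
    """
    years = np.asarray(years, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    ok = ~np.isnan(anomalies)
    # Centre the years so the fit is numerically well behaved
    x = years[ok] - years[ok].mean()
    slope, _intercept = np.polyfit(x, anomalies[ok], 1)
    return float(slope * 60.0)

# Synthetic example: a steady 0.01 C/year warming with a few missing years
years = np.arange(1951, 2011)
series = 0.01 * (years - 1951)
series[[3, 17, 40]] = np.nan

trend = trend_per_60_years(years, series)
print(f"{trend:.2f} C per 60 years")
```

Applying the same function to every grid box of two gridded products and subtracting the results would give maps analogous to the three panels described above.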

There will of course be an impact on the combined ocean and land temperature record, HadCRUT4. This incorporates (and brings up to date) the HadSST3 product that we discussed last year. The paper describing HadCRUT4 is also in press (Morice et al, 2012).

As expected, the changes (a little from both data sets) lead to a minor rearrangement in the ordering of ‘hottest years’. This is not climatologically very significant – the difference between 1998 and 2010 is in the hundredths of a degree, and most of the attribution work on recent climate changes is looking at longer term trends, not year to year variability. However, there is now consistency across the data sets that 2005 and 2010 likely topped 1998 as the warmest years in the instrumental record. Note that neither CRUTEM4 nor HadSST3 is yet being updated in real time – both only go to Dec 2010 – though they will be extended over the next few months.

There are a number of issues that might need to be looked at again given these revisions. Detection and attribution efforts will need to be updated using CRUTEM4/HadCRUT4, though the changes are small enough that any big revisions are extremely unlikely. Paleo-reconstructions that used CRUTEM3 and HadCRUT3 as a target might be affected too. However, the implications will be more related to the mid-century and 19th C revisions than anything in the last decade.

We can make a few predictions though:

We can look forward to any number of contrarians making before and after plots of the data and insinuating that something underhand is going on. Most of the time, they will never link to the papers that explain the differences. (This is an easy call because they do the same thing with GISTEMP all the time). (Yup).

Since the “no warming since 1998/1995/2002” mantra is so seductive to people who like to focus on noise rather than signal, the minor adjustments in the last decade will attract the most criticism. Since these fixes really just bring the CRU product into line with everyone else, including the reanalyses, and are completely unsurprising, we can expect many accusations of groupthink, deliberate fraud and ‘manipulation’. Because, why else would scientists agree with each other? ;-)

The GWPF will not update their logo.

Joking aside, there are some important points to be made here. First and foremost is the realisation that data synthesis is a continuous process. Single measurements are generally a one-time deal. Something is measured, and the measurement is recorded. However, comparing multiple measurements requires more work – were the measuring devices calibrated to the same standard? Were there biases in the devices? Did the result get recorded correctly? Over what time and space scales were the measurements representative? These questions are continually being revisited – as new data come in, as old data are digitized, as new issues are explored, and as old issues are reconsidered. Thus for any data synthesis – whether it is for the global mean temperature anomaly, ocean heat content or a paleo-reconstruction – revisions over time are both inevitable and necessary. It is worth pointing out that adjustments are of both signs – the corrections in the SST for bucket issues in the 1940s reduced trends, as do corrections for urban heat islands, while the correction for time of observation bias in the US increased trends, as does adding more data from Arctic regions.

Archives of data syntheses, however, are only really starting to be set up to reflect this dynamic character – more often they are built as if synthesis just happens once and never needs to be looked at again. There is still much more work to be done here.

But even while scientists work on ironing out the details in these products, it’s worth pointing out what is robust. All data sets show significant warming over the 20th Century – regardless of whether the raw data comes from the ocean, the land, balloons, ice melt or phenology, and regardless of whether the data synthesis is performed by scientists in Japan, Britain, the US, individual bloggers or ‘sceptics’.

References