Some ancient history

Fifteen to twenty years ago, Michael Mann and colleagues wrote a few papers claiming that current warming was unprecedented over the last 600 to 2000 years. Other climate scientists described Mann’s work variously as crap, pathetic, sloppy, and crap. These papers caught the interest of Stephen McIntyre, leading to the creation of his Climate Audit blog and to the publication of papers pointing out the flaws in these hockey-stick reconstructions. In particular, McIntyre and his co-author Ross McKitrick showed that the method used by Mann and colleagues shifted the data in such a way that any data sets showing an upward trend in the 20th century received a stronger weighting in the final reconstruction. With this method, generation of a hockey-stick shape in the temperature reconstruction was virtually guaranteed, which M&M demonstrated by feeding random numbers into the method.

As climate scientist Rob Wilson put it in an email,

The whole Macintyre [sic] issue got me thinking about over-fitting and the potential bias of screening against the target climate parameter… I first generated 1000 random time-series in Excel… The reconstructions clearly show a ‘hockey-stick’ trend. I guess this is precisely the phenomenon that Macintyre has been going on about.

But the climate science community admitted nothing in public. One climate scientist wrote one of the most revealing emails:

-How should we deal with flaws inside the climate community? I think, that “our” reaction on the errors found in Mike Mann’s work were not especially honest.

Ouch.

This is all ancient history, and the issue is discussed in detail in Andrew Montford’s book, The Hockey Stick Illusion.

Two new papers

So I felt a strange sense of déjà vu, or Groundhog Day, when I heard from the BBC that ‘new’ research had found that current warming was unparalleled in 2,000 years. The two papers are by the PAGES2k team, headed by Raphael Neukom. They are Consistent multidecadal variability in global temperature reconstructions and simulations over the Common Era and No evidence for globally coherent warm and cold periods over the preindustrial Common Era (both paywalled). They use data from the team’s 2017 paper, A global multiproxy database for temperature reconstructions of the Common Era.

The PAGES2k data has come in for a lot of criticism at Climate Audit. There are numerous problems, such as inconvenient data being deleted or used upside-down, or the use of ‘stripbark’ data, against the recommendation of an NAS panel.

The new papers are quite open about screening for ‘temperature-sensitive’ proxies. From the “Consistent” paper:

For the reconstructions presented in the main text, we use the subset of records selected on the basis of regional temperature screening and to account for false discovery rates (R-FDR subset). This screening reduces the total number of records from 692 to 257, but increases the GMST reconstruction skill for most methods and skill metrics.

(GMST is global mean surface temperature). That’s a fairly drastic reduction in the number of proxy records. Tucked away in Fig 17 of the Supplementary Information are graphs using the “full unscreened PAGES 2k proxy matrix”, which have a less sticky shape than those in the main paper.

But as is often the case in climate science, it’s worse than we thought. The so-called “unscreened” PAGES2k proxies were in fact already screened, with a substantial culling of tree-ring data! This is from the 2017 “Global multiproxy” paper:

So two rounds of proxy screening have been carried out. Furthermore, in one of the methods they use, the proxies are “weighted by their non-detrended correlation with the GMST reconstruction target over the calibration period”, a further technique that helps to ensure that a hockey-stick will be produced.
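To see why weighting by non-detrended correlation pushes in the same direction as screening, here is a minimal sketch (entirely hypothetical data and weighting scheme, not the PAGES2k code). Proxies that happen to drift upward during the calibration period get large weights, so they dominate the composite even though nothing is formally discarded:

```python
import numpy as np

rng = np.random.default_rng(0)
n_proxies, years, cal = 200, 2000, 100
target = np.linspace(0.0, 1.0, cal)  # rising "instrumental" target, last 100 years

# Pure red noise: random walks containing no temperature signal at all
proxies = np.cumsum(rng.normal(size=(n_proxies, years)), axis=1)
proxies -= proxies.mean(axis=1, keepdims=True)

# Weight each proxy by its non-detrended correlation with the target
# (for illustration, negatively correlated proxies get zero weight)
corr = np.array([np.corrcoef(p[-cal:], target)[0, 1] for p in proxies])
weights = np.clip(corr, 0.0, None)
composite = weights @ proxies / weights.sum()

# The composite bends upward in the calibration period anyway
blade = composite[-cal:].mean() - composite[:-cal].mean()
print(blade)
```

The inputs carry no signal, yet the weighted composite acquires an upward kink at the end, because the weighting systematically favours whichever random series happened to drift upward recently.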

Steve McIntyre is on the case; see this Twitter thread. The first tweet refers to the weighting issue, number 4 in the sequence mentions the “superscreening” point, and the last (at the time of writing), number 23, shows how drastic the screening out of North American tree-ring proxies is in the latest papers.

There is also some decline-hiding going on: in one of the Canadian datasets used, when the time series showed the ‘divergence problem’ (heading downwards when temperature goes upwards), the divergent parts of the series were just deleted. See this blog post by Shub with the relevant parts of the paper highlighted. Again, if you do this, when you combine all the data to get an overall picture, you will get a stronger hockey stick effect.
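The effect of deleting divergent segments is easy to quantify with made-up numbers (a hypothetical illustration, not the actual Canadian data): give every series some noise plus a shared recent rise, make half of them diverge downwards at the end, and compare the late-period average with and without the divergent data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, years = 50, 300
t = np.arange(years)

# Hypothetical proxies: noise plus a shared 20th-century rise...
proxies = rng.normal(0.0, 0.3, (n, years))
proxies[:, 200:] += np.linspace(0.0, 1.0, 100)
# ...but half of them 'diverge' (decline) over the last 50 years
diverging = np.arange(n) < n // 2
proxies[diverging, 250:] -= np.linspace(0.0, 1.5, 50)

full_mean = proxies[:, 250:].mean()
# "Hide the decline": delete the divergent segments before averaging
trimmed = np.where(diverging[:, None] & (t >= 250), np.nan, proxies)
trimmed_mean = np.nanmean(trimmed[:, 250:])
print(trimmed_mean - full_mean)  # positive: deletion steepens the blade
```

Dropping the inconvenient late segments roughly doubles the apparent late-period rise in this toy setup, which is the point: truncation is a form of selection, and selection sharpens the blade.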

The screening fallacy

The problems with proxy screening were very widely discussed a few years ago at several climate blogs, including Lucia’s Blackboard, Jeff’s Air Vent, and Climate Audit. But some of you may have forgotten, and some of you may be too new to the game, so here’s a refresher. I’ll try to explain it as simply as possible, so that even a BBC environment correspondent could understand it.

Suppose that you have a number of time series, covering the last 2000 years, coming from annual tree ring measurements or anything else. You think that they might be related to temperature, that is, be a ‘proxy’ for temperature. Or at least some of them might be. How can you check? Well, you have a reasonable idea of the temperature rise over the last 100 years from thermometer measurements, so you can compare each series against that, to check if it matches. Then you might discard those that don’t fit well (screening) or assign a weighting to each one according to how good the match is.

This sounds at first glance like a good idea. But there’s a problem. It’s actually a lousy idea. With this method, your data sets could be just random noise, and you’d still get a hockey stick result! A picture is worth a thousand words here. This one was posted by commenter “Jeez” at Climate Audit in 2012 (in the case discussed there, the paper in question, Gergis et al., was withdrawn after its authors’ claim to have avoided the screening fallacy was shown to be false).

Suppose that you have six time series, as shown in the first six diagrams. The first four go down-up, up-down, down-down, up-up, and the next two are flat. You carry out your screening test, and you find two of them that match fairly well with the 20th century temperature record – that’s the two with the red circles. Then you average those two, and you get the bottom picture – the hockey stick that you wanted! The two oppositely directed parts at the left-hand end (grey lines) average out to the flat (dotted) line. You can add in the two flat ones as well if you like, they won’t make any difference to the picture. In climate-science-speak, your reconstruction is “robust”!
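The same experiment can be run with actual random numbers rather than a cartoon. The sketch below (my own illustration, with made-up parameters) generates nothing but random walks, screens them by correlation against a rising ‘instrumental’ target over the last century, and averages the survivors; the shaft comes out flat and the blade bends upward, just as in the picture:

```python
import numpy as np

rng = np.random.default_rng(42)
n_series, years, cal = 1000, 1000, 100
target = np.linspace(0.0, 1.0, cal)  # rising 20th-century "instrumental" record

# Pure random walks: no temperature information whatsoever
walks = np.cumsum(rng.normal(size=(n_series, years)), axis=1)

# Screening: keep only the series that 'match' the instrumental record
corrs = np.array([np.corrcoef(w[-cal:], target)[0, 1] for w in walks])
survivors = walks[corrs > 0.5]

# Centre each survivor on its pre-calibration mean, then average
survivors = survivors - survivors[:, :-cal].mean(axis=1, keepdims=True)
recon = survivors.mean(axis=0)

shaft, blade = recon[:-cal], recon[-cal:]
print(len(survivors), blade.mean() - shaft.mean())
```

The pre-calibration wiggles of the survivors are uncorrelated with one another, so they average out to a flat shaft, while the calibration-period segments were selected precisely for rising together. Out of noise, a “robust” hockey stick.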