As many readers are probably aware, there has been an important new posting at Climate Audit about the Yamal affair. This posting is an attempt to set out the whole story of Yamal. It reworks an article I wrote in 2009 and incorporates new developments since that time. I hope readers find it useful. I have also prepared a Kindle version of the post, for which there is a small charge.

The story of Michael Mann's Hockey Stick reconstruction, its statistical bias and the influence of the bristlecone pines is well known. Steve McIntyre's research into the other reconstructions of the temperatures of the last millennium has received less publicity, however. The story of the Yamal chronology may change that.

The bristlecone pines that created the shape of the Hockey Stick graph are used in nearly every millennial temperature reconstruction around today, but there are also a handful of other tree ring series that are nearly as common and just as influential on the results. Back at the start of McIntyre's research into the area of paleoclimate, one of the most significant of these was called Polar Urals, a chronology first published by Keith Briffa of the Climatic Research Unit (CRU) at the University of East Anglia. At the time, it was used in pretty much every temperature reconstruction around. In his paper, Briffa made the startling claim that the coldest year of the millennium was AD 1032, a statement that, if true, would have completely overturned the idea of the Medieval Warm Period. It is not hard to see why paleoclimatologists found the series so alluring.

Yamal saves the day

Some of McIntyre's research into Polar Urals deserves a story in its own right, but it is one that will have to wait for another day. We can pick up the narrative again in 2005, when McIntyre discovered that an update to the Polar Urals series had been collected in 1999. Through a contact he was able to obtain a copy of the revised series. Remarkably, in the update the eleventh century appeared to be much warmer than in the original - in fact it was higher even than the twentieth century. This must have been a severe blow to paleoclimatologists, a supposition that is borne out by what happened next, or rather what didn't: the update to the Polar Urals was not published, it was not archived and it was almost never seen again.

With Polar Urals now unusable, paleoclimatologists had a pressing need for a hockey stick shaped replacement and a solution appeared in the nick of time in the shape of a series from the nearby location of Yamal.

The Yamal data had been collected by a pair of Russian scientists, Hantemirov and Shiyatov, and was published in 2002. In their version of the data, Yamal had little by way of a twentieth century trend. Strangely though, Briffa's version, which had made it into print before even the Russians', was somewhat different. While it was very similar to the Russians' version for most of the length of the record, Briffa's version had a sharp uptick at the end of the twentieth century -- another hockey stick, made almost to order to meet the requirements of the paleoclimate community. Certainly, after its first appearance in Briffa's 2000 paper in Quaternary Science Reviews, this version of Yamal was seized upon by climatologists, appearing again and again in temperature reconstructions; it became virtually ubiquitous in the field: apart from Briffa 2000, it also contributed to the reconstructions in Mann and Jones 2003, Jones and Mann 2004, Moberg et al 2005, D'Arrigo et al 2006, Osborn and Briffa 2006 and Hegerl et al 2007, among others.

The data is not free

When McIntyre started to look at the Osborn and Briffa paper in 2006, he quickly ran into the problem of the Yamal chronology: he needed to understand exactly how the difference between the Briffa and Hantemirov versions of Yamal had arisen. McIntyre therefore wrote to the Englishman asking for the original tree ring measurements involved. When Briffa refused, McIntyre wrote to Science, who had published the new paper, pointing out that, since it was now six years since Briffa had originally published his version of the chronology, there could be no reason for withholding the underlying data. After some deliberation, the editors at Science declined the request, deciding that Briffa did not have to publish anything more as he had merely re-used data from an earlier study. McIntyre should, they advised, approach the author of the earlier study, that author being, of course, Briffa himself. Wearily, McIntyre wrote to Briffa again, this time in his capacity as author of the original study in Quaternary Science Reviews and he was, as expected, turned down flat.

That was how the investigation of the Yamal series stood for the next two years until, in July 2008, a new Briffa paper appeared in the pages of the Philosophical Transactions of the Royal Society B, the Royal Society's journal for the biological sciences. The new paper discussed five Eurasian tree ring datasets, which, in fairly standard Hockey Team fashion, were unarchived and therefore not susceptible to detailed analysis. Among these five were Yamal and the equally notorious Tornetrask chronology. McIntyre observed that the only series with a strikingly anomalous twentieth century was Yamal. It was therefore frustrating that he had still not managed to obtain Briffa's measurement data. It appeared that he was going to hit another dead end. However, in the comments to his article on the new paper, a possible way forward presented itself. A reader pointed out that the Royal Society had what appeared to be a fairly clear and robust policy on data availability:

As a condition of acceptance authors agree to honour any reasonable request by other researchers for materials, methods, or data necessary to verify the conclusion of the article...Supplementary data up to 10 Mb is placed on the Society's website free of charge and is publicly accessible. Large datasets must be deposited in a recognised public domain database by the author prior to submission. The accession number should be provided for inclusion in the published article.

Having had his requests rejected by every other journal he had approached, McIntyre had no great expectations that the Royal Society would be any different, but there was no harm in trying and he duly sent off an email pointing out that Briffa had failed to meet the Society's requirement of archiving his data prior to submission and that the editors had failed to check that Briffa had done so. The reply, to McIntyre's surprise, was very encouraging:

We take matters like this very seriously and I am sorry that this was not picked up in the publishing process.

Was the Royal Society, in a striking contrast to every other journal in the field, about to enforce its own data availability policy? Had Briffa made a fatal mistake?



Summer gave way to autumn and as October drew to a close, McIntyre had still heard nothing from the Royal Society. However, in response to some further enquiries, the journal sent McIntyre some more encouraging news -- Briffa would be producing most of his data, although not immediately. Most of it would be available by the end of the year, with the remainder to follow in early 2009.

Some Briffa data

The first batch of data appeared on schedule in the dying days of 2008 and it was something of a disappointment. The Yamal data, as might have been expected, was to be archived with the second batch, so there would be a further delay before the real action could start. Meanwhile, however, McIntyre could begin to look at what Briffa had done elsewhere. It was not to be plain sailing. For a start, Briffa had archived data in an obsolete data format, last used in the era of punch-cards. This was inconvenient, and apparently deliberately so, but it was not an insurmountable problem -- with a little work, McIntyre was able to move ahead with his analysis. Briffa had also thrown a rather larger spanner in the works though: while he had archived the tree ring measurements, he had not supplied any metadata to go with them -- in other words there was no information about where the measurements had come from. All that was provided was a tree number and the measurements that went with it. However, McIntyre was well used to this kind of behaviour from climatologists and he had some techniques at hand for filling in some of the gaps. Climate Audit postings on the findings followed in fairly short order, some of which were quite intriguing. There was, however, no smoking gun.



There followed a long hiatus, with no word from the Royal Society or from Briffa. McIntyre would occasionally visit Briffa's web page at the CRU website to see if anything new had appeared, but to no avail. Eventually, though, Briffa's hand was forced, and in late September 2009, a reader pointed out to McIntyre that the remaining data was now available. It had been quietly posted to Briffa's webpage, without announcement or the courtesy of an email to McIntyre. It was nearly ten years since the initial publication of Yamal and three years since McIntyre had requested the measurement data from Briffa. Now at last some of the questions could be answered.

A strange lack of twentieth century data

When McIntyre started to look at the numbers it was clear that there were going to be the usual problems with a lack of metadata, but there was more than just this. In typical climate science fashion, just scratching at the surface of the Briffa archive raised as many questions as it answered. Why did Briffa only have half the number of cores covering the Medieval Warm Period that the Russians had reported? And why were there so few cores in Briffa's twentieth century? By 1988 there were only 12 cores used, an amazingly small number in what should have been the part of the record when it was easiest to obtain data. By 1990 the count was only ten, dropping still further to just five in 1995. Without an explanation of how the selection of this sample of the available data had been performed, the suspicion of 'cherrypicking' would linger over the study, particularly since the sharp twentieth century uptick in the series was almost entirely due to a single tree. (It is true to say, however, that Hantemirov also had very few cores in the equivalent period, so it is possible that this selection had been due to the Russian and not Briffa.)
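For readers who want to see what a core count of this kind involves, here is a minimal sketch in Python. It assumes each core is described only by its first and last measured year; the function name `sample_depth` and the spans below are hypothetical illustrations and are not taken from Briffa's archive or from McIntyre's own scripts.

```python
from collections import Counter

def sample_depth(core_spans):
    """Count how many cores cover each year.

    core_spans: iterable of (first_year, last_year) pairs, both inclusive.
    Returns a Counter mapping year -> number of cores covering that year.
    """
    depth = Counter()
    for first_year, last_year in core_spans:
        for year in range(first_year, last_year + 1):
            depth[year] += 1
    return depth

# Hypothetical core spans for illustration only (not real Yamal data).
spans = [(1800, 1996), (1850, 1990), (1900, 1988), (1920, 1994)]
depth = sample_depth(spans)

# Core counts in the years discussed in the text.
for year in (1988, 1990, 1995):
    print(year, depth[year])
```

A falling count towards the end of the record, as in this toy example, means the final years of any averaged chronology rest on fewer and fewer trees.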



The lack of twentieth century data was still more remarkable when the Yamal chronology was compared to the Polar Urals series, to which it was now apparently preferred. The ten or twelve cores used in Yamal were around half the number available at Polar Urals, which should presumably therefore have been considered the more reliable. Why then had climatologists almost all preferred to use Yamal? Could it be because it had a hockey stick shape?

Briffa's regional chronology

The low core counts in the Yamal series certainly looked odd, but when they were seen in the context of Briffa's 2008 Royal Society paper they looked positively suspicious. In the paper, Briffa had explained that he and his co-authors had combined series so as to create regional chronologies covering much wider areas. These regional chronologies, he suggested, provided "strong evidence that the extent of recent widespread warming across northwest Eurasia, with respect to 100- to 200-year trends, is unprecedented in the last 2000 years".

One of Briffa's regional chronologies was AVAM-TAIMYR, which was produced by merging the Taimyr chronology with another site, Bol'shoi Avam, located no less than 400 kilometres away. While the original Taimyr site had something of a divergence problem, with narrowing ring widths implying cooler temperatures, the new composite site of Avam--Taimyr had a rather warmer twentieth century and a cooler Medieval Warm Period. The effect of this blending of datasets was therefore, as so often with paleoclimate adjustments, to produce a warming trend.

This, however, was not what was interesting McIntyre. What was odd about AVAM-TAIMYR was that the series seemed to have more tree cores recorded than had been reported in the two papers on which it was based. So it looked as if something else had been merged in as well. But what?



With no metadata archived for AVAM-TAIMYR, McIntyre had another puzzle to occupy him, but with some effort he was able to unravel the mystery. Forty-two of the cores turned out to be from another location called Balschaya Kamenka, some 400 km from Taimyr. The data had been collected by the Swiss researcher, Fritz Schweingruber. The fact that the use of Schweingruber's data had not been reported by Briffa was odd in itself, but what intrigued McIntyre was why Briffa had used Balschaya Kamenka and not any of the other Schweingruber sites in the area. Several of these were much closer to Taimyr -- Aykali River was one example, and another, Novaja Rieja, was almost next door. The suspicion of cherrypicking was hard to avoid.

The Khadyta River experiment

But there was another mystery in Briffa's paper too. As we have seen, Briffa had been in the business of creating regional chronologies, for example supplementing Taimyr with data from other locations such as Avam and Balschaya Kamenka. Similar regional chronologies had been created for Fennoscandia and allegedly for Yamal. But the Yamal data appeared to represent only the original Hantemirov and Shiyatov data with no supplementation with other sites in the area at all. Why had Briffa left Yamal on its own, when the core count was so low? Suitable data was certainly available - Schweingruber had collected samples at a site called Khadyta River, close to Yamal, and with 34 cores recorded it represented a much more reliable basis for reconstructing temperatures.

McIntyre decided to perform a sensitivity test on Briffa's database, replacing the 12 cores that were behind the twentieth century uptick in the Yamal series with the 34 from Khadyta River. The revised chronology was simply staggering. The sharp uptick in the series at the end of the twentieth century had vanished, leaving a twentieth century apparently without a significant trend. The blade of the Yamal hockey stick, used in so many of those temperature reconstructions that the IPCC said validated Michael Mann's work, was gone.





Sound and fury

The reaction to McIntyre's blog posts on Yamal was almost instantaneous. The RealClimate blog, run by prominent climate scientists in an effort to protect the IPCC orthodoxy, ridiculed McIntyre's work:

McIntyre has based his ‘critique’ on a test conducted by randomly adding in one set of data from another location in Yamal that he found on the internet. People have written theses about how to construct tree ring chronologies in order to avoid end-member effects and preserve as much of the climate signal as possible. Curiously no-one has ever suggested simply grabbing one set of data, deleting the trees you have a political objection to and replacing them with another set that you found lying around on the web.

A few weeks later, Briffa and some of his colleagues joined in, writing a long response to McIntyre. Interestingly, this took a slightly different line to their colleagues at RealClimate, acknowledging that Khadyta River met the criteria for inclusion in the Yamal chronology, but claiming somewhat implausibly that they had not considered it at the time.

Judged according to [our normal] criterion it is entirely appropriate to include the data from the [Khadyta River] site...when constructing a regional chronology for the area. However, we simply did not consider these data at the time, focussing only on the data used in the companion study by Hantemirov and Shiyatov and supplied to us by them.

However, they also presented what they said was a revised Yamal chronology, produced "by making use of all the data to hand", and giving broadly the same result as the figures they had published previously:

Original caption: Comparison of published and reworked Yamal chronologies. This Figure shows the two earlier versions of the Yamal RCS larch chronology in red (published in Briffa, 2000) and blue (Briffa et al., 2008) compared to the new version, based on all of the currently available data (Yamal_All) for the original (POR, YAD and JAH) sites and including the additional data from the KHAD site (in black). Tree sample counts for this 'new' chronology are shown by the grey shading. The upper panel shows the data smoothed with a 40-year low-pass cubic smoothing spline. The lower panel shows the yearly data from 1800 onwards. All series have been scaled so the yearly data have the same mean and standard deviation as the Yamal_All series over the period 1-1600.
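The scaling step described in the caption can be sketched as follows. This is only an illustration of the general idea of matching a series' mean and standard deviation to those of a reference series over a common window (here the caption's period 1-1600); it is not Briffa's code, the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics

def window_stats(series, start, end):
    """Mean and population standard deviation of a {year: value} series
    over the years start..end inclusive."""
    values = [v for year, v in series.items() if start <= year <= end]
    return statistics.mean(values), statistics.pstdev(values)

def rescale_to_reference(series, reference, start, end):
    """Linearly rescale `series` so that, over start..end, it has the same
    mean and standard deviation as `reference` over the same window.
    Assumes `series` is not constant over the window (non-zero deviation)."""
    mean_s, sd_s = window_stats(series, start, end)
    mean_r, sd_r = window_stats(reference, start, end)
    return {year: (v - mean_s) / sd_s * sd_r + mean_r
            for year, v in series.items()}

# Toy numbers for illustration only (not real chronology values).
reference = {1: 1.0, 2: 2.0, 3: 3.0, 4: 10.0}
series = {1: 10.0, 2: 20.0, 3: 30.0, 4: 100.0}
scaled = rescale_to_reference(series, reference, start=1, end=3)
print(scaled)
```

Note that a scaling of this kind changes only the offset and amplitude of each curve. It cannot create or remove an uptick outside the reference window, so differences between the plotted chronologies after 1600 come from the underlying data and processing.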

Climategate and the Yamal-Urals chronology

Just weeks later, the attention of the climate blogosphere was well and truly diverted by the Climategate disclosures, and the arguments over the Yamal core count were forgotten in the media storm that followed. However, there were many emails in the Climategate zip file that directly pertained to the Yamal story. The message that immediately attracted attention was one that demonstrated that CRU had funded Hantemirov and Shiyatov to collect the Yamal data in the first place, something that did raise questions over Briffa's claim that the data was not theirs to give to McIntyre. However, at the time most attention was focused on Shiyatov's request that the funds be sent to his private bank account so as to avoid problems with the Russian tax authorities.

However, there was another email that was much more important, although it was barely noticed by anyone apart from McIntyre. The email in question, number 1146252894, was from Briffa to a scientist at the Met Office and dated back to 2006.

Hi Philip,

We have three "groups" of trees:

"SCAND" (which includes the Tornetrask and Finland multi-millennial chronologies, but also some shorter chronologies from the same region). These trees fall mainly within the 3 boxes centred at:

17.5E, 67.5N

22.5E, 67.5N

27.5E, 67.5N

"URALS" (which includes the Yamal and Polar Urals long chronologies, plus other shorter ones). These fall mainly within these 3 boxes:

52.5E, 67.5N

62.5E, 62.5N (note this is the only one not at 67.5N)

67.5E, 67.5N

"TAIMYR" (which includes the Taimyr long chronology, plus other shorter ones). These fall mainly within these 4 boxes:

87.5E, 67.5N

102.5E, 67.5N

112.5E, 67.5N

122.5E, 67.5N

There could be little doubt that these were the regional chronologies that had been prepared for the Royal Society paper. Crucially then, this email showed that Briffa and his colleagues had prepared a regional chronology that incorporated Yamal and the Polar Urals - a much wider area than the Yamal-only chronology that had appeared in the final paper.

Briffa's decision to drop the URALS regional chronology (incorporating Yamal and sites in the Polar Urals) in favour of a Yamal-only chronology with only a handful of trees in its modern section was starting to look indefensible. It was also hard to square the existence of the URALS chronology with Briffa's rebuttal of McIntyre's earlier blog post. When Briffa had said his revised Yamal chronology incorporated "all the data", he had actually only incorporated a handful of sites in the Yamal area, without mentioning that he had prepared the much broader-based URALS chronology. The deception in Briffa's response was now clear, at least to McIntyre.

The only way to prove that all this mattered, however, was to find out what the URALS chronology looked like. If it lacked the hockey stick shape, as McIntyre suspected, Briffa would be completely undone. McIntyre duly submitted a freedom of information request for the chronology itself and a list of the sites used. As he expected, this was refused, and before long the long and tedious appeals process was set in motion.

The Climategate inquiries

While the wheels of the FOI appeals process were grinding away, Briffa was having to fend off the Yamal allegations again. These had been put to him by members of the Russell panel, which was investigating Climategate on behalf of the University of East Anglia. However, a former colleague, Geoffrey Boulton, was dealing with the investigation of the Yamal allegations, so Briffa would presumably have had few concerns.

Since the Russell panel had only allowed a few days for submissions of evidence, the Yamal allegations had not been formally put to the inquiry, although McIntyre had covered them briefly, mentioning the email that revealed the existence of the URALS chronology. Boulton, however, had asked Briffa to respond to a distillation of the allegations that McIntyre's co-author Ross McKitrick had written as an op-ed in a Canadian newspaper.

Briffa's response repeated the claim he had made in his earlier rebuttal of McIntyre, namely that he and his colleagues had never considered the other data in the region:

The data that he is referring to were never considered at the time because the purpose of the work reported in Briffa (2000) and Briffa et al. (2008) was to reprocess the existing dataset of Hantemirov and Shiyatov (2002).

Readers may already have noticed that Briffa's explanation was directly contradicted by the Climategate email quoted above, which showed that the data referred to had actually been under consideration since at least 2006. Remarkably, despite having Briffa's email available, Boulton does not seem to have noticed this obvious deceit. Since the Russell panel decided not to allow CRU's critics to challenge the evidence of the CRU scientists, nobody else could point out the truth either. The Russell panel's decision therefore appears culpable.

The second part of Briffa's explanation is equally misleading. While the purpose of the original Briffa 2000 paper was indeed to reprocess the Hantemirov and Shiyatov data, this could not be said of the later, 2008 paper, the purpose of which was to prepare and examine regional chronologies. This second deceit, like the first, went unremarked by Boulton.

Later on in the same document, Briffa changed his story slightly. He said that the 2008 paper had been intended for a special issue of the journal concerned and he had therefore been working to a strict deadline. He said that he and his colleagues had simply run out of time to prepare the URALS regional chronology and had decided to reprocess Yamal on its own:

...we had intended to explore an integrated Polar Urals/Yamal larch series but it was felt that this work could not be completed in time and Briffa made the decision to reprocess the Yamal ring-width data to hand, using improved standardization techniques, and include this series in the submitted paper.

This story was again entirely implausible. As we have seen, the regional chronology had been around since at least 2006, but Briffa chose not to mention this in his evidence to Russell.

Briffa also made the extraordinary claim that he and his colleagues had not looked at Polar Urals for many years:

We had never undertaken any reanalysis of the Polar Urals temperature reconstruction subsequent to its publication in 1995.

Again, this was directly refuted by the Climategate emails, which showed that Briffa had incorporated Polar Urals into the URALS regional chronology in 2006. Once again, the deceit was missed by Boulton.

Boulton had missed three clear deceptions in as many paragraphs of Briffa's evidence. It is perhaps not surprising that with an inquiry of this nature, a "not guilty" verdict regarding the Yamal allegations was subsequently issued in the Russell Panel's final report.

The Commissioner calls



In April 2012, the tide began to shift against Briffa. The UK's Information Commissioner wrote to UEA with some bad news - although a decision on the release of the URALS chronology had not been reached, the commissioner advised the university that there could be no good reason not to disclose the list of sites used and accordingly he intended to rule against them on this issue. Briffa's hand was finally going to be forced.

The list of 17 sites that was finally sent to McIntyre represented complete vindication. The presence of Yamal and Polar Urals had already been obvious from the Climategate emails, but the list showed that Briffa had also incorporated the Polar Urals update (which, as we saw above, did not have a hockey stick shape, and which Briffa claimed he had not looked at since 1995) and the Khadyta River site, McIntyre's use of which the RealClimate authors had ridiculed.

Although the chronology itself was not yet available, the list of sites was sufficient for McIntyre to calculate the numbers himself, and the results were breathtaking. Firstly, the URALS regional chronology had vastly more data behind it than the Yamal-only figures presented in Briffa's paper.

But what was worse, the regional chronology did not have a hockey stick shape - the twentieth century uptick that Briffa had got from the handful of trees in the Yamal-only series had completely disappeared.

Direct comparison of the chronology that Briffa chose to publish against the full chronology that he withheld makes the point clear:

It seems clear then that the URALS chronology Briffa prepared to go alongside the others he put together for the 2008 paper told a story that did not fit the message he wanted to convey - one of unprecedented warmth at the end of the twentieth century. In essence the URALS regional chronology was suffering from the divergence problem - the widely noted failure of some tree ring series to pick up the recent warming seen in instrumental temperature records, which led to the infamous 'hide the decline' episode.

Remarkably, however, Briffa did allude to the divergence problem in his paper:

These [regional chronologies] show no evidence of a recent breakdown in [the association between tree growth and temperature] as has been found at other high-latitude Northern Hemisphere locations.

The reason for dropping the URALS chronology looks abundantly clear. It would not have supported this message.