Pat Michaels claims (also here) that the journal Nature has lost its credibility. That’s an extraordinary claim, considering that Nature is one of the most prestigious peer-reviewed science journals in the world. There are those who believe Pat Michaels is the one lacking any credibility.

Michaels’ problem with Nature is that it publishes scientific research on the subject of global warming which he doesn’t like. His latest beef is with the publication of Booth et al., Aerosols implicated as a prime driver of twentieth-century North Atlantic climate variability. It casts (further) doubt on one of the favorite claims of fake skeptics, that entirely natural variations in the north Atlantic ocean, such as the AMO (Atlantic Multidecadal Oscillation), are responsible for some or perhaps even most of the global warming over the last century.

To counter Booth et al., Michaels touts Chylek et al., Greenland ice core evidence for spatial and temporal variability of the Atlantic Multidecadal Oscillation. Michaels compares them thus:



And Chylek and colleagues had this to say about the mechanisms involved:

The observed intermittency of these modes over the last 4000 years supports the view that these are internal ocean-atmosphere modes, with little or no external forcing.

Better read that again. “…with little or no external forcing.” Chylek’s conclusion is vastly different from the one reached by Booth et al., which in an Editorial, Nature touted as [emphasis added]:

[B]ecause the AMO has been implicated in global processes, such as the frequency of Atlantic hurricanes and drought in the Sahel region of Africa in the 1980s, the findings greatly extend the possible reach of human activity on global climate. Moreover, if correct, the study effectively does away with the AMO as it is currently posited, in that the multidecadal oscillation is neither truly oscillatory nor multidecadal.

Funny how the ice core records analyzed by Chylek (as opposed to the largely climate-model exercise of Booth et al.) show the AMO to be both oscillatory and multidecadal, and to exhibit such characteristics long before any possible human influence.



Clearly Michaels is convinced that Chylek et al. is right and Booth et al. is wrong about north Atlantic climate variability, and that the AMO is a real phenomenon.

Unfortunately for Chylek et al., their claims don’t hold water. They have committed one of the most common mistakes in time series analysis, one which convinces them of the existence of oscillatory behavior when no such claim is justified by the data.

They studied data for d18O from ice cores in Greenland and northernmost Canada, looked for periodic behavior, and believed they had found it. One of the prime (and most relevant) examples is data from the Dye3 ice core:

They split the data into four segments and Fourier-analyzed each to look for oscillatory behavior. The most recent is from 930 AD to 1872 AD, which gives them this spectrum:

It’s the peak labelled “1” on which they base their claim that “In the southern region (Dye 3 site) the dominant multidecadal periodicity is again ~20 years.” That peak certainly does rise above the line they’ve labelled “95%,” which suggests statistical significance at the 95% confidence level. But it’s not significant.

What they’ve plotted is actually an averaged spectrum. They first detrended the Dye3 data, then computed the FFT (fast Fourier transform), then averaged the results over 9 consecutive frequencies. When I do the same, I get a very similar result:

The results are extremely similar but not identical. It’s hard to be sure why, since there’s little discussion of the exact details of their analysis. For instance, many programs (R, for instance), when asked to compute a spectrum using the FFT, will automatically apply a taper at the edges of the data in order to reduce spectral leakage, and some will pad the series with zeros so that the number of data points makes the FFT especially fast. I didn’t do any of those things. It also looks like they may have simply smoothed the plot of the spectrum a little (I hope they didn’t oversample!). But the differences in our results are minor; we’re certainly in the same ballpark.
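For concreteness, the basic procedure can be sketched in a few lines of Python with numpy. This is a sketch, not their code: the noise series below merely stands in for the actual Dye3 d18O record, and the series length assumes annual values from 930 to 1872 AD.

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in for the Dye3 d18O series (assuming annual values, 930-1872 AD);
# the real data aren't reproduced here, so random noise is used instead.
n = 943
x = rng.normal(size=n)

# Step 1: detrend by removing a linear fit.
t = np.arange(n)
x = x - np.polyval(np.polyfit(t, x, 1), t)

# Step 2: raw periodogram from the FFT.
power = np.abs(np.fft.rfft(x))**2 / n
freqs = np.fft.rfftfreq(n, d=1.0)    # cycles per year

# Step 3: average the power over blocks of 9 consecutive frequencies
# (skipping the zero frequency).
m = 9
k = (len(power) - 1) // m
avg_power = power[1:1 + k*m].reshape(k, m).mean(axis=1)
avg_freqs = freqs[1:1 + k*m].reshape(k, m).mean(axis=1)
```

Note that no taper or zero-padding is applied here, matching the choices described above.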

For an ordinary Fourier spectrum, each power level is often treated as proportional to a chi-square statistic with two degrees of freedom. But when you average over 9 consecutive frequencies, it’s proportional to a chi-square statistic with 18 degrees of freedom. If the noise were white noise, then the “critical values” (the level of the lines labelled “95%” etc.) would be the same for all frequencies (all periods). But the levels of their lines labelled “95%” etc. depend on period. That’s because they’ve corrected for the fact that the noise isn’t white; the data show autocorrelation:
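Those white-noise factors are easy to check (a quick scipy sketch; the levels are expressed as multiples of the mean noise power at each frequency):

```python
from scipy.stats import chi2

# 95% critical level for spectral power, as a multiple of the mean
# noise power at that frequency.
# Raw periodogram: power is proportional to chi-square with 2 dof.
crit_raw = chi2.ppf(0.95, df=2) / 2
# Averaged over 9 frequencies: chi-square with 18 dof.
crit_avg = chi2.ppf(0.95, df=18) / 18

print(round(crit_raw, 2))   # roughly 3 times the mean noise power
print(round(crit_avg, 2))   # roughly 1.6 times the mean noise power
```

Averaging sharply lowers the critical level precisely because the averaged statistic is much less volatile than a single raw periodogram value.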

Again it’s unclear exactly how they’ve done this, and what autocorrelation parameters they chose. I applied an AR(1) model estimating the autocorrelation from the sample ACF, which gives me this (dashed red line) for the 95% confidence limit:
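For what it’s worth, here’s one way to construct such a frequency-dependent critical level under AR(1) (“red”) noise, combining the theoretical AR(1) spectral density with the chi-square factor above. The lag-1 autocorrelation used below is a placeholder, not the value I estimated from the Dye3 data:

```python
import numpy as np
from scipy.stats import chi2

def ar1_crit(freqs, phi, sigma2=1.0, m=9, conf=0.95):
    """Frequency-dependent critical level for power averaged over m
    consecutive frequencies, under AR(1) noise with lag-1
    autocorrelation phi and process variance sigma2 (a sketch of the
    standard red-noise significance test)."""
    # Theoretical AR(1) spectral density.
    spec = sigma2 * (1 - phi**2) / (1 - 2*phi*np.cos(2*np.pi*freqs) + phi**2)
    # Scale by the chi-square critical factor with 2m degrees of freedom.
    return spec * chi2.ppf(conf, df=2*m) / (2*m)

freqs = np.linspace(0.01, 0.5, 50)   # cycles per year
crit = ar1_crit(freqs, phi=0.3)      # phi=0.3 is a placeholder value
```

For positive phi the critical level is elevated at low frequencies, which is why the lines in their figure slope with period.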

Once again our results aren’t exactly the same, but we’re still quite close and quite clearly in the same ballpark. In fact I assign slightly higher significance to the main peak than Chylek et al. do.

So why do I say that there’s no evidence of oscillatory behavior, and that Chylek et al. are wrong? Because this significance level is for testing a single period, one and one only. When you test more periods (and that’s what Fourier analysis is all about), you have many more chances for your test statistic to cross that critical value. After all, even if the data are nothing but noise we still expect 5% of all tested periods to yield a test statistic which exceeds the 95% confidence limit.

When I adjust the critical value to account for this, the given peak is no longer significant. Not even close. It just ain’t so.
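One simple way to make that adjustment, assuming the tested frequency bins are roughly independent, is a Sidak-style correction: demand a per-frequency confidence level high enough that the overall test still operates at 95%. The bin count below is illustrative, not taken from the paper:

```python
from scipy.stats import chi2

# With N roughly independent averaged frequency bins all being tested,
# each bin must clear a stricter per-frequency confidence level for the
# overall test to be at 95%. N = 52 is illustrative only.
N = 52
conf_per_freq = 0.95 ** (1 / N)

# Critical levels (multiples of the mean noise power), 18 dof for
# power averaged over 9 consecutive frequencies.
crit_single = chi2.ppf(0.95, df=18) / 18
crit_multi = chi2.ppf(conf_per_freq, df=18) / 18
```

The corrected level is substantially higher than the single-frequency level, which is exactly why the peak stops looking significant.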

If you’re reluctant to rely on all the complications of applying all this theory (such reluctance is well advised), you could just do some Monte Carlo simulations. I created 200 artificial AR(1) noise series with the same autocorrelation as the Dye3 data and subjected them to the same analysis, in order to estimate the probability density function of the maximum power level if the data are “nothin’ but noise.” The very first simulated series — right out of the box — gave a spectrum which looks eerily similar to that from the Dye3 data:
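A minimal version of that Monte Carlo test looks like this (a Python sketch: the series length assumes annual data from 930 to 1872 AD, and the lag-1 autocorrelation is a placeholder rather than the value I estimated from Dye3):

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1_series(n, phi, rng):
    """Generate an AR(1) noise series with lag-1 autocorrelation phi."""
    x = np.empty(n)
    x[0] = rng.normal()
    for i in range(1, n):
        x[i] = phi * x[i-1] + rng.normal()
    return x

def max_avg_power(x, m=9):
    """Detrend, compute the periodogram, average over m consecutive
    frequencies, and return the largest averaged power."""
    n = len(x)
    t = np.arange(n)
    x = x - np.polyval(np.polyfit(t, x, 1), t)
    p = np.abs(np.fft.rfft(x))**2 / n
    k = (len(p) - 1) // m
    return p[1:1 + k*m].reshape(k, m).mean(axis=1).max()

# 200 simulated series of the same length; phi=0.3 is a placeholder.
peaks = [max_avg_power(ar1_series(943, 0.3, rng)) for _ in range(200)]
# The 95th percentile of the simulated peaks is the Monte Carlo
# critical level for the *maximum* of the spectrum.
crit_mc = np.quantile(peaks, 0.95)
```

The key point: the null distribution here is for the maximum over all tested frequencies, so the multiple-testing issue is handled automatically, with no chi-square theory required.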

Taking the 200 simulations as a whole, here’s a histogram of the observed peak values, with the peak value for the Dye3 data indicated by the dashed red line:

It’s abundantly clear, whether you compute the critical level theoretically (allowing for testing multiple frequencies) or estimate it from Monte Carlo simulations, that the observed peak value from the Dye3 data is not significant. Not even close. It just ain’t so.

We’ve actually already addressed exactly this statistical issue, in exactly this situation: analysis of ice core data used as a proxy for the AMO (by Knudsen et al., Tracking the Atlantic Multidecadal Oscillation through the last 8,000 years). But there are some notable differences between the two papers. Knudsen et al. were keenly aware of these issues, as was evident when I asked Dr. Knudsen about some of my concerns. That’s probably why they were cautious in drawing definitive conclusions, despite the fact that their evidence was, it seems to me, quite a bit stronger than that of Chylek et al. Reading the Chylek paper, on the other hand, gives the distinct impression that they have no doubt whatever about the validity of their conclusion, in spite of the fact that, as we have seen, the evidence just isn’t there.

Another difference is that Knudsen et al. provided far more detail about their analysis methods. In my opinion, one of the annoying things about Chylek et al. is how the analysis details are glossed over (I’m a bit surprised that this wasn’t a major issue during the peer-review process). They applied wavelet analysis, for instance, but there’s no clue what wavelet method or program they used. And they managed to overinterpret the wavelet analysis — in the extreme — just as they did with the Fourier analysis.

I guess I shouldn’t blame Chylek et al. too much, because as I said at the outset, overinterpretation of Fourier analysis — especially the identification of periods for which there is nowhere near sufficient evidence — is one of the most common problems in the peer-reviewed scientific literature. But I do take exception to the extreme confidence they attach to their conclusions.

And, I have some sage advice for anyone who is doing analysis this complex. We have these things called “computers,” so run some damn Monte Carlo simulations — the theory can get extremely complicated with lots of ways to go astray, and Monte Carlo is a great way to get a basic reality check on your results.

As for Pat Michaels, I definitely blame him for pontificating about papers which, in my opinion, he doesn’t have the “skillz” to evaluate.