

Euler appeared, advanced toward Diderot, and in a tone of perfect conviction announced, “Sir, $(a+b^n)/n = x$, hence God exists—reply!”



One is tempted to be amazed how often such arguments are made about serious issues. But it shouldn’t be such a surprise when one considers how often people are taken in by such sophistry. If you can’t persuade them with logic, dazzle them with bullshit.



A recent comment by “VS” at Bart Verheggen’s blog suggests:



In other words, global temperature contains a stochastic rather than deterministic trend, and is statistically speaking, a random walk. Simply calculating OLS trends and claiming that there is a ‘clear increase’ is non-sense (non-science). According to what we observe therefore, temperatures might either increase or decrease in the following year (so no ‘trend’).



The original “random walk” was posited by Karl Pearson in a letter to Nature in 1905. In Pearson’s version, a man starts at the origin and walks a fixed distance in any direction. He then walks the same distance again, in some randomly chosen direction. This process repeats. Pearson was interested in the probability that after n such steps, the man had travelled distance r from his starting point. Pearson’s question was answered a week later by Lord Rayleigh, who had already worked out the solution as it was related to some problems in physics (relevant to diffusion). Rayleigh also worked out the random walk in which the length of each step was not constant.

Ironically, it was also in 1905 that Einstein published his work on the Brownian motion — the small jittery motion of microscopic particles (like dust or pollen grains) suspended in a fluid. Einstein posited that the jitters were caused by the collision of individual molecules of the fluid with the microscopic particles, and used observations of the Brownian motion to deduce the probable size of the molecules in the fluid, one of the first realistic estimates of the size of individual molecules.

The essence of a random walk is that it is the cumulative sum of random terms. We can generate a 1-dimensional random walk by generating random numbers (we could use, for example, Gaussian white noise) and summing them up. This gives a series which we can posit as a time series

$x_t = \sum_{j=1}^{t} \varepsilon_j$,

where the $\varepsilon_j$ are Gaussian white noise, so $x_t$ is a random walk. I’ve generated 10 random walks of 100 steps using this procedure and plotted the 10 resultant time series here:
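That procedure is easy to sketch in Python (the analysis later in this post uses R; this is just an illustrative alternative, and the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

# Each walk is the cumulative sum of Gaussian white noise:
# x_t = sum_{j=1}^{t} eps_j
walks = np.cumsum(rng.standard_normal((10, 100)), axis=1)

print(walks.shape)  # (10, 100): 10 walks, 100 steps each
```

Plotting each row of `walks` against step number reproduces the kind of figure shown above.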

We can write our random walk as

$x_t = x_{t-1} + \varepsilon_t$.

We can also write this as

$x_t - x_{t-1} = \varepsilon_t$.

We can even define the lag operator $L$ as the operator which transforms a time series to its 1-time-previous values,

$L x_t = x_{t-1}$,

to write the time series as

$(1 - L)\, x_t = \varepsilon_t$.

It’s also common to define the difference operator as

$\Delta = 1 - L$,

so that

$\Delta x_t = \varepsilon_t$.
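In code, the difference operator simply undoes the cumulative sum: differencing a random walk recovers the white noise that generated it. A minimal check (using `numpy`; the first term of the walk equals the first noise value, so we prepend a zero before differencing):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.standard_normal(100)   # Gaussian white noise
x = np.cumsum(eps)               # random walk: x_t = x_{t-1} + eps_t

# Applying the difference operator (1 - L) recovers the original noise.
dx = np.diff(x, prepend=0.0)
print(np.allclose(dx, eps))  # True
```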

Random walks are not just simple random series, and they’re not “stationary” time series. Because of this, they can give a false impression of the presence of a time trend. If, for instance, I fit a trend line to the first random walk I generated, I get this:

and the test statistic (t=13.2) indicates that it is definitely statistically significant. But this is a case where we need to be aware of what the test statistic means. The null hypothesis is that the data are white noise. The significant test statistic means we can reject that null hypothesis. That only means that the data are not white noise — which is correct. It does not mean that the data exhibit a linear trend over time.
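This isn’t a fluke of one particular random walk. A quick Monte Carlo sketch (using `scipy.stats.linregress`; the simulation size and seed are arbitrary) shows how often a trendless random walk passes the naive significance test:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(1)
t = np.arange(100)

# Fraction of trendless random walks whose OLS "trend" tests significant
# at the 5% level under the (inappropriate) white-noise null:
rejections = 0
n_sims = 200
for _ in range(n_sims):
    walk = np.cumsum(rng.standard_normal(100))
    if linregress(t, walk).pvalue < 0.05:
        rejections += 1

print(rejections / n_sims)
```

The rejection rate comes out far above the nominal 5%, which is exactly the false impression of a trend described above.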

Our form for defining the random walk is similar to the form of an ARMA (autoregressive moving-average) process. An ARMA(p,q) process (autoregressive moving-average of order p,q) is

$x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$,

or, using the lag operator,

$\phi(L)\, x_t = \theta(L)\, \varepsilon_t$,

where $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ is the operator which defines our AR(p) process and $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$ defines the MA process. The random walk we’ve defined so far is such a process, with order p=1 and q=0, and AR coefficient $\phi_1 = 1$. For an AR process we can define the characteristic polynomial as

$\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$.

We can then study the roots of the characteristic polynomial. If one of the roots is equal to 1 (a “unit root”), then we can factor the AR operator into the form

$\phi(L) = (1 - L)\, \phi^*(L)$.

Then our process is

$\phi^*(L)\,(1 - L)\, x_t = \theta(L)\, \varepsilon_t$.

The “1-L” operator (the difference operator) can be thought of as an “integration” operator, so such a process is called an integrated autoregressive moving-average (or ARIMA) process. We can sometimes factor multiple instances of the difference operator out of the AR process, giving an ARIMA(p,d,q) process: AR of order p, integrated d times, MA of order q.

If the AR operator has a unit root (so that we can factor out a difference operator), we tend to classify it as ARIMA rather than ARMA. We also know that the time series is not stationary — it doesn’t show behavior which is essentially the same at different times. A random walk, for instance, shows ever-growing variance, so that as time continues indefinitely into the future it can wander off without bound. A random walk is unbounded.
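The ever-growing variance is easy to verify by simulation: for a random walk built from unit-variance white noise, $\mathrm{Var}(x_t) = t\,\sigma^2$, so the variance grows linearly with time. A quick check (simulation size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

# 5000 independent random walks of 400 steps each
walks = np.cumsum(rng.standard_normal((5000, 400)), axis=1)
var = walks.var(axis=0)   # sample variance across walks at each time step

# Variance of a random walk grows linearly: Var(x_t) = t * sigma^2,
# so var[t-1] / t should be near 1 for unit-variance noise.
print(var[99] / 100, var[399] / 400)   # both near 1
```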

Yet we know that global temperature is bounded. Therefore it’s not a random walk. “VS” replied to this saying



Temperature may be ‘bounded’ over it’s long 100,000 year cycle (as observed over the past 500,000 or so years), however, on the subset of a 150 years or so, on which we are formally studying it, it can be easily classified as a random walk.



It’s interesting that a simple, one-sentence fact (temperature is bounded) already forces “VS” to move the goalposts.

How can we tell the difference between a genuine trend and an apparent one due to some process that has a unit root? One approach is to apply a unit root test. The most straightforward is the Dickey-Fuller, or DF test. It’s based on the idea that a stationary process tends to return to its mean value, while a unit-root process has no such tendency; it wanders in a random way regardless of its present value. This means that the value of a given increment $\Delta x_t$ (change from one time series value to the next) doesn’t really depend on the preceding time series value $x_{t-1}$. It might depend on preceding increments, but not on actual values.

Therefore the Dickey-Fuller test performs a regression of the increments $\Delta x_t$ on the preceding time series value $x_{t-1}$,

$\Delta x_t = \delta\, x_{t-1} + \varepsilon_t$,

and tests whether the regression is significant. Note that although the symbol $\Delta$ indicates the difference operator, in this context $\delta$ is just a number. If the process lacks a unit root, then when $x_{t-1}$ is high, the process will tend to return to its mean value, so $\delta$ will be negative. So we test whether or not the coefficient $\delta$ is significantly negative (the Dickey-Fuller test is a one-tailed test). The null hypothesis is that the process is a unit-root process, so that $\delta = 0$. If we reject the null hypothesis, we reject the presence of a unit root.

We must also be careful because the null hypothesis is not that the time series is white noise, so we can’t apply the usual t-test to the regression. Instead we compute a t value, but compare it to the Dickey-Fuller t distribution.
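The regression itself is ordinary least squares, so the raw statistic is simple to compute by hand. A minimal sketch in Python (illustrative only; the function name is my own, and as noted above the resulting $t$ value must be compared to the Dickey-Fuller distribution, not the usual $t$ distribution):

```python
import numpy as np

def dickey_fuller_t(x):
    """t statistic for delta in the regression  dx_t = delta * x_{t-1} + e_t.

    This is only the raw statistic; significance must be judged against
    the Dickey-Fuller distribution, not the ordinary t distribution.
    """
    dx = np.diff(x)                             # increments
    lag = x[:-1]                                # preceding values
    delta = np.sum(lag * dx) / np.sum(lag**2)   # OLS slope, no intercept
    resid = dx - delta * lag
    s2 = np.sum(resid**2) / (len(dx) - 1)       # residual variance
    se = np.sqrt(s2 / np.sum(lag**2))           # standard error of delta
    return delta / se

rng = np.random.default_rng(2)
noise = rng.standard_normal(200)
print(dickey_fuller_t(noise))            # strongly negative: no unit root
print(dickey_fuller_t(np.cumsum(noise))) # modest: consistent with a unit root
```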

That’s fine as far as it goes, but if the process has a genuine trend but no unit root, then it won’t tend to return to a “mean value”; it will tend to return to the value of the trend line. So there are two further versions of the Dickey-Fuller test. One tests for the presence of a unit root in the presence of drift, using the regression

$\Delta x_t = a_0 + \delta\, x_{t-1} + \varepsilon_t$,

and the other tests for the presence of a unit root when there’s a deterministic trend,

$\Delta x_t = a_0 + a_1 t + \delta\, x_{t-1} + \varepsilon_t$.

In all cases the null hypothesis is that a unit root is present. The Dickey-Fuller test is known to have low statistical power when the time series lacks a unit root but does show strong autocorrelation, so it may well fail to reject the null hypothesis even when it’s false.

Of course, the series of increments may not be white noise either. It may be autocorrelated noise. This leads to the augmented Dickey-Fuller, or ADF test. We test the regression of the increments $\Delta x_t$ on the preceding value of the time series, $x_{t-1}$, and on preceding values of the increments, $\Delta x_{t-1}, \ldots, \Delta x_{t-p}$, up to some order p:

$\Delta x_t = \delta\, x_{t-1} + \beta_1 \Delta x_{t-1} + \cdots + \beta_p \Delta x_{t-p} + \varepsilon_t$.

There are also, just as with the DF test, versions to allow for drift and trend:

$\Delta x_t = a_0 + \delta\, x_{t-1} + \beta_1 \Delta x_{t-1} + \cdots + \beta_p \Delta x_{t-p} + \varepsilon_t$,

$\Delta x_t = a_0 + a_1 t + \delta\, x_{t-1} + \beta_1 \Delta x_{t-1} + \cdots + \beta_p \Delta x_{t-p} + \varepsilon_t$.

Another choice is the Phillips-Perron, or PP test. This uses a nonparametric alternative to regressing on lagged increments, and allows for different behaviors of the underlying random process (like changes in its variance).

It’s crucial to note that in the presence of a real trend, we have to use the versions of the ADF test which allow for it. Suppose, for instance, we generate some artificial data which are a linear trend plus white noise. I generated 130 such values here:

There’s nothing complicated about this time series: it’s a linear trend plus white noise. Let’s apply the ADF test using the R package “CADFtest.” It implements the covariate-augmented ADF test, which allows us to test a time series along with some covariates, but if we don’t supply covariates it just computes the straight ADF test. It defaults to allowing for a trend, but if we disallow that using the command

CADFtest(y, type="none")

we get

ADF test data: x

ADF(1) = 1.4382, p-value = 0.9623

alternative hypothesis: true delta is less than 0

sample estimates:

delta

0.01416970

Note that it has failed to reject the null hypothesis (p-value = 0.9623, nowhere near any conventional significance level), therefore it indicates the possibility of a unit root. But we know that’s not the case! We can also apply the ADF test allowing for drift but not trend

CADFtest(y, type="drift")

which gives

ADF test data: x

ADF(1) = -0.7466, p-value = 0.83

alternative hypothesis: true delta is less than 0

sample estimates:

delta

-0.01466259

Again we have failed to reject the null hypothesis, indicating the possibility of a unit root. But we still know that’s not so! The failure is because we have specifically excluded the possibility of an actual trend. When we allow for that

CADFtest(y) or, equivalently, CADFtest(y, type="trend")

we get

ADF test data: x

ADF(1) = -8.8218, p-value = 1.968e-11

alternative hypothesis: true delta is less than 0

sample estimates:

delta

-1.184363

Now the null hypothesis is rejected at a significance level of 2 x 10^-11 (99.999999998% confidence). There’s no unit root (which we already knew).

If we apply the ADF test to the random walk to which we fit a (falsely significant) trend, we get

ADF test data: y

ADF(1) = -2.6338, p-value = 0.2666

alternative hypothesis: true delta is less than 0

sample estimates:

delta

-0.151474

We fail to reject the unit-root hypothesis, as we should, since this is a random walk. If we restrict the ADF test to exclude a trend, or to exclude drift and trend, we get the same result. We can further test for a unit root with the PP test, which gives

Phillips-Perron Unit Root Test data: y

Dickey-Fuller = -2.8724, Truncation lag parameter = 3, p-value = 0.2152

Again we can’t reject the unit root, and again that’s because there is one.

If you’ve read this far, you must be wondering what we get when we apply the ADF or PP unit-root tests to actual temperature data. Let’s take GISS data, annual averages from 1880 to 2009. First let’s look at the results supplied by “VS”:



** GISSTEMP, global mean, 1881-2008:

Level series, ADF test statistic (p-value<):

-0.168613 (0.6234)

First difference series, ADF test statistic (p-value<):

-11.53925 (0.0000)

Conclusion: I(1)

** GISSTEMP, global mean, combined, 1881-2008:

Level series, ADF test statistic (p-value<):

-0.301710 (0.5752)

First difference series, ADF test statistic (p-value):

-10.84587 (0.0000)

Conclusion: I(1)



I’m not sure why he’s using two GISSTEMP series, or what he means by “combined,” or why he uses 1881-2008 (since GISS extends from 1880 to 2009). But I ran the ADF test on GISS data (1880-2009) and got this:

ADF test data: x1

ADF(1) = -4.2506, p-value = 0.005066

alternative hypothesis: true delta is less than 0

sample estimates:

delta

-0.3308484

The null hypothesis, of a unit root, is resoundingly rejected. We can also do the PP test, giving

Phillips-Perron Unit Root Test data: x1

Dickey-Fuller = -5.1747, Truncation lag parameter = 4, p-value = 0.01

Again, the unit root is rejected.

The ADF test as implemented in the R package CADFtest also enables the user to allow for an excessively large number of lagged increment values. This is not recommended, but I did so anyway, with model selection by BIC (Bayesian Information Criterion). It doesn’t change the result. The unit root is rejected.

How did VS fail to reject it? I suspect he excluded a trend from his ADF test. He may also have played around with the number of lags allowed, until he got a result he liked. He excluded reality, and if you do that, you can “prove” whatever you want.

One final note: there’s an ever-growing number of “throw some complicated-looking math at the wall and see what sticks” attempts to refute global warming. It seems to me that a disproportionate fraction of them come from economists. Perhaps that’s because they fear the loss of corporate profit more than they fear danger to the health and welfare of humanity. Or perhaps it’s just a reflection of the rather poor track record of economists in general. When it comes to predicting the future, it’s well to compare the truly astounding successes of, say, physics, to, say, economics.