Climate Central has an interesting post about the extreme heat wave in Moscow this last July. They point out that if we assume the data are normally distributed, then the July 2010 average temperature anomaly value was more than 4 standard deviations above the July mean (and they have a lovely graph to emphasize it):



What’s the chance of such a deviation from the norm? For a normally distributed variable, they say, “This probability turns out to be on the order of a one and a half chance in 100,000 for the July anomaly.”
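As a quick sanity check on that figure, here is a minimal sketch of the upper-tail probability under a normal distribution. It uses only the Python standard library; the function name `normal_upper_tail` is my own, and the inputs are the round numbers quoted above, not Climate Central's actual data.

```python
import math
from statistics import NormalDist

def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Tail probability at exactly 4 standard deviations above the mean.
p4 = normal_upper_tail(4.0)
print(f"P(Z > 4) = {p4:.3e}")

# How many standard deviations correspond to the quoted 1.5-in-100,000 chance?
z_quoted = NormalDist().inv_cdf(1.0 - 1.5e-5)
print(f"z for a 1.5e-5 upper-tail probability = {z_quoted:.2f}")
```

At exactly 4 standard deviations the tail probability is roughly 3 in 100,000, so the quoted one and a half in 100,000 corresponds to a deviation somewhat beyond 4 standard deviations, in line with "more than 4".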

They only used data since 1950, but data are available for more than a century. Here’s the monthly anomaly data for Moscow (anomaly relative to the entire time span), for all months (not just July), together with 5-year averages (in red):

The extremely hot July is the spike furthest to the right. But it doesn’t seem to be exceptional at all. It looks unexceptional only because it happened in July, when natural variation is smaller than the average throughout the year. This is clearly visible in a plot of temperature anomaly as a function of month:

It’s also visible in a boxplot by month, which shows that last July’s anomaly (the highest “dot” on the graph) was an outlier, but wouldn’t have been in a winter month:

The winter months show considerably more natural variation than the summer months, so an extreme value in July is more deviant from “normal” than the same extreme value in February. So, let’s look at last July’s Moscow temperature compared to previous Julys:

Clearly last July’s Moscow heat (on the far right, circled in red) was exceptional. But was it a one-and-a-half-in-100000-years event?

Do Moscow July temperatures follow the normal distribution? If so, then a “quantile-quantile plot” (QQ-plot) should roughly follow a straight line — but it doesn’t:

There’s an upward bend on the right, indicating that extreme high temperatures are more common than with a normal distribution. Also, the Shapiro-Wilk test for normality resoundingly rejects the normal distribution. Even if we omit this last July, the bend in the QQ-plot is still there and the Shapiro-Wilk test still rejects the normal distribution.
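For readers who want to try the QQ-plot idea themselves, here is a rough sketch of how the plotted points can be computed with the standard library alone. The sample is made up for illustration (it is NOT the Moscow record), `qq_points` is a hypothetical helper name, and the plotting position (i - 0.5)/n is just one common convention. A Shapiro-Wilk test is available as `scipy.stats.shapiro` if SciPy is installed; it is not used here.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Return (theoretical, observed) pairs of standardized quantiles for a normal QQ-plot."""
    xs = sorted(values)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    std_normal = NormalDist()
    pairs = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n                      # plotting position (one common convention)
        pairs.append((std_normal.inv_cdf(p), (x - m) / s))
    return pairs

# Made-up, right-skewed sample for illustration only.
sample = [-1.8, -1.2, -0.9, -0.5, -0.3, 0.0, 0.2, 0.4, 0.9, 1.5, 2.6, 5.1]

pairs = qq_points(sample)
for theoretical, observed in pairs:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the data were normal, the two columns would track each other closely. In this toy sample the largest observed value sits well above its theoretical quantile, which is the kind of upward bend on the right described above.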

Also, higher moments of the distribution indicate positive excess kurtosis, which can (but doesn’t necessarily) indicate a heavy-tailed distribution with more chance of extreme values, as well as positive skewness, which indicates that the right-hand tail (extreme highs rather than extreme lows) is more prominent than the left-hand tail. Both increase the chances of such an extremely hot July.
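To make the moment-based check concrete, here is a small sketch of sample skewness and excess kurtosis computed from central moments. It uses the simple (biased) moment estimators, which is an assumption on my part; statistical packages often apply small-sample corrections. The data are again invented for illustration.

```python
from statistics import fmean

def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (simple, uncorrected estimators)."""
    n = len(values)
    m = fmean(values)
    m2 = sum((v - m) ** 2 for v in values) / n   # second central moment (variance)
    m3 = sum((v - m) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - m) ** 4 for v in values) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0         # a normal distribution gives 0
    return skew, excess_kurtosis

# Made-up sample with one large positive value; not the Moscow data.
sample = [-1.8, -1.2, -0.9, -0.5, -0.3, 0.0, 0.2, 0.4, 0.9, 1.5, 2.6, 5.1]
skew, kurt = skew_and_excess_kurtosis(sample)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

Both numbers come out positive for this toy sample, mirroring the pattern described for the July temperatures: a longer right-hand tail and more weight in the extremes than a normal distribution would have.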

So the normal-distribution assumption is right out.

We’re really interested in the distribution of extreme values for July temperature. We can apply extreme value theory to approximate that distribution. Fundamental theorems in extreme value theory reveal that in well-behaved cases (which we expect) the distribution of extreme values will follow one of a small number of possible distributions. This is the extreme-value analogue of the central limit theorem.

In fact there are three possible limiting distributions for extreme values. Which one applies to a given situation depends on how the cumulative distribution function (cdf) approaches 1 as data values go to infinity. The cdf is the probability that a value $X$ will be less than or equal to a given value $x$:

$F(x) = P(X \le x)$.

As $x$ increases to infinity, the cdf must approach its limiting value of 1, because the probability of a data value being less than or equal to infinity is 1 (i.e., certainty).

A convenient way to study the asymptotic behavior of the cdf is by examining what’s called the survival function, which is just 1 minus the cdf:

$S(x) = 1 - F(x)$.

The distribution function asymptotically approaches one as the data value increases, so the survival function asymptotically approaches zero. It turns out that under most circumstances there are a limited number of ways to do that.
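The three classical limiting families are usually called Gumbel, Fréchet, and Weibull, and they are often folded into a single generalized extreme value (GEV) form with a shape parameter. The sketch below shows how the survival function behaves in each case. It is an illustration under my own assumptions: the function name `gev_survival`, the parameter names, and the example shape values are all hypothetical, and nothing here says which form or which parameters were actually fitted to the Moscow data.

```python
import math

def gev_survival(x, mu=0.0, sigma=1.0, xi=0.0):
    """Survival function 1 - F(x) of a GEV distribution.

    xi == 0 : Gumbel  (tail decays roughly exponentially)
    xi > 0  : Frechet (tail decays like a power law)
    xi < 0  : Weibull (tail reaches zero at a finite upper bound)
    """
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return -math.expm1(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) every value is
        # exceeded with certainty; above the upper bound (xi < 0) none is.
        return 1.0 if xi > 0 else 0.0
    return -math.expm1(-(t ** (-1.0 / xi)))

# Compare how fast the survival function falls for three example shapes.
for x in (2.0, 4.0, 8.0):
    row = [gev_survival(x, xi=shape) for shape in (0.0, 0.3, -0.3)]
    print(x, [f"{s:.2e}" for s in row])
```

With a positive shape the survival function stays far larger at high values than in the Gumbel case, while with a negative shape it drops to exactly zero beyond a finite bound. Those are the three ways of approaching zero that the limiting distributions distinguish.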

I fit an extreme-value distribution to these data, and being conservative I find that the recent July heat wave is hardly a one-and-a-half-in-100000-years event; it’s really only a 1-in-260-years event. So in terms of its extreme deviation from “normal” July temperature, it’s not the extraordinary event some have suggested.

One of the most interesting facts is that, if not for global warming, this would have been an extraordinary July. That’s because global warming has increased the mean July temperature in Moscow, so a given deviation above the mean corresponds to a hotter temperature. Without global warming, this once-in-a-century-or-two event would have been closer to a once-in-a-millennium event.

Estimating extreme value distributions with only a little more than a century of data is imprecise; with little data, we have even less extreme data on which to base a model. In spite of this limitation, we can get a decent rough idea of the likelihood of extreme events. And the bottom line is that every degree Celsius increase in mean July temperature in Moscow roughly doubles the chances of any given extreme heat wave. In fact Moscow temperature has increased as much as 3 °C since the early 20th century, and according to the extreme-value approximation model I computed, this makes a given extreme 8 times more likely than before.
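The arithmetic behind that factor of 8, and what a return period means over a human lifetime, can be sketched as follows. This takes the post's round numbers at face value (a doubling per degree, a 260-year return period) and adds assumptions of my own: an 80-year lifetime, independence between years, and hypothetical amounts of further warming. It is a back-of-the-envelope illustration, not a reproduction of the fitted model.

```python
def likelihood_factor(warming_deg_c: float, factor_per_degree: float = 2.0) -> float:
    """Multiplier on the chance of a given extreme after some warming,
    using the rule of thumb of roughly one doubling per degree Celsius."""
    return factor_per_degree ** warming_deg_c

def chance_at_least_once(return_period_years: float, span_years: int) -> float:
    """Probability of at least one event in a span, assuming independent years
    with a constant annual probability of 1 / return_period_years."""
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** span_years

factor = likelihood_factor(3.0)
print(f"3 degrees of warming -> {factor:.0f} times more likely")

# Hypothetical further warming, starting from the 260-year return period.
for extra_warming in (0.0, 1.0, 2.0):
    period = 260.0 / likelihood_factor(extra_warming)
    chance = chance_at_least_once(period, 80)   # 80-year lifetime is an assumption
    print(f"+{extra_warming:.0f} deg: return period {period:.0f} yr, "
          f"chance in 80 yr = {chance:.0%}")
```

Three doublings give 2 × 2 × 2 = 8. Each further degree halves the return period under this rule of thumb, so the chance of living through such a July climbs quickly.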

Without global warming, Moscow’s July 2010 would have been one for the history books. As global warming drives average temperatures even higher, present citizens of Moscow are likely to see multiple such events in a single lifetime. Which is scary.