There have been a lot of people suggesting that Harvey the Hurricane shows that “really and truly climate change is happening, see, in-your-face deniers!”

Of course, it’s possible that it does, even though the actual evidence (including the 12-year drought in major hurricanes making landfall in the U.S.) is against it. But hurricanes are a perfect opportunity for stupid math tricks. Hurricanes also provide great opportunities to explain concepts that are unclear to people. So, let’s consider the concept of a “500-year flood.”

Most people hear this and think it means “one flood this size in 500 years.” The real definition is subtly different: saying “a 500-year flood” actually means “there is one chance in 500 of a flood this size happening in any year.”

It’s called a “500-year flood” because statistically, over a long enough time, we would expect to have roughly one such flood on average every 500 years. So, if we had 100,000 years of weather data (and things stayed the same otherwise, which is an unrealistic assumption) then we’d expect to have seen 100,000/500, or 200, floods [Ed. typo fixed] at that level.

The trouble is, we’ve only got about 100 years of good weather data for the Houston area.
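To put rough numbers on how little 100 years can tell us, here is a minimal sketch of the arithmetic. It assumes every year is an independent draw with the same 1-in-500 chance, which is the simplification built into the definition; the function names are made up for illustration.

```python
RETURN_PERIOD_YEARS = 500  # a "500-year flood": one chance in 500 each year


def expected_floods(years, return_period=RETURN_PERIOD_YEARS):
    """Average number of such floods expected over a span of `years`."""
    return years / return_period


def chance_of_at_least_one(years, return_period=RETURN_PERIOD_YEARS):
    """Chance of one or more such floods in `years` independent years."""
    chance_of_none = (1 - 1 / return_period) ** years
    return 1 - chance_of_none


if __name__ == "__main__":
    print(expected_floods(100_000))               # the 100,000-year thought experiment
    print(expected_floods(100))                   # the record we actually have
    print(round(chance_of_at_least_one(100), 3))  # chance that record contains one
```

Under those assumptions, a 100-year record is expected to contain 0.2 such floods, and there is only about an 18 percent chance it contains even one. That is why the 1-in-500 level has to come from a model rather than from counting.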

The estimate that this scale flood happens with a one in 500 chance, then, is made using a statistical model of how often a flood happens. Using the data from Corpus Christi for illustration (because they have data for every year from 1895 to 2000), you get a distribution like this:

The blue bars represent real measurements in inches of precipitation in a year; the blue line is the smoothed approximate distribution you get from the data.

This is an example of the distribution you get if you simulate rainfall amounts, assuming it’s a normal distribution. Notice it doesn’t look like a bell curve either — it’s only 105 samples. The more samples we take, the more it would look like a true normal distribution.
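If you want to try this kind of simulation yourself, here is a minimal sketch. The mean and standard deviation below are placeholders I picked for illustration, not the Corpus Christi figures, and the helper names are hypothetical.

```python
import random
from collections import Counter

# Placeholder parameters for illustration only; NOT fitted to any real data.
HYPOTHETICAL_MEAN_IN = 30.0
HYPOTHETICAL_STDEV_IN = 8.0
N_YEARS = 105  # same sample size the article mentions


def simulate_rainfall(n=N_YEARS, seed=0):
    """Draw `n` simulated annual rainfall totals from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(HYPOTHETICAL_MEAN_IN, HYPOTHETICAL_STDEV_IN) for _ in range(n)]


def text_histogram(samples, bin_width=5):
    """Render a crude bar chart: one '#' per simulated year in each bin."""
    counts = Counter(int(x // bin_width) * bin_width for x in samples)
    lines = []
    for start in sorted(counts):
        lines.append(f"{start:>4} to {start + bin_width:<4} {'#' * counts[start]}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_histogram(simulate_rainfall()))
```

Change the seed a few times and the bars shift around noticeably, even though the underlying distribution never changes.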

And this is the computed normal distribution. When they estimate what a 500-year flood is like, they assume some distribution — my guess is a normal distribution because, as I said, everything has a normal distribution and from the data and the simulation runs, it looks like a good assumption.

In all of these graphs, the horizontal axis is the amount of rain, and the vertical axis for the bar chart shows how many years had rainfall in each range. For the smooth curve, the vertical axis is then interpreted as probability: the area under the curve to the right of any amount is the chance of getting at least that much rain in a year. So, to estimate what a 500-year flood will be, you find the point on the horizontal axis where the area under the curve to its right shrinks to 1 in 500; that point gives how much rain a one in 500 flood is.
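Reading the 1-in-500 level off a normal curve amounts to asking which amount of rain is exceeded with probability 1/500. Here is a minimal sketch using Python's standard `statistics.NormalDist`; the mean and standard deviation are placeholders, not real rainfall figures, and `return_level` is a name I made up.

```python
from statistics import NormalDist

# Placeholder parameters for illustration only; NOT fitted to any real data.
HYPOTHETICAL_MEAN_IN = 30.0
HYPOTHETICAL_STDEV_IN = 8.0


def return_level(mean, stdev, return_period=500):
    """Amount exceeded with probability 1/return_period under a normal model."""
    model = NormalDist(mu=mean, sigma=stdev)
    # inv_cdf gives the value with the requested probability to its LEFT,
    # so we ask for everything except the 1-in-N right-hand tail.
    return model.inv_cdf(1 - 1 / return_period)


if __name__ == "__main__":
    level = return_level(HYPOTHETICAL_MEAN_IN, HYPOTHETICAL_STDEV_IN)
    print(f"1-in-500 level under these assumptions: {level:.1f} inches")
```

Whatever parameters you plug in, the 1-in-500 level of a normal model sits about 2.88 standard deviations above the mean, so the answer depends entirely on the two numbers you feed it.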

Here’s the trick, though — this distribution is still just a model. We look at the data we have, estimate two parameters — the mean and the variance, the “width” of the distribution — and then estimate the probabilities from that model. But the model can be wrong. So the whole estimate of what a one in 500 flood is might also be wrong. (This is leading meteorologists, hydrologists, and so forth to question the utility of even making 500- or 1000-year flood estimates.)
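Even if the normal assumption happened to be exactly right, the two fitted parameters are still only estimates from about a hundred numbers. The sketch below, under assumptions of my own (a made-up "true" distribution and hypothetical helper names), fits a normal model to several independent simulated 105-year records and computes the 1-in-500 level from each.

```python
import random
from statistics import NormalDist

# A made-up "true" climate for the experiment; NOT real rainfall figures.
TRUE_MODEL = NormalDist(mu=30.0, sigma=8.0)
N_YEARS = 105


def fitted_return_level(samples, return_period=500):
    """Fit mean and standard deviation to the samples, then read off the level."""
    fitted = NormalDist.from_samples(samples)
    return fitted.inv_cdf(1 - 1 / return_period)


def repeated_estimates(n_records=10, seed=0):
    """Estimate the 1-in-500 level from several independent simulated records."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_records):
        record = [rng.gauss(TRUE_MODEL.mean, TRUE_MODEL.stdev) for _ in range(N_YEARS)]
        estimates.append(fitted_return_level(record))
    return estimates


if __name__ == "__main__":
    for estimate in repeated_estimates():
        print(f"{estimate:.1f} inches")
```

Each record comes from the same distribution, yet each gives a somewhat different "500-year" level. A wrong choice of distribution would add further error on top of that spread.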

What do we do if the model is wrong? Eventually, we would want to change the model. This is a topic for a whole article in itself; in any case, the estimates of what a 100-, 500-, and 1000-year flood would be aren’t changed with every big rainfall.

But just for the moment, let’s assume that the estimated distribution is right: this really represents a true 500-year flood. We don’t have much evidence against it; remember we’ve only got about 100 years of actual data. So, what are the chances that we’ll have another 500-year storm next year?

Most people would say it has to be less likely: we’ve already had our 500-year storm. This is known as the Monte Carlo fallacy (or gambler’s fallacy, but “Monte Carlo” just seems so much classier). Consider flipping a coin, the favorite example in probability and statistics since time immemorial. Flip your coin three times, and say it comes up heads all three times. What are the chances it will come up heads a fourth time? Exactly 50/50, assuming a fair coin. The previous coin-flips don’t matter; coins have no memory. In the same way, to a first approximation, the weather has no memory from year to year either, so the chance of another 500-year storm next year is still one in 500.
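If the coin argument feels too slick, you can check it by brute force. This small sketch assumes a fair coin, so every sequence of flips is equally likely, and the function name is mine: list all the sequences, keep only the ones that start with a streak of heads, and see how often the next flip is heads.

```python
from fractions import Fraction
from itertools import product


def chance_of_heads_after_streak(streak_length=3):
    """Chance the next flip is heads, given `streak_length` heads in a row."""
    # Every sequence of streak_length + 1 fair flips is equally likely.
    sequences = list(product("HT", repeat=streak_length + 1))
    # Keep only the sequences that begin with the streak of heads.
    after_streak = [s for s in sequences if all(f == "H" for f in s[:streak_length])]
    # Of those, count the ones where the final flip is also heads.
    heads_next = [s for s in after_streak if s[-1] == "H"]
    return Fraction(len(heads_next), len(after_streak))


if __name__ == "__main__":
    print(chance_of_heads_after_streak(3))
```

Only two of the 16 sequences start with three heads, HHHH and HHHT, and one of the two ends in heads, so the answer is one in two no matter how long the streak.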

This has another surprise hiding in it though: it’s natural to think that, since we had a 500-year event, it should be roughly 500 years to the next event, but this is another example of the Monte Carlo fallacy. To see this, let’s go back to flipping our coin again. We’re just going to flip it four times, because there are 16 possibilities and I don’t want to consume too much space. Remember, we are looking for a case where the heads are evenly spaced, which is going to be HTHT or THTH.

TTTT HHHH

TTTH HHHT

TTHT HHTH

TTHH HHTT

THTT HTHH

THTH HTHT

THHT HTTH

THHH HTTT

Out of the 16 possible sequences of four coin flips, only two have the heads evenly spaced: two out of 16, or one in eight. The chance of getting any other sequence, in other words one with the heads and tails irregularly spaced, is seven in eight. Every individual sequence is equally likely, at one in 16; even spacing is improbable simply because so few sequences have it. This applies to 500-year storms as well: having 500-year storms exactly evenly spaced, or even roughly evenly spaced, is an unlikely case, not the expected one.
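Both halves of that argument can be checked with a few lines of code. The sketch below counts the evenly spaced coin sequences, then does the analogous calculation for floods, assuming each year is an independent 1-in-500 draw. The function names are hypothetical, and "evenly spaced" is taken to mean strictly alternating, as in HTHT and THTH.

```python
from fractions import Fraction
from itertools import product


def is_evenly_spaced(flips):
    """True when heads and tails strictly alternate, e.g. HTHT or THTH."""
    return all(a != b for a, b in zip(flips, flips[1:]))


def chance_evenly_spaced(n_flips=4):
    """Fraction of all equally likely sequences that strictly alternate."""
    sequences = ["".join(s) for s in product("HT", repeat=n_flips)]
    even = [s for s in sequences if is_evenly_spaced(s)]
    return Fraction(len(even), len(sequences))


def chance_gap_between(first_year, last_year, return_period=500):
    """Chance the next flood arrives first_year..last_year years from now, inclusive.

    Assumes independent years, each with a 1/return_period chance of a flood.
    """
    chance_no_flood = 1 - 1 / return_period
    # (no flood before first_year) minus (no flood through last_year)
    return chance_no_flood ** (first_year - 1) - chance_no_flood ** last_year


if __name__ == "__main__":
    print(chance_evenly_spaced(4))                  # the coin case
    print(round(chance_gap_between(450, 550), 3))   # next flood roughly "on schedule"
    print(round(chance_gap_between(1, 100), 3))     # next flood within a century
```

Under these assumptions, the next 500-year flood is more than twice as likely to arrive within the first 100 years (about 18 percent) as within 50 years either side of the 500-year mark (about 7 percent).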

The lesson here — remember there’s a quiz on Friday — is that a big storm, in itself, is evidence of nothing more than the fact that there’s been a big storm. Not God’s wrath, not “instant karma,” not climate change. People who say otherwise are trying to sell you something.