Crime rates in big cities (where the concentration of gasoline emissions is high) peaked about 20 years after lead was banned from gasoline, according to an econometric study by Rick Nevin. The 20-year lag is the time elapsed between lead exposure at birth and becoming a 20-year-old criminal.

At least, that's the argument proposed by some well-known econometricians, based on an analysis of crime rates over time in large cities. In my opinion, this is another example of a study done with the wrong kind of experimental design, where statistical science is abused or misused by people who claim to be experts.

You can read the article here.

So how would you fix this study?

Here's my solution:

Get a well-balanced sample of 10,000 people over a 30-year period across all cities, split the sample into two subsets (criminals vs. non-criminals), and check (using an odds ratio) whether criminals are more likely than non-criminals to have been exposed to lead at birth. In short, do the opposite of what Rick Nevin did: look at individuals rather than cities, that is, look at the micro level rather than the macro level, and perform a classic test of hypothesis using standard sampling and proper design of experiment (DOE) procedures.
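As a rough illustration of the odds-ratio check described above, here is a minimal sketch. The function name `odds_ratio` and the four counts are hypothetical, invented purely for illustration (they are not data from any study), and the interval is the standard Wald interval on the log odds ratio, which is only one of several reasonable choices.

```python
import math

def odds_ratio(exposed_criminals, unexposed_criminals,
               exposed_others, unexposed_others, z=1.96):
    """Odds ratio and approximate 95% Wald interval from a 2x2 table."""
    a, b = exposed_criminals, unexposed_criminals
    c, d = exposed_others, unexposed_others
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(odds ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

# Hypothetical counts for a sample of 10,000 people (made up for illustration):
# 1,000 criminals (300 exposed to lead at birth), 9,000 non-criminals (1,800 exposed).
estimate, lower, upper = odds_ratio(300, 700, 1800, 7200)
print(f"odds ratio = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 would suggest an association at the individual level. It would still not prove causation on its own, since confounding factors have to be handled by the sampling design.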

Alternatively, if you really want to work on the original macro-level time series (assuming you have monthly granularity), then perform a Granger causality test: it will take into account all cross-correlation residuals after transforming the original time series into white noise (similar to spectral analysis of time series, or correlogram analysis). However, if you have thousands of metrics (and thus thousands of time series, and thus tens of millions of correlations), you WILL eventually find a very high correlation that is purely accidental. This is known as the curse of big data, and I will publish a note on this (with results based on simulations).
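The accidental-correlation effect is easy to reproduce. The sketch below is my own toy simulation, not the one promised above: it generates independent white-noise series (so every true correlation is zero) and reports the largest absolute correlation found among all pairs. The function names, the series length of 20, and the series counts are arbitrary assumptions.

```python
import math
import random

def standardize(xs):
    """Center a series and scale it to unit norm, so that the dot product
    of two standardized series is their Pearson correlation."""
    mean = sum(xs) / len(xs)
    centered = [x - mean for x in xs]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]

def max_abs_correlation(n_series, length, seed=0):
    """Largest |correlation| among all pairs of independent noise series."""
    rng = random.Random(seed)
    series = [standardize([rng.gauss(0.0, 1.0) for _ in range(length)])
              for _ in range(n_series)]
    best = 0.0
    for i in range(n_series):
        for j in range(i + 1, n_series):
            r = sum(x * y for x, y in zip(series[i], series[j]))
            best = max(best, abs(r))
    return best

few = max_abs_correlation(n_series=5, length=20)     # 10 pairs
many = max_abs_correlation(n_series=300, length=20)  # 44,850 pairs
print(f"max |r| among 5 series:   {few:.2f}")
print(f"max |r| among 300 series: {many:.2f}")
```

None of these series are related, yet the best pair among 300 series looks strongly correlated. The more metrics you compare, the more impressive the best accidental match becomes.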

Correlation is not causation. Don't claim causation unless you can prove it. Often, multiple interdependent factors contribute to a problem. Maybe the peak in crime happened when baby boomers (a less law-abiding generation) reached age 20. This is a more credible cause, in my opinion.
