I’m using the term “sleeper” here for a theorem that is far more important than it seems, something that you may not appreciate for years after you first see it.

Bayes’ theorem

The first such theorem that comes to mind is Bayes’ theorem. I remember being unsettled by this theorem when I took my first probability course. I found it easy to prove but hard to understand. I couldn’t decide whether it was trivial or profound. Then years later I found myself using Bayes’ theorem routinely.

The key insight of Bayes’ theorem is that it gives you a way to turn probabilities around. That is, it lets you compute the probability of A given B from the probability of B given A, together with the unconditional probabilities of A and B. That may not seem so important, but it’s vital in applications. It’s often easy to compute the probability of data given a hypothesis, but what we need to know is the probability of a hypothesis given data. Those unfamiliar with Bayes’ theorem often get probabilities backward.
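
Here is a minimal sketch of what “turning probabilities around” looks like in practice. The diagnostic-test scenario, the function name posterior, and all the numbers are hypothetical, chosen only to show how different P(data | hypothesis) and P(hypothesis | data) can be.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | D) given P(H), P(D | H), and P(D | not H), via Bayes' theorem."""
    # Total probability of the data, summed over both hypotheses
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers, for illustration only
p_disease = 0.01            # P(disease): prior
p_pos_given_disease = 0.95  # P(positive | disease): easy to measure
p_pos_given_healthy = 0.05  # P(positive | no disease): false positive rate

p_disease_given_pos = posterior(p_disease, p_pos_given_disease, p_pos_given_healthy)

print(f"P(positive | disease) = {p_pos_given_disease:.2f}")
print(f"P(disease | positive) = {p_disease_given_pos:.2f}")
```

With these made-up inputs the two conditional probabilities are far apart, which is exactly the mistake people make when they read one as the other.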

Jensen’s inequality

Another sleeper theorem is Jensen’s inequality: If φ is a convex function and X is a random variable, φ( E(X) ) ≤ E( φ(X) ). In words, φ at the expected value of X is no greater than the expected value of φ of X. Like Bayes’ theorem, it’s a way of turning things around: it relates applying φ and then averaging to averaging and then applying φ. If the convex function φ represents your gain from some investment, Jensen’s inequality says that randomness is good for you; variability in X is to your advantage on average. But if φ is concave, the inequality reverses and variability works against you.
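
A small numerical check may make the inequality concrete. This is only a sketch: the two-point distribution, the helper expect, and the choices φ(x) = x² (convex) and φ(x) = √x (concave) are assumptions picked for illustration.

```python
import math


def expect(f, outcomes, probs):
    """Expected value of f(X) for a discrete random variable X."""
    return sum(p * f(x) for x, p in zip(outcomes, probs))


# Hypothetical distribution: X is 0 or 2 with equal probability, so E(X) = 1
outcomes = [0.0, 2.0]
probs = [0.5, 0.5]
mean = expect(lambda x: x, outcomes, probs)


def square(x):
    return x * x  # convex


# Convex case: phi(E(X)) <= E(phi(X)), so variability helps on average
convex_at_mean = square(mean)
mean_of_convex = expect(square, outcomes, probs)

# Concave case: the inequality reverses, so variability hurts on average
concave_at_mean = math.sqrt(mean)
mean_of_concave = expect(math.sqrt, outcomes, probs)

print("convex: ", convex_at_mean, "<=", mean_of_convex)
print("concave:", concave_at_mean, ">=", mean_of_concave)
```

Replacing X by its average and then applying φ understates the expected payoff when φ is convex and overstates it when φ is concave.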

Sam Savage’s book The Flaw of Averages is all about the difference between φ( E(X) ) and E( φ(X) ). When φ is linear, they’re equal. But in general they’re different, and there’s not much you can say about how the two are related. However, when φ is convex or concave, you can say which one is larger.

I’ve just started reading Nassim Taleb’s new book Antifragile, and it seems to be an extended meditation on Jensen’s inequality. Systems with concave returns are fragile; they are harmed by variability. Systems with convex returns are antifragile; they benefit from variability.

Other examples

What are some more examples of sleeper theorems?
