Nate Silver, baseball statistician turned political analyst, gained a lot of attention during the 2012 United States elections when he successfully predicted the outcome of the presidential vote in all 50 states. The reason for his success was a statistical method called Bayesian inference, a powerful technique that builds on prior knowledge to estimate the probability of a given event happening.

Bayesian inference grew out of Bayes' theorem, a mathematical result from English clergyman Thomas Bayes, published in 1763, two years after his death. In honor of the 250th anniversary of that publication, Bradley Efron examined the question of why Bayes' theorem is not more widely used, and why its use remains controversial among many scientists and statisticians. As he pointed out, the problem lies in blind use of the theorem, in cases where prior knowledge is unavailable or unreliable.

As is often the case, the theorem ascribed to Bayes predates him, and Bayesian inference is more general than what the good reverend worked out in his spare time. However, Bayes' posthumous paper was an important step in the development of probability theory, and so we'll stick with using his name.

Bayes' theorem in essence states that the probability of a given hypothesis depends both on the current data and prior knowledge. In the case of the 2012 United States election, Silver used successive polls from various sources as priors to refine his probability estimates. (In other words, saying he "predicted" the outcome of the election is slightly misleading: he calculated which candidate was most likely to win in each state based on the polling data.) In other cases, priors could be the outcome of earlier experiments or even educated assumptions drawn from experience. The wise statistician or scientist constructs priors that are informative, but that isn't always easy to do.
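Bayes' theorem itself is compact: the posterior probability of a hypothesis H given data D is P(H|D) = P(D|H)P(H)/P(D). Here is a minimal sketch of that update in Python, with invented numbers; the prior and likelihoods below are made up for illustration and have nothing to do with Silver's actual model:

```python
# Bayes' theorem: P(H|D) = P(D|H) * P(H) / P(D)
# Toy illustration with invented numbers -- not Nate Silver's model.

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return the posterior probability of a hypothesis given new data."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence

# Prior: suppose earlier polls give candidate A a 60% chance in some state.
prior = 0.60
# A new poll shows A ahead. Assume such a result is 80% likely if A really
# leads, but only 30% likely if A actually trails.
posterior = bayes_update(prior, 0.80, 0.30)
print(round(posterior, 3))  # 0.8
```

Each new poll's posterior can then serve as the prior for the next one, which is the sense in which successive polls refine the estimate.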

Partly for that reason, many who work with statistics and probability reject the use of priors altogether. Stats geeks refer to this as the "frequentist" approach, but if you learned it in a formal setting, you probably just called it "statistics." Frequentist reasoning predicts the outcome of many repetitions of the same test, providing an estimate of how frequently a particular result will show up. That's arguably a more objective approach, since it sidesteps the problem Bayesians face when there is no obvious prior. However, as Efron pointed out, when reasonable prior data exist, especially from disparate sources such as Nate Silver's poll aggregation, the Bayesian method performs better than the frequentist approach.
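The frequentist notion of probability as long-run frequency is easy to see in a simulation. This toy coin-flip model (unrelated to any election data) shows the estimated frequency settling toward the true probability as the number of trials grows:

```python
import random

random.seed(42)  # fixed seed so the simulation is repeatable

# Frequentist view: a probability is the long-run frequency of an outcome
# over many repetitions of the same test.
def estimated_frequency(p_true, n_trials):
    """Estimate an event's probability by simulating n_trials repetitions."""
    hits = sum(random.random() < p_true for _ in range(n_trials))
    return hits / n_trials

# With more repetitions, the estimate closes in on the true value (0.5).
for n in (10, 1_000, 100_000):
    print(n, estimated_frequency(0.5, n))
```

No prior enters anywhere: the estimate comes entirely from the repeated trials themselves, which is exactly the objectivity frequentists prize.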

While zealots exist in both the Bayesian and frequentist camps, scientists are often pragmatic, picking a method based on the particular problem at hand. Efron, himself a sophisticated Bayesian, admitted that he uses pure Bayesian methods only when the data allows him to, and he does some frequentist double-checking when priors aren't available.

A third approach, called "empirical Bayes," can be used when the data set is large enough to act as a sort of prior in its own right. With enough experimental outcomes in a single study, the set as a whole can stand in for prior knowledge when testing a hypothesis about any one data point, so empirical Bayes can estimate the probability of a given outcome within the set.
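A standard empirical Bayes move is shrinkage: estimate the prior from the pooled data, then pull each individual estimate toward it. A toy sketch, fittingly using batting averages; the players, counts, and pseudo-count strength below are all invented for illustration:

```python
# Empirical Bayes sketch: the data set supplies its own prior.
# Toy example: early-season batting rates, shrunk toward the pooled rate.

players = {"A": (9, 20), "B": (30, 100), "C": (2, 4)}  # hits, at-bats

total_hits = sum(h for h, _ in players.values())
total_ab = sum(ab for _, ab in players.values())
league_rate = total_hits / total_ab  # "prior" estimated from the whole set

PSEUDO_AB = 50  # prior strength; hand-picked here, not fitted to the data

def shrunk_estimate(hits, at_bats):
    """Blend a player's raw rate with the pooled rate (beta-binomial style)."""
    return (hits + league_rate * PSEUDO_AB) / (at_bats + PSEUDO_AB)

for name, (h, ab) in players.items():
    print(name, round(h / ab, 3), "->", round(shrunk_estimate(h, ab), 3))
```

Player C's raw 0.500 average rests on only 4 at-bats, so it gets pulled hard toward the pooled rate, while player B's 100 at-bats barely move: the full data set is doing the work a conventional prior would.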

As Efron wrote, Bayes' theorem is "controversial," but not because the equation itself is in doubt. (It's a mathematical theorem, after all; it even has an interpretation in frequentist thinking, albeit one that doesn't make reference to prior knowledge.) Rather, its use is sometimes controversial, especially in light of unsophisticated application of poorly chosen priors. Nevertheless, the real lesson is that of the sharp kitchen knife that can cut you as well as the vegetables you're chopping: use the blade of Bayes poorly, and you'll regret it. Use it wisely, and it will serve you better than the dull but reliable knife.

Science, 2013. DOI: 10.1126/science.1236536