In the previous post, I discussed the philosophical differences between frequentism and Bayesianism, and showed that, despite these differences, they often give the same result for simple problems. Here we've explored one category of problem where the results begin to diverge: accounting for nuisance parameters.

For Bayes' billiard ball example, we showed that a naïve frequentist approach leads to the wrong answer, while a naïve Bayesian approach leads to the correct answer. This doesn't mean frequentism is wrong, but it does mean we must be very careful when applying it.
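As a reminder of how far apart the two naïve answers land, here is a minimal sketch of the arithmetic. It assumes the game setup from the example: the first player to six points wins, Alice leads 5 to 3, and `p` is the unknown probability that Alice wins any given point, with a flat prior on `p`. The variable names are mine, chosen for illustration.

```python
from scipy.special import beta

# Assumed setup: Alice has 5 points, Bob has 3, first to 6 wins,
# so Bob must win the next 3 points in a row.
alice, bob, bob_needs = 5, 3, 3

# Naive frequentist: plug in the maximum-likelihood estimate of p.
p_hat = alice / (alice + bob)
naive_freq = (1 - p_hat) ** bob_needs

# Naive Bayesian: marginalize over p with a flat prior.
# The posterior is proportional to p^5 (1 - p)^3, so
# P(Bob wins) = E[(1 - p)^3] = B(6, 7) / B(6, 4).
bayes = beta(alice + 1, bob + bob_needs + 1) / beta(alice + 1, bob + 1)

print(f"naive frequentist: {naive_freq:.3f}")  # 27/512, about 0.053
print(f"naive Bayesian:    {bayes:.3f}")       # 1/11, about 0.091
```

The gap comes entirely from how the nuisance parameter `p` is handled: the first calculation fixes it at a point estimate, while the second averages over everything the data allow.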

For the linear regression example, we showed one possible approach from both frequentism and Bayesianism for accounting for outliers in our data. Using a robust frequentist cost function is relatively fast and painless, but is dubiously motivated and leads to results which are difficult to interpret. Using a Bayesian mixture model takes more effort and requires more intensive computation, but leads to a very nice result in which multiple questions can be answered at once: in this case, marginalizing one way to find the best-fit model, and marginalizing another way to identify outliers in the data.
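To make the "fast and painless" claim concrete, here is a minimal sketch of a robust frequentist fit. It assumes the Huber loss as the robust cost function (a common choice, though other robust losses would work too), and it uses synthetic data with three injected outliers. The data, the threshold `c=3.0`, and the function names are illustrative assumptions, not the values used earlier in this post.

```python
import numpy as np
from scipy import optimize

def huber_loss(t, c=3.0):
    """Quadratic for |t| <= c, linear beyond, so outliers are down-weighted."""
    t = np.abs(t)
    return np.where(t <= c, 0.5 * t ** 2, c * (t - 0.5 * c))

def total_huber_loss(theta, x, y, e, c=3.0):
    """Summed Huber loss of the scaled residuals for y = theta[0] + theta[1] * x."""
    return huber_loss((y - theta[0] - theta[1] * x) / e, c).sum()

# Synthetic data (an assumption for illustration): a straight line
# with Gaussian noise and three gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
e = np.ones_like(x)
y = 2.0 + 0.5 * x + rng.normal(0, e)
y[[3, 12, 25]] += 15.0

# Ordinary least squares, used as the starting point for the robust fit.
theta_ls = np.polyfit(x, y, 1)[::-1]  # reorder to [intercept, slope]

# Robust fit: minimize the summed Huber loss instead of squared error.
theta_huber = optimize.minimize(total_huber_loss, theta_ls, args=(x, y, e)).x

print("least squares [intercept, slope]:", theta_ls)
print("Huber         [intercept, slope]:", theta_huber)
```

The fit is a single optimizer call, which is the appeal. The cost is that the threshold `c` is chosen by hand, and the result says nothing directly about which points are outliers, which is what the mixture model's second marginalization provides.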

So which is better, frequentism or Bayesianism?

The answer probably depends on your level of expertise in frequentist and Bayesian methods, as well as the size of your problem and your available computational resources. I, like many with a physics background, tend to lean toward Bayesian methods, partly because they appeal to my desire to be able to derive anything from fundamental principles. With Bayesianism, based on algebraic manipulation of a few probability axioms, we can construct extremely flexible methods to address a wide variety of problems. Just as the conservation of mass-energy can be applied to everything from projectile motion to stellar structure, Bayes' rule and the Bayesian probability interpretation can be applied to solve virtually any statistical problem: from computing gambling odds to detecting exoplanet transits in noisy photometric data. In a Bayesian paradigm, you need not spend years memorizing and understanding obscure frequentist techniques and jargon (p-values, null tests, confidence intervals, breakdown points, etc.) in order to do such nontrivial analyses. I think it's a common misconception that Bayesian analysis is hard. On the contrary, many scientists use it precisely because it's easy!

Ease and aesthetics aside, though, there's a further important reason that I sit firmly in the Bayesian camp, and that has to do with the interpretation of frequentist confidence intervals and Bayesian credible regions within the context of scientific data. As this post is already way too long, that discussion will have to wait for next time.

This post was written entirely in the IPython notebook. You can download this notebook, or see a static view on nbviewer.