In this blog post I will talk about:

How the Bayesian Revolution in many scientific disciplines is hindered by poor usability of current Probabilistic Programming languages.

A gentle introduction to Bayesian linear regression and how it differs from the frequentist approach.

A preview of PyMC3 (currently in alpha) and its new GLM submodule I wrote to allow creation and estimation of Bayesian GLMs as easy as frequentist GLMs in R.

Ready? Lets get started!

There is a huge paradigm shift underway in many scientific disciplines: The Bayesian Revolution.

While the theoretical benefits of Bayesian over Frequentist stats have been discussed at length elsewhere (see Further Reading below), there is a major obstacle that hinders wider adoption -- usability (this is one of the reasons DARPA wrote out a huge grant to improve Probabilistic Programming).

This is mildly ironic because the beauty of Bayesian statistics is their generality. Frequentist stats have a bazillion different tests for every different scenario. In Bayesian land you define your model exactly as you think is appropriate and hit the Inference Button(TM) (i.e. running the magical MCMC sampling algorithm).

Yet when I ask my colleagues why they use frequentist stats (even though they would like to use Bayesian stats) the answer is that software packages like SPSS or R make it very easy to run all those individuals tests with a single command (and more often then not, they don't know the exact model and inference method being used).

While there are great Bayesian software packages like JAGS, BUGS, Stan and PyMC, they are written for Bayesians statisticians who know very well what model they want to build.

Unfortunately, "the vast majority of statistical analysis is not performed by statisticians" -- so what we really need are tools for scientists and not for statisticians.

In the interest of putting my code where my mouth is I wrote a submodule for the upcoming PyMC3 that makes construction of Bayesian Generalized Linear Models (GLMs) as easy as Frequentist ones in R.

Linear Regression

While future blog posts will explore more complex models, I will start here with the simplest GLM -- linear regression. In general, frequentists think about Linear Regression as follows:

\[ Y = X\beta + \epsilon \]

where \(Y\) is the output we want to predict (or dependent variable), \(X\) is our predictor (or independent variable), and \(\beta\) are the coefficients (or parameters) of the model we want to estimate. \(\epsilon\) is an error term which is assumed to be normally distributed.

We can then use Ordinary Least Squares or Maximum Likelihood to find the best fitting \(\beta\).

Probabilistic Reformulation

Bayesians take a probabilistic view of the world and express this model in terms of probability distributions. Our above linear regression can be rewritten to yield:

\[ Y \sim \mathcal{N}(X \beta, \sigma^2) \]

In words, we view \(Y\) as a random variable (or random vector) of which each element (data point) is distributed according to a Normal distribution. The mean of this normal distribution is provided by our linear predictor with variance \(\sigma^2\).

While this is essentially the same model, there are two critical advantages of Bayesian estimation: