The Bayesian approach allows for a coherent framework for dealing with uncertainty in machine learning. By integrating out parameters, Bayesian models do not suffer from overfitting, thus it is conceivable to consider models with infinite numbers of parameters, aka Bayesian nonparametric models. An example of such models is the Gaussian process, which is a distribution over functions used in regression and classification problems. Another example is the Dirichlet process, which is a distribution over distributions. Dirichlet processes are used in density estimation, clustering, and nonparametric relaxations of parametric models. It has been gaining popularity in both the statistics and machine learning communities, due to its computational tractability and modelling flexibility.

In the tutorial I shall introduce Dirichlet processes, and describe different representations of Dirichlet processes, including the Blackwell-MacQueen? urn scheme, Chinese restaurant processes, and the stick-breaking construction. I shall also go through various extensions of Dirichlet processes, and applications in machine learning, natural language processing, machine vision, computational biology and beyond.

In the practical course I shall describe inference algorithms for Dirichlet processes based on Markov chain Monte Carlo sampling, and we shall implement a Dirichlet process mixture model, hopefully applying it to discovering clusters of NIPS papers and authors.