I recently rediscovered these slides from a talk I gave back in 2007 and wanted to share them with you. For those of you who don’t know, Bayesian inference is a particular way to approach learning from data and statistical inference. It’s named after Thomas Bayes, an English mathematician who lived in the 18th century.
The main idea (and please be kind to me, I’m not a Bayesian) of Bayesian inference is to model your data and your expectations about the data using probability distributions. You write down a so-called generative model for the data, that is, what you expect the distribution of your data to be given the model parameters. Then, if you also specify your prior belief about the distribution of the parameters, you can derive an updated distribution over your parameters given the observed data.
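To make that update step concrete, here is a minimal sketch in Python. It is not taken from the slides; the coin example, the three candidate parameter values, and the function name grid_posterior are all assumptions I made up for illustration. It restricts the parameter to a handful of values so that Bayes' rule (posterior proportional to likelihood times prior) becomes a plain sum.

```python
def grid_posterior(prior, likelihood):
    """Bayes' rule on a discrete grid: posterior is proportional to likelihood * prior."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # normalizing constant, p(data)
    return [u / evidence for u in unnormalized]


# Hypothetical setup: three candidate values for a coin's probability of heads.
thetas = [0.25, 0.5, 0.75]
prior = [1 / 3, 1 / 3, 1 / 3]  # assumed prior: no candidate is preferred

# Assumed observations: 3 heads and 1 tail.
heads, tails = 3, 1

# Generative model: independent coin flips, so p(data | theta) = theta^heads * (1 - theta)^tails.
likelihood = [t**heads * (1 - t) ** tails for t in thetas]

posterior = grid_posterior(prior, likelihood)

for t, p in zip(thetas, posterior):
    print(f"theta = {t}: posterior probability {p:.3f}")
```

After seeing mostly heads, the probability mass shifts towards the larger values of theta, which is all the "updated distribution" amounts to in this toy case.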
Bayesian inference has been applied to the whole range of inference problems, ranging from classification to regression to clustering, and beyond. The main inference step sketched above involves integrating (as in f(x)dx ot
There is a very silly (at least IMHO) divide within the field of statistics between Frequentists and Bayesians, which I’ve discussed elsewhere.
In any case, the slides above discuss the very basics: Bayes’ rule, the role of the prior, the concept of conjugacy (combinations of model assumptions and priors which can be solved exactly, that is, without requiring numerical integration) and pseudo-counts, and a bit of discussion on the Frequentist vs. Bayesian divide.
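As a small illustration of conjugacy and pseudo-counts, here is a sketch of the textbook Beta prior with a coin-flip (Bernoulli) model. I am not claiming this is the example used in the slides; the numbers and the function names beta_update and beta_mean are made up for illustration. With this pairing, the posterior is again a Beta distribution, and the update is just adding the observed counts to the prior parameters, so no integration is needed.

```python
def beta_update(alpha, beta, heads, tails):
    """Conjugate update: Beta(alpha, beta) prior plus coin flips gives Beta(alpha + heads, beta + tails)."""
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(2, 2). Under the posterior mean, it behaves like having
# already seen 2 heads and 2 tails before any real data (the pseudo-counts).
alpha0, beta0 = 2.0, 2.0

# Assumed observations: 7 heads and 3 tails.
alpha_n, beta_n = beta_update(alpha0, beta0, heads=7, tails=3)

posterior_mean = beta_mean(alpha_n, beta_n)  # (2 + 7) / (2 + 2 + 7 + 3)
raw_frequency = 7 / (7 + 3)

print(f"posterior: Beta({alpha_n}, {beta_n})")
print(f"posterior mean {posterior_mean:.3f} vs. raw frequency {raw_frequency:.3f}")
```

The posterior mean sits between the prior mean (0.5) and the raw frequency of heads, and the pseudo-counts matter less and less as more real data comes in.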