Learn Bayes Nets!

It recently occurred to me that there are a lot of people in the rationalist community who want to deeply absorb intuitions about how Bayes’ theorem works and how to think with it in practice, who have not been specifically told that learning inference algorithms for Bayesian networks is one of the best ways forward.

Well, I’m telling you now.

Bayesian networks were the innovation which made probabilistic reasoning really practical and interesting for artificial intelligence, and none of the reasons for that are special to trying to squeeze intelligence into a computer. They’re also, more or less, a description of the way people have to think in order to do probabilistic reasoning in practice. There have been many innovations in probabilistic reasoning since Bayes nets, but those are arguably more about how to get good results on a computer and less about the fundamental conceptual issues you’ll learn the most from.

I would argue that the most important inference algorithm to learn about to get practical intuitions is belief propagation. There are others who would argue for Monte Carlo algorithms, like MCMC (Markov chain Monte Carlo). You may want to learn both, to form your own opinion. (Of course, there are many more algorithms beyond these which you may want to learn, both to gain more insights and to gain more connections so your knowledge of the field sticks.) Belief prop and MCMC are more or less the first two algorithms people thought of; there are a lot of newer developments, but they’re largely elaborations.
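For a taste of the Monte Carlo side, here is a minimal sketch of Gibbs sampling (one simple MCMC method) on a made-up three-node chain, Burglary -> Alarm -> Call. The network, the numbers, and every function name are my own assumptions for illustration, not anything from Pearl. The sampler estimates the probability of a burglary given a call by repeatedly resampling each unobserved variable from its distribution given its neighbours, and the result is compared against brute-force enumeration.

```python
import random

# Hypothetical network and numbers: Burglary -> Alarm -> Call.
# All variables are binary; index 1 means "true".
P_B = [0.99, 0.01]                          # P(B)
P_A_GIVEN_B = [[0.95, 0.05], [0.10, 0.90]]  # P(A | B), rows indexed by B
P_C_GIVEN_A = [[0.98, 0.02], [0.30, 0.70]]  # P(C | A), rows indexed by A


def sample_binary(weights, rng):
    """Draw 0 or 1 with probability proportional to the two weights."""
    p_one = weights[1] / (weights[0] + weights[1])
    return 1 if rng.random() < p_one else 0


def gibbs_estimate(c_obs, n_samples=100_000, burn_in=1_000, seed=0):
    """Estimate P(B=1 | C=c_obs) by Gibbs sampling over B and A."""
    rng = random.Random(seed)
    b, a = 0, 0  # arbitrary starting state
    hits = 0
    for step in range(burn_in + n_samples):
        # Resample B given its only neighbour A: P(b | a) is
        # proportional to P(b) * P(a | b).
        b = sample_binary([P_B[v] * P_A_GIVEN_B[v][a] for v in (0, 1)], rng)
        # Resample A given both neighbours: P(a | b, c) is
        # proportional to P(a | b) * P(c | a).
        a = sample_binary(
            [P_A_GIVEN_B[b][v] * P_C_GIVEN_A[v][c_obs] for v in (0, 1)], rng
        )
        if step >= burn_in:
            hits += b
    return hits / n_samples


def exact_posterior(c_obs):
    """P(B=1 | C=c_obs) by summing the joint over A (brute force)."""
    weight = [
        sum(P_B[b] * P_A_GIVEN_B[b][a] * P_C_GIVEN_A[a][c_obs] for a in (0, 1))
        for b in (0, 1)
    ]
    return weight[1] / (weight[0] + weight[1])


if __name__ == "__main__":
    print("Gibbs estimate:", gibbs_estimate(1))
    print("Exact answer:  ", exact_posterior(1))
```

Notice that each resampling step only looks at a variable’s immediate neighbours. That locality is something MCMC shares with belief propagation, even though one passes samples around and the other passes messages.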

Here is what I claim you can get out of it, through careful study:

  • Understanding how Bayesian networks define probability distributions, and how probabilities spread through the network via belief propagation, makes your understanding of Bayes’ theorem and probability theory in general much more “load-bearing”, so it’ll break under the strain if it isn’t solid (which is a good thing).

  • It’ll give you a useful fake model of what you’re doing when you’re thinking. Global Bayesian updates don’t just happen; they result from (something like) local updates which you have to spread across your web of beliefs, partly through conscious attention and cogitation.

  • It’s also a useful analogy for aspects of group epistemics, like avoiding double counting as messages pass through the social network.

  • The messages which pass between nodes in belief prop fall into two types: probabilities and likelihoods. This is a deep truth; it’s the “dimensional analysis” of probabilistic reasoning. Several cognitive biases can be seen as confusion between probabilities and likelihoods, most centrally base-rate neglect.

  • Understanding probability vs likelihood messages also gives a nice general understanding of the way “prior” and “posterior” are local ideas which only make sense with respect to a “frame of reference”.

  • Bayesian networks also lay the foundation for a formal understanding of causality, if that’s something you’re interested in.
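To make the probability-versus-likelihood point a little more concrete, here is a minimal sketch of belief propagation on a made-up three-node chain, Burglary -> Alarm -> Call, with the call observed. The network, the numbers, and the function names are all assumptions of mine for illustration; only the π/λ naming follows Pearl. The π messages are probabilities flowing from causes toward effects, the λ messages are likelihoods flowing from evidence back toward causes, and a node’s belief is their normalized product. The result is checked against brute-force enumeration of the joint distribution the network defines.

```python
from itertools import product

# Hypothetical network and numbers: Burglary -> Alarm -> Call.
# All variables are binary; index 1 means "true".
P_B = [0.99, 0.01]                          # P(B)
P_A_GIVEN_B = [[0.95, 0.05], [0.10, 0.90]]  # P(A | B), rows indexed by B
P_C_GIVEN_A = [[0.98, 0.02], [0.30, 0.70]]  # P(C | A), rows indexed by A


def normalize(v):
    total = sum(v)
    return [x / total for x in v]


def joint(b, a, c):
    """The network defines the joint as a product of local tables."""
    return P_B[b] * P_A_GIVEN_B[b][a] * P_C_GIVEN_A[a][c]


def propagate(c_obs):
    """Beliefs about B and A after observing C = c_obs."""
    # Lambda messages (likelihoods): from the evidence toward the causes.
    lam_a = [P_C_GIVEN_A[a][c_obs] for a in (0, 1)]
    lam_b = [
        sum(P_A_GIVEN_B[b][a] * lam_a[a] for a in (0, 1)) for b in (0, 1)
    ]
    # Pi messages (probabilities): from the causes toward the effects.
    pi_b = P_B
    pi_a = [
        sum(P_A_GIVEN_B[b][a] * pi_b[b] for b in (0, 1)) for a in (0, 1)
    ]
    # Belief at each node = normalized (pi * lambda).
    belief_b = normalize([pi_b[b] * lam_b[b] for b in (0, 1)])
    belief_a = normalize([pi_a[a] * lam_a[a] for a in (0, 1)])
    return belief_b, belief_a, lam_b


def brute_force(c_obs):
    """Same posteriors, computed by summing the full joint."""
    post_b, post_a = [0.0, 0.0], [0.0, 0.0]
    for b, a in product((0, 1), repeat=2):
        p = joint(b, a, c_obs)
        post_b[b] += p
        post_a[a] += p
    return normalize(post_b), normalize(post_a)


if __name__ == "__main__":
    belief_b, belief_a, lam_b = propagate(1)
    print("P(B | call), via messages:", belief_b)
    print("P(B | call), brute force: ", brute_force(1)[0])
    print("lambda message to B:      ", lam_b)
    print("lambda alone, normalized: ", normalize(lam_b))
```

With these made-up numbers, the λ message arriving at Burglary does not sum to one, which is a hint that it is a likelihood and not a probability. If you normalize it anyway and read it as your answer, you get roughly 0.92 for a burglary; multiplying by the π message (the prior) first gives roughly 0.11. That gap is base-rate neglect.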

I think the best thing to read, to get up to speed, is the first four chapters of Pearl’s Probabilistic Reasoning in Intelligent Systems. It’s the original source; Pearl didn’t invent everything, but he invented a lot, and he was the first to put it all together. There are better modern introductions for people who want to apply Bayesian networks in machine learning, but because Pearl was writing at a time when the use of probability theory was not widely accepted in artificial intelligence, he goes into the philosophy of the subject in a way newer sources don’t. I think this is good for the LessWrong audience.

It would be even better, of course, if someone were to write a sequence explaining everything from a more specifically LessWrong perspective, drawing out the implications I mentioned above. Alas, I don’t have that much time to spend on writing (which is to say, I have other higher-value things to do, in my current estimation).

One might also derive a more general lesson on the relevance of algorithms to rationality, and go read Artificial Intelligence: A Modern Approach as a rationality textbook.