# Anonymous comments on 12 interesting things I learned studying the discovery of nature’s laws

• I will give a potted history of Pearl’s discovery as I understand it.

In the late 70s/early 80s, people wanted to deal with uncertainty in logic-based AI. The obvious thing to use is probability, but a naive Bayesian update over a full joint distribution is exponentially expensive in the number of variables.

Pearl wanted to come up with a good data structure for doing computations over probability distributions in less-than-exponential time.

He introduced the idea of Bayesian networks in his paper “Reverend Bayes on Inference Engines”, where he represents factorized probability distributions using DAGs. Here the direction of the arrows is arbitrary, and many DAGs correspond to one probability distribution.
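To make the "less-than-exponential" point concrete, here is a minimal sketch, not Pearl's actual algorithm. It assumes the simplest possible DAG, a chain X1 → X2 → ... → Xn of binary variables, and the names `prior`, `step`, `brute_force_marginal` and `chain_marginal` are made up for illustration. Summing the full joint touches all 2^n assignments, while using the factorization only needs one small update per arrow.

```python
from itertools import product


def brute_force_marginal(prior, transitions):
    """P(X_n = 1), computed by summing the joint over all 2^n assignments."""
    n = len(transitions) + 1
    total = 0.0
    for xs in product((0, 1), repeat=n):
        p = prior[xs[0]]
        for i, t in enumerate(transitions):
            p *= t[xs[i]][xs[i + 1]]  # P(X_{i+1} = xs[i+1] | X_i = xs[i])
        if xs[-1] == 1:
            total += p
    return total


def chain_marginal(prior, transitions):
    """Same quantity, pushing the sum along the chain one variable at a time."""
    belief = list(prior)  # current marginal [P(X_i = 0), P(X_i = 1)]
    for t in transitions:
        belief = [
            belief[0] * t[0][0] + belief[1] * t[1][0],
            belief[0] * t[0][1] + belief[1] * t[1][1],
        ]
    return belief[1]


# Hypothetical numbers: step[a][b] = P(X_{i+1} = b | X_i = a)
prior = [0.7, 0.3]
step = [[0.9, 0.1], [0.2, 0.8]]
transitions = [step] * 9  # ten variables in total

print(brute_force_marginal(prior, transitions))
print(chain_marginal(prior, transitions))
```

The two functions agree up to floating-point error, but the first does work proportional to 2^n and the second proportional to n. General DAGs are messier than a chain, so treat this as the best case rather than the typical one.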

He was not thinking about causality at all; it was just a problem in data structures. The idea was that this would be used for the same sort of thing as an “expert system” or other logic-based AI systems, but taking into account uncertainty expressed probabilistically.

Later, people including Pearl noticed that you can, and often should, interpret the arrows as causal. This amounts to choosing one DAG from many. The fact that there are many possible DAGs is related to the fact that, absent additional assumptions about the world, there seem always to be multiple incompatible causal stories that explain the same observations. But if you pick one, you can start using it to see whether your causal question can be answered from observational data alone.
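A toy illustration of why picking a DAG matters, under assumptions of my own: two binary variables A and B with a made-up joint distribution. Both A → B and B → A factorize the same joint exactly, so observational data alone cannot tell them apart, yet they disagree about what happens when you set A by intervention.

```python
# Hypothetical joint distribution: joint[(a, b)] = P(A=a, B=b)
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def marginal(joint, index):
    """Marginal of the variable at position `index` (0 for A, 1 for B)."""
    out = [0.0, 0.0]
    for assignment, p in joint.items():
        out[assignment[index]] += p
    return out


p_a = marginal(joint, 0)
p_b = marginal(joint, 1)

# DAG 1: A -> B, factorized as P(A) P(B | A)
p_b_given_a = {a: [joint[(a, b)] / p_a[a] for b in (0, 1)] for a in (0, 1)}
# DAG 2: B -> A, factorized as P(B) P(A | B)
p_a_given_b = {b: [joint[(a, b)] / p_b[b] for a in (0, 1)] for b in (0, 1)}

# Both factorizations rebuild the same joint distribution.
joint_dag1 = {(a, b): p_a[a] * p_b_given_a[a][b] for a in (0, 1) for b in (0, 1)}
joint_dag2 = {(a, b): p_b[b] * p_a_given_b[b][a] for a in (0, 1) for b in (0, 1)}

# Intervening with do(A=1):
# under A -> B, B still responds to A through P(B | A);
# under B -> A, forcing A cuts the arrow into A, so B keeps its marginal.
p_b1_do_a1_dag1 = p_b_given_a[1][1]
p_b1_do_a1_dag2 = p_b[1]

print(p_b1_do_a1_dag1, p_b1_do_a1_dag2)
```

With these numbers the first story says forcing A=1 makes B=1 likely, and the second says it changes nothing about B. Which one is right is exactly the extra assumption the causal reading of the arrows supplies.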

Finally, he realized that the assumptions encoded in a DAG aren’t sufficient for fully general counterfactuals: in full generality you have to specify exactly what functional relationship goes along each edge of the graph.
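Here is a small sketch of why the DAG alone is not enough for counterfactuals. The setup is my own toy example: a binary treatment X, a binary outcome Y, and a hidden fair coin U, with two hypothetical mechanisms `model_a` and `model_b` that are both compatible with the graph X → Y. They give identical answers to every interventional question and still disagree about an individual counterfactual.

```python
def model_a(x, u):
    """Y simply copies the hidden noise; X has no effect on anyone."""
    return u


def model_b(x, u):
    """Y is X XOR U; X flips the outcome for every individual."""
    return x ^ u


def p_y1_do_x(f, x):
    """P(Y=1 | do(X=x)) when U is a fair coin independent of X."""
    return sum(0.5 * f(x, u) for u in (0, 1))


def counterfactual_y(f, x_obs, y_obs, x_new):
    """Possible values of Y had X been x_new, given what was observed."""
    # Abduction: keep the noise values consistent with the observation.
    consistent = [u for u in (0, 1) if f(x_obs, u) == y_obs]
    # Action and prediction: rerun the mechanism with X set to x_new.
    return {f(x_new, u) for u in consistent}


# Same interventional distributions under both mechanisms ...
print(p_y1_do_x(model_a, 0), p_y1_do_x(model_b, 0))
print(p_y1_do_x(model_a, 1), p_y1_do_x(model_b, 1))

# ... but different answers to "we saw X=1, Y=1; what if X had been 0?"
print(counterfactual_y(model_a, 1, 1, 0))
print(counterfactual_y(model_b, 1, 1, 0))
```

No experiment that only looks at population-level outcomes can separate the two mechanisms, which is why the functions themselves have to be part of the specification.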

As someone originally concerned with AI, not with problems in the natural sciences, Pearl is probably unusual. Pearl himself looks back on Sewall Wright, who was working in genetics, as his progenitor for coming up with path diagrams. If you are interested in this, you should also look at Don Rubin’s experience; his causal framework is isomorphic to Pearl’s. Rubin was a 100 percent classic statistician, motivated by looking at medical studies.

• I think another important part of Pearl’s journey was that during his transition from Bayesian networks to causal inference, he was very frustrated with the correlational turn in early 1900s statistics. Because causality is so philosophically fraught and often intractable, statisticians shifted to regressions and other acausal models. Pearl sees that as throwing out the baby (important causal questions and answers) with the bathwater (messy empirics and a lack of mathematical language for causality). That missing language is why he coined the do operator.

Pearl discusses this at length in The Book of Why, particularly the Chapter 2 sections on “Galton and the Abandoned Quest” and “Pearson: The Wrath of the Zealot.” My guess is that Pearl’s frustration with statisticians’ focus on correlation was immediate upon getting to know the field, but I don’t think he’s publicly said how his frustration began.

• Is Rubin’s work actually the same as Pearl’s??