Rubin’s framework says, basically: suppose all our observations are in a big data table. Now consider the counterfactual observations that didn’t happen (e.g. people in the control group getting the treatment) -- these are called “potential outcomes” -- and treat them like missing cells in the data table. Causal inference is then just filling in the potential outcomes using missing-data imputation techniques, although to be valid these require some assumptions about conditional independence.
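As a toy illustration of the “missing cells” view (all numbers and the confounding structure here are made up): build the potential-outcomes table with NaNs for the unobserved arm, then impute those cells assuming ignorability given a covariate `z`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Made-up data: covariate z confounds treatment t and outcome y.
z = rng.integers(0, 2, n)
t = rng.binomial(1, 0.3 + 0.4 * z)
y = 2.0 * t + 1.5 * z + rng.normal(0, 1, n)
df = pd.DataFrame({"z": z, "t": t, "y": y})

# The potential-outcomes table: y0 and y1 each have missing cells,
# since we only observe the outcome under the treatment actually received.
df["y1"] = np.where(df["t"] == 1, df["y"], np.nan)
df["y0"] = np.where(df["t"] == 0, df["y"], np.nan)

# Impute the missing cells assuming ignorability given z: within each
# stratum of z, fill a unit's missing potential outcome with the mean
# observed outcome of the other treatment arm.
for col, arm in [("y1", 1), ("y0", 0)]:
    stratum_means = df[df["t"] == arm].groupby("z")["y"].mean()
    df[col] = df[col].fillna(df["z"].map(stratum_means))

ate = (df["y1"] - df["y0"]).mean()
print(f"Estimated ATE: {ate:.2f}")  # close to the true effect of 2.0
```

A naive difference of observed means would be biased upward here, because units with `z = 1` are both more likely to be treated and have higher outcomes; the stratified imputation is what the conditional-independence assumption buys you.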

Pearl’s framework and Rubin’s are isomorphic in the sense that any set of causal assumptions in Pearl’s framework (a structural causal model, which has a DAG structure) can be translated into a set of causal assumptions in Rubin’s framework (a bunch of conditional independence assumptions about potential outcomes), and vice versa. This is touched on somewhat in Ch. 7 of “Causality”.
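The translation can be made concrete with a toy SCM (the structural equations here are invented for illustration): the structural equation for Y, evaluated with T set by hand while the exogenous noise is held fixed, *is* the potential outcome, and the DAG structure (T depends on Z plus independent noise) delivers the Rubin-style ignorability condition.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy SCM with DAG  Z -> T,  Z -> Y,  T -> Y  (hypothetical equations):
#   Z := U_z,   T := 1[U_t < 0.3 + 0.4*Z],   Y := 2*T + 1.5*Z + U_y
u_z = rng.integers(0, 2, n)
u_t = rng.uniform(0, 1, n)
u_y = rng.normal(0, 1, n)

z = u_z
t = (u_t < 0.3 + 0.4 * z).astype(int)

# The SCM *defines* the potential outcomes: Y(t') is what the structural
# equation for Y returns when T is set to t', noise held fixed.
y0 = 2 * 0 + 1.5 * z + u_y
y1 = 2 * 1 + 1.5 * z + u_y
y = np.where(t == 1, y1, y0)  # consistency: observed Y equals Y(T)

# The graph implies Rubin-style ignorability: (Y(0), Y(1)) ⊥ T | Z,
# because given Z, treatment depends only on the independent noise U_t.
# Empirical check: within each stratum of Z, the mean of Y(1) is the
# same among treated and control units.
for zv in (0, 1):
    m_treated = y1[(z == zv) & (t == 1)].mean()
    m_control = y1[(z == zv) & (t == 0)].mean()
    print(zv, round(m_treated, 2), round(m_control, 2))  # nearly equal
```

In the other direction, a set of potential-outcome ignorability statements constrains which DAGs are compatible with them, which is the sense in which the two formalisms carry the same information.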

Pearl argues that despite this equivalence, his framework is superior because it’s a better tool for thinking. In other words, writing down your assumptions as DAG/SCM is intuitive and can be explained and argued about, while he claims the Rubin model independence assumptions are opaque and hard to understand.

From my experience it pays to learn how to think about causal inference like Pearl (graphs, structural equations), and also how to think about causal inference like Rubin (random variables, missing data). Some insights only arise from a synthesis of those two views.

Pearl is a giant in the field, but it is worth remembering that he’s unusual in another way (compared to a typical causal inference researcher) -- he generally doesn’t worry about actually analyzing data.

---

By the way, Gauss not only figured out the normal distribution while trying to track down Ceres’ orbit -- he actually developed the least squares method, too! So arguably the entire loss-minimization framework in machine learning came about from thinking about celestial bodies.
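The continuity is easy to see in code (a minimal sketch with made-up data): Gauss’s least squares in closed form via the normal equations, next to the same objective minimized by gradient descent, the way a modern ML library would.

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up noisy observations of the line y = 3x + 1.
x = np.linspace(0, 1, 50)
y = 3 * x + 1 + rng.normal(0, 0.1, 50)
X = np.column_stack([x, np.ones_like(x)])

# Gauss's least squares: the closed form solves the normal equations.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The same thing, phrased as loss minimization by gradient descent
# on the mean squared error.
w = np.zeros(2)
for _ in range(5000):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad

print(theta, w)  # both recover roughly [3, 1]
```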

Is Rubin’s work actually the same as Pearl’s??

Please tell more?

That’s not the impression I got from reading Pearl’s “Causality”. If so, it seems like a major omission of scholarship.


Some reading on this:

https://csss.uw.edu/files/working-papers/2013/wp128.pdf

http://proceedings.mlr.press/v89/malinsky19b/malinsky19b.pdf

https://arxiv.org/pdf/2008.06017.pdf


Aha, I will have to ponder on this for a while. Thanks a lot!