I think you’ve missed an important piece of this picture, or perhaps have not emphasized it as much as I would. The real real reason we can elucidate causation from correlation is that we have a prior that prefers simple explanations over complex ones, and so when some observed frequencies can be explained by a compact (simple) bayes net we take the arrows in that bayes net to be causation.
A fully connected bayes net (or equivalently, a causal graph with one hidden node pointing to all observed nodes) can represent any probability distribution whatsoever. Such a Bayes net can never be flat-out falsified. Rather it is our preference for simple explanations that sometimes gives us reason to infer structure in the world.
This contradicts nothing you’ve said, but I guess I read this article as suggesting there is some fundamental rule that gives us a crisp method for extracting causation from observations, whereas I would look at it as a special case of inference-with-prior-and-likelihood, just like in other forms of Bayesian reasoning.
I think you’ve missed an important piece of this picture, or perhaps have not emphasized it as much as I would. The real real reason we can elucidate causation from correlation is that we have a prior that prefers simple explanations over complex ones, and so when some observed frequencies can be explained by a compact (simple) bayes net we take the arrows in that bayes net to be causation.
A fully connected bayes net (or equivalently, a causal graph with one hidden node pointing to all observed nodes) can represent any probability distribution whatsoever. Such a Bayes net can never be flat-out falsified. Rather it is our preference for simple explanations that sometimes gives us reason to infer structure in the world.
This contradicts nothing you’ve said, but I guess I read this article as suggesting there is some fundamental rule that gives us a crisp method for extracting causation from observations, whereas I would look at it as a special case of inference-with-prior-and-likelihood, just like in other forms of Bayesian reasoning.