Daniel_Burfoot comments on Causal Diagrams and Causal Models

Daniel_Burfoot 12 Oct 2012 17:57 UTC
40 points
0
After reading this post I was stunned. Now I think the central conclusion is wrong, though I still think it is a great post, and I will go back to being stunned if you convince me the conclusion is correct.

You’ve shown how to identify the correct graph structure from the data. But you’ve erred in assuming that the directed edges of the graph imply causality.

Imagine you did the same analysis, except instead of using O=“overweight” you use W=“wears size 44 or higher pants”. The data would look almost the same. So you would reach an analogous conclusion: that wearing large pants causes one not to exercise. This seems obviously false unless your notion of causality is very different from mine.

In general, I think the following principle holds: inferring causality requires an intervention; it cannot be discovered from observational data alone. A researcher who hypothesized that W causes not-E could round up a bunch of people, have half of them wear big pants, observe the effect of this intervention on exercise rates, and then conclude that there is no causal effect.
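
To make the pants example concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the structure (O and I both influence E, and W is just a noisy readout of O), the probabilities, and the helper names. It only shows that the observational pattern used in the post (uncorrelated with internet use overall, correlated once you condition on exercise) looks the same for W as for O.

```python
import random

def phi(xs, ys):
    """Pearson correlation of two equal-length 0/1 sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mx * my
    return cov / (mx * (1 - mx) * my * (1 - my)) ** 0.5

def simulate(n=200_000, seed=0):
    """Draw (O, W, I, E) rows. All probabilities are made up for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        o = int(rng.random() < 0.4)                      # overweight
        i = int(rng.random() < 0.5)                      # heavy internet use
        e = int(rng.random() < 0.7 - 0.3 * o - 0.3 * i)  # exercises
        w = o if rng.random() < 0.95 else 1 - o          # big pants: noisy copy of O
        rows.append((o, w, i, e))
    return rows

def pattern(rows, col):
    """Correlation of column `col` with I, overall and among exercisers (E=1)."""
    marginal = phi([r[col] for r in rows], [r[2] for r in rows])
    exercisers = [r for r in rows if r[3] == 1]
    conditional = phi([r[col] for r in exercisers], [r[2] for r in exercisers])
    return marginal, conditional

rows = simulate()
for name, col in (("O", 0), ("W", 1)):
    m, c = pattern(rows, col)
    print(f"{name}: corr with I overall = {m:+.3f}, among exercisers = {c:+.3f}")
```

Under these assumptions both O and W come out roughly uncorrelated with I overall and clearly correlated with I among exercisers, so an analysis that sees only W, I and E would draw the same arrows for W that it drew for O.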
- IlyaShpitser 12 Oct 2012 20:45 UTC
  35 points
  0
  You are correct—directed edges do not imply causality by means of only conditional independence tests. You need something called the faithfulness assumption, and additional (causal) assumptions, that Eliezer glossed over. Without causal assumptions and with only faithfulness, all you are recovering is the structure of a statistical, rather than a causal model. Without faithfulness, conditional independence tests do not imply anything. This is a subtle issue, actually.
  
  There is no magic—you do not get causality without causal assumptions.
  - eurg 20 Oct 2012 22:55 UTC
    2 points
    0
    Is this another variation of the theme that one needs to assume the possibility of inductive reasoning to make an argument for it (or also assume Occam’s Razor to argue for it)? Also, the specific example he gave seems to me like an instance of “given very skewed data, the best guesses are still wrong” (a variation of that came up here at some point, regarding bets and opponents who have superior information). Or are you thinking of something more subtle?
    - IlyaShpitser 31 Oct 2012 18:24 UTC
      9 points
      0
      Even if you assume that we can do induction (and assume faithfulness!), conditional independence tests simply do not select among causal models. They select among statistical models, because conditional independences are properties of joint distributions (statistical, rather than causal objects). Linking those joint distributions with something causal relies on causal assumptions.
      
      I think the biggest lesson to learn from Pearl’s book is to keep statistical and causal notions separate.
      - eurg 5 Nov 2012 14:21 UTC
        0 points
        0
        Thanks for clarifying!
- Houshalter 19 Aug 2015 0:33 UTC
  2 points
  0
  He addressed that in the third footnote:
  
  “Or there might be some hidden third factor, a gene which causes both fat and non-exercise. By Occam’s Razor this is more complicated and its probability is penalized accordingly, but we can’t actually rule it out. It is obviously impossible to do the converse experiment where half the subjects are randomly assigned lower weights, since there’s no known intervention which can cause weight loss.”
  
  The model assumes that those are the only relevant variables. Given that assumption, we can prove that weight causes exercise, and that it can’t be the other way around.
  
  If there are unobserved variables, it’s possible that one of them causes both weight and exercise. However, that wasn’t one of the hypotheses anyone believed beforehand; they were arguing whether weight causes exercise or exercise causes weight.
  
  Second, even if there is an unobserved variable, the data still suggest that exercising more will not improve your weight. Internet use affects exercise, so if exercise affected weight at all, internet use would indirectly cause weight gain and would therefore correlate with weight. In the data it doesn’t.
  
  The whole point of the article is this trick: taking a weird and unrelated variable like internet use lets us discover the direction of causation, which according to common knowledge about statistics shouldn’t be possible without randomized controlled experiments.
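
A rough sketch of the internet-use trick, under loud assumptions: the two candidate structures, every probability, and the function names below are invented for illustration, and neither model includes a hidden variable. The only thing it checks is whether internet use (I) ends up correlated with weight (O) when weight affects exercise versus when exercise affects weight.

```python
import random

def phi(xs, ys):
    """Pearson correlation of two equal-length 0/1 sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mx * my
    return cov / (mx * (1 - mx) * my * (1 - my)) ** 0.5

def weight_affects_exercise(rng):
    """Hypothetical model: O and I are independent causes of E."""
    o = int(rng.random() < 0.4)
    i = int(rng.random() < 0.5)
    e = int(rng.random() < 0.7 - 0.3 * o - 0.3 * i)
    return i, e, o

def exercise_affects_weight(rng):
    """Hypothetical model: a chain from I to E to O."""
    i = int(rng.random() < 0.5)
    e = int(rng.random() < 0.6 - 0.3 * i)
    o = int(rng.random() < 0.6 - 0.4 * e)
    return i, e, o

def internet_weight_corr(model, n=100_000, seed=0):
    """Correlation between I and O in data sampled from `model`."""
    rng = random.Random(seed)
    rows = [model(rng) for _ in range(n)]
    return phi([r[0] for r in rows], [r[2] for r in rows])

for model in (weight_affects_exercise, exercise_affects_weight):
    print(f"{model.__name__}: corr(I, O) = {internet_weight_corr(model):+.3f}")
```

With these made-up numbers the first model gives a correlation near zero and the second a clearly positive one, which is the asymmetry the argument relies on.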
- Matt Vincent 14 Jun 2022 13:43 UTC
  1 point
  0
  I’m sorry if I’m just being too much of a dodo to perceive the mystery, but your scenario seems easily accounted for. You can use a Bayesian network to infer causality if and only if you have valid data to fill it with. Of course wearing large pants does not cause one not to exercise, but no real set of data would indicate that it did. Am I missing something?
  EDIT: shortly after writing this, I read up on faithfulness and Milton Friedman’s thermostat, so the “if and only if” part of my comment isn’t quite accurate. Still, the pants size scenario doesn’t seem like one of these exceptional cases.
- jimrandomh 12 Oct 2012 18:14 UTC
  0 points
  0
  In this case, the true structure would be O->E, O->W, I->E. If O is unobserved, then you confuse a fork for an arrow, but I’m not sure you can actually get an arrow pointing the wrong way just by omitting variables.
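
A small exact calculation of the fork, as a sketch: it assumes the structure O->E, O->W, I->E stated above, and every probability and helper name is invented for illustration. It enumerates the joint distribution and compares P(E=1) across pants sizes, first with O ignored and then with O held fixed.

```python
from fractions import Fraction as F
from itertools import product

# All numbers below are illustrative assumptions, not data.
P_O = F(2, 5)   # P(O=1)
P_I = F(1, 2)   # P(I=1)

def p_e(o, i):
    """P(E=1 | O=o, I=i)."""
    return F(7, 10) - F(3, 10) * o - F(3, 10) * i

def p_w(o):
    """P(W=1 | O=o): pants size depends on O only."""
    return F(19, 20) if o else F(1, 20)

def bern(p, x):
    """Probability that a Bernoulli(p) variable equals x."""
    return p if x else 1 - p

# Joint distribution over (o, i, e, w), factored along O->E, I->E, O->W.
joint = {
    (o, i, e, w): bern(P_O, o) * bern(P_I, i) * bern(p_e(o, i), e) * bern(p_w(o), w)
    for o, i, e, w in product((0, 1), repeat=4)
}

INDEX = {"o": 0, "i": 1, "e": 2, "w": 3}

def prob_exercise(**given):
    """P(E=1 | given), e.g. prob_exercise(w=1, o=0)."""
    def matches(key):
        return all(key[INDEX[name]] == value for name, value in given.items())
    den = sum(p for key, p in joint.items() if matches(key))
    num = sum(p for key, p in joint.items() if matches(key) and key[2] == 1)
    return num / den

print("P(E=1 | W=1) =", prob_exercise(w=1))
print("P(E=1 | W=0) =", prob_exercise(w=0))
for o in (0, 1):
    print(f"P(E=1 | W=1, O={o}) =", prob_exercise(w=1, o=o),
          f"; P(E=1 | W=0, O={o}) =", prob_exercise(w=0, o=o))
```

With O ignored, pants size predicts exercise; with O held fixed, it predicts nothing. That is the sense in which dropping O makes the fork look like a direct W to E link.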