Saturday (8/12/23) Why Correlation Usually ≠ Causation—by Gwern

https://docs.google.com/document/d/1UIX9sW1lbmlpALyT4JyBdW-dSWOnoQcEJIQvt1Lpf88/edit?usp=sharing

Hello Folks!

We are excited to announce the 38th Orange County ACX/LW meetup, happening this Saturday and most Saturdays thereafter.

Host: Michael Michalchik

Email: michaelmichalchik@gmail.com (For questions or requests)

Location: 1970 Port Laurent Place, Newport Beach, CA 92660

Date: Saturday, Aug 12, 2023

Time: 2 PM

Conversation Starters:

  • Why Correlation Usually ≠ Causation · Gwern.net

  • https://gwern.net/causality

  • Walk & Talk: We usually have an hour-long walk and talk after the meeting starts. Two mini-malls with hot takeout food are easily accessible nearby. Search for Gelson’s or Pavilions in the zip code 92660.

  • Share a Surprise: Tell the group about something unexpected that changed your perspective on the universe.

  • Future Direction Ideas: Contribute ideas for the group’s future direction, including topics, meeting types, activities, etc.

Here is a summary from Claude 2, with detailed expansions of each point:

1. There is a widespread disconnect between statistical correlation and causation. People often overconfidently claim that correlations support their favored causal interpretations, despite the adage “correlation does not equal causation.”

- In numerous fields, from medicine to sociology, researchers frequently imply or outright claim that observed statistical correlations support hypothesized causal relationships. This occurs despite the well-known dictum that “correlation does not imply causation.”

- There is ample evidence that people tend to be overconfident in such causal interpretations, expressing surprise when randomized experiments fail to confirm correlational findings. This suggests a systematic bias where correlations are taken as stronger evidence for causality than warranted.

- The ease with which spurious correlations can be generated, the replication crisis, and the poor track record of predictions call into question the value of correlations for inferring causation. Nevertheless, the alluring intuition persists.

2. The replication crisis shows that a large fraction of research across fields is spurious and non-replicable. Practices like p-hacking can generate many false correlations.

- The replication crisis refers to the finding that a surprisingly large percentage of published research fails to replicate when experiments are repeated. This implies a high rate of false positives and exaggerated results.

- Questionable research practices like p-hacking (fishing for low p-values through flexible analysis choices) exacerbate the replication crisis. These practices can manufacture false correlations at will by capitalizing on random noise, as the sketch below illustrates.

- Widespread irreproducible findings further undermine any certainty that an observed correlation reflects a real relationship rather than a spurious artifact.
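
To make this concrete, here is a minimal Python sketch (my own illustration, not from Gwern’s essay) of how p-hacking can manufacture a “significant” correlation from pure noise: test many unrelated random predictors against one outcome and report only the best p-value.

```python
# Hypothetical illustration: 20 noise predictors tested against one outcome.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_predictors = 100, 20

outcome = rng.normal(size=n)
p_values = []
for _ in range(n_predictors):
    predictor = rng.normal(size=n)  # pure noise, no causal link to outcome
    _, p = stats.pearsonr(predictor, outcome)
    p_values.append(p)

print(f"smallest p-value out of {n_predictors} noise tests: {min(p_values):.3f}")
# With 20 independent tests at alpha = 0.05, the chance of at least one
# "significant" hit is 1 - 0.95**20, roughly 64%, though nothing is related.
```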

3. With large enough data, “everything correlates”: even variables with no plausible causal connection show statistically significant associations. Meehl noted correlations can seem arbitrary yet be firmly established statistically.

- Given large datasets and sample sizes, even trivially small correlations between variables reach statistical significance, so variables expected to be causally unrelated still correlate to a detectable degree.

- Meehl described examples where absurd correlations reached high significance simply due to large n, concluding that “everything correlates with everything else” (the “crud factor”). This casts doubt on the meaningfulness of statistical significance for correlations generally, as the simulation below shows.
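
A small simulation (again mine, illustrating Meehl’s point under the assumption of a tiny but real background correlation) shows how a negligible r becomes overwhelmingly “significant” at large n:

```python
# Hypothetical illustration: a trivially small true correlation (r = 0.02)
# is undetectable at n = 100 but wildly significant at n = 1,000,000.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_r = 0.02  # a negligible "crud" correlation

for n in (100, 10_000, 1_000_000):
    x = rng.normal(size=n)
    y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
    r, p = stats.pearsonr(x, y)
    print(f"n={n:>9,}  r={r:+.3f}  p={p:.2g}")
# Statistical significance at large n says almost nothing about whether the
# relationship is substantive, let alone causal.
```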

4. In randomized experiments, efforts to change human behavior often fail, despite promising correlations (Rossi’s Metallic Rules). This suggests correlations poorly predict interventions.

- According to the empirical tendency known as Rossi’s Metallic Rules, the mean effect size of social interventions approaches zero as study quality improves. Even interventions targeting highly correlated predictors prove ineffective.

- More broadly, correlations from observational studies consistently prove poor guides to outcomes of randomized trials testing causal interventions. Real-world predictiveness does not match statistical significance.

5. Causal relationships are less common than correlations because complex causal networks have many more indirect connections than direct ones. Larger networks exacerbate this imbalance.

- Causal networks in domains like biology or society contain dense webs of interrelated influences. The number of possible correlations scales much faster than the number of direct causal links as networks grow.

- Most correlations are produced not by a direct relationship but by indirect connections through a third variable, a common cause (confounder). Confounds overwhelm causation as causal graphs expand, as the sketch below makes concrete.
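
A rough sketch (my construction, not a figure from the essay) of the imbalance: in a random DAG, count the node pairs we would expect to correlate (connected by a directed path or sharing a common ancestor, ignoring faithfulness and collider subtleties) versus the pairs joined by a direct causal edge.

```python
# Hypothetical illustration: direct edges vs. correlated pairs in a random DAG.
import itertools
import random

random.seed(2)
n_nodes, edge_prob = 100, 0.03

# Random DAG: edges only run from lower- to higher-numbered nodes.
parents = {v: set() for v in range(n_nodes)}
for u, v in itertools.combinations(range(n_nodes), 2):
    if random.random() < edge_prob:
        parents[v].add(u)

# Ancestor sets, filled in topological (numeric) order.
ancestors = {v: set() for v in range(n_nodes)}
for v in range(n_nodes):
    for p in parents[v]:
        ancestors[v] |= ancestors[p] | {p}

direct_edges = sum(len(ps) for ps in parents.values())
correlated_pairs = sum(
    1
    for a, b in itertools.combinations(range(n_nodes), 2)
    # a < b, so a directed path can only run a -> b; a shared ancestor
    # confounds the pair even without any causal path between them.
    if a in ancestors[b] or (ancestors[a] & ancestors[b])
)
print(f"direct causal edges:         {direct_edges}")
print(f"pairs expected to correlate: {correlated_pairs}")
# Typically only a small fraction of correlated pairs are direct causal links,
# and the fraction shrinks as the network grows.
```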

6. Intuitions about causality come from simple everyday domains and fail in complex networks like biology or society. Folk heuristics expect causal links to be normal and probable.

- Humans evolved causal reasoning for intuitive physics, folk psychology, and similar simple domains, where it works well. But complex networks violate these heuristics’ hidden assumption of sparsity.

- Ordinary intuition treats a causal link as the likely explanation for a correlation. But in large graphs, direct causal connections are vastly outnumbered by indirect correlations.

7. To change intuitions, we need different mental models that show the prevalence of confounds. Visualizations of dense causal graphs could help, as could more emphasis on examples of correlational findings later refuted by experiments.

- Counterintuitive scientific findings require constructing new mental representations. We must internalize the ubiquity of confounding relative to causation.

- Concrete illustrations of dense causal networks, simulations, and salient counterexamples could help instill more accurate intuitions about the evidential value of correlations.
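
As one such simulation, here is a minimal sketch (mine, assuming a single hidden common cause Z) in which X and Y correlate strongly in observational data, yet randomizing X eliminates the association, because Y never depended on X:

```python
# Hypothetical illustration: confounding by a hidden common cause Z.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 50_000

z = rng.normal(size=n)            # hidden common cause
x = z + 0.5 * rng.normal(size=n)  # X is driven by Z
y = z + 0.5 * rng.normal(size=n)  # Y is also driven by Z, never by X
print(f"observational corr(X, Y):     {stats.pearsonr(x, y)[0]:.2f}")

# Randomized "intervention": X is now set independently of Z, as in an RCT.
x_do = rng.normal(size=n)
y_do = z + 0.5 * rng.normal(size=n)
print(f"post-intervention corr(X, Y): {stats.pearsonr(x_do, y_do)[0]:.2f}")
# The strong observational correlation (about 0.8) vanishes under
# randomization: correlation without causation.
```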

8. Persistent skepticism is required, as correlations, however statistically significant, rarely imply causality and provide poor evidence for interventions.

- Claims of causality based on correlations should be viewed skeptically regardless of statistical significance, as predictive power for interventions is generally low.

- Given how frequently correlations fail to replicate or correspond to experimental results, high vigilance against overinterpreting them is warranted. Correlation only weakly supports causation.

Questions to think about:

- If most published correlations fail to replicate or predict randomized results, in what ways is correlation still useful scientifically? Should we emphasize prediction over explanation?

- How can researchers productively respond to the growing evidence that most of their statistical findings do not reflect genuine phenomena? What concrete changes should be made to incentives and methodology?

- If complex causal systems imply a vanishingly small prior probability that any given correlation is causal, how can correlation provide evidence at all? When is skepticism unwarranted denialism?

- How often do you notice causal language in papers based purely on correlation? Is this a case of subtle malpractice or an accepted convention? What harms does it risk?

- Why are researchers and laypeople so often surprised when correlations “fail”? Is it due to underestimating confounds, or misunderstanding correlation conceptually?

- How can statistics courses and textbooks better instill intuitions about confounding in causal networks and the limitations of correlation? What tools or frameworks would help?

- If most correlational findings prove spurious, irrelevant to interventions, or irreproducible, what does this imply about the value of past research relying on them? How much should be discounted?

- How confident are you in the effectiveness of current incentives, institutions and training for producing reliable science? How might reforms improve the situation? What obstacles stand in the way?

- If correlation does not imply causation beyond a bare minimum, what justifies claims that new policies should be “evidence-based” or “data-driven”?
