Thanks for the detailed reply!
p-values and priors do not mix
I guess this is what I get for my statistics knowledge being an odd mongrel mix of Bayesian vibes gathered from LessWrong and frequentist stats from biology grad school. You seem much more knowledgeable on the formalizations, so I trust you’re right that they can’t formally mix, but informally it seems to me like there must be some way to combine them—fundamentally, this is just “extraordinary claims require extraordinary evidence”. To use an example from my day job: if I’m testing whether knocking out a candidate gene that I think will increase tryptophan in a plant tissue does indeed increase the tryptophan, p < 0.05 feels like a fine cutoff. If I’m testing whether that same KO causes my plants to communicate with me telepathically, I’d be crazy to tell anyone about my results unless I was seeing p < 0.001 in at least two independent experiments.
The difference in required p-value threshold to me seems to come down to the prior. Perhaps there’s no formal framework that combines them, but empirically I think that’s what I’m doing.
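One informal way to make this precise is the Sellke–Bayarri–Berger bound, which caps how much evidence a p-value can carry: for p < 1/e, the Bayes factor in favor of the null is at least −e·p·ln(p). Combining that bound with prior odds gives a lower bound on the posterior probability of the null. The sketch below is my own (the function name and example priors are illustrative, not from any standard library):

```python
import math

def min_posterior_null(p, prior_null):
    """Lower bound on P(null | data) via the Sellke-Bayarri-Berger
    bound: for p < 1/e, the Bayes factor in favor of the null is at
    least -e * p * ln(p). Multiplying by the prior odds on the null
    bounds the posterior odds from below."""
    assert 0 < p < 1 / math.e
    bf_null = -math.e * p * math.log(p)          # minimum BF(null : alt)
    post_odds = (prior_null / (1 - prior_null)) * bf_null
    return post_odds / (1 + post_odds)

# Plausible hypothesis (prior P(null) = 0.5): p = 0.05 still leaves
# roughly a 29% chance the null is true.
print(min_posterior_null(0.05, 0.5))
# Telepathy-grade claim (prior P(null) = 0.999): even p = 0.001
# leaves the null overwhelmingly likely.
print(min_posterior_null(0.001, 0.999))
```

This is exactly the “extraordinary claims” intuition in numbers: the same p-value moves a skeptical prior far less than a neutral one.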
The mathematics of causality is still not part of a standard stats curriculum
You seem to be correct here, but it strikes me as strange because quantifying the evidence for causality was one of the central themes of most of my stats classes (with names like “intro statistics” and “statistical design of experiments”).
Possibly the synthesis here is that most of what non-math people learn doesn’t qualify as real math—I only took the super basic stuff that doesn’t get into the actual math of causality (despite minoring in math in undergrad and taking more math than most in a bio PhD).
I would like there to be a formal way to render both DNA and dominoes as a short causal chain. But I don’t have one.
Thinking about it a bit more, I think I’m happy to bite the bullet here and say the causal chain is long (though well approximated by both of our very short descriptions), and that this is one of the exceptions to the point in footnote 4 that real-world correlations are generally well below 1. The chain is long, but DNA replication is extremely good—something like 10^-7 errors per base per generation—pretty dang close to 1 (my guess is that domino setups by hobbyists also have per-domino failure rates under 10^-3). I’m sure we could find a handful more examples of long chains with very high per-link correlation, but not that many—pretty few correlations in biology are over 0.95.
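A quick way to see why such chains are rare: for a linear Gaussian Markov chain, the end-to-end correlation is the product of the per-link correlations, so it decays like r^n. The numbers below are illustrative (I’m loosely reading per-step error rate as 1 minus per-link correlation), not measured values:

```python
# For a linear Gaussian Markov chain X0 -> X1 -> ... -> Xn, the
# end-to-end correlation is the product of the per-link correlations,
# so a long chain stays correlated only if every link is nearly 1.
def chain_corr(r, n):
    return r ** n

# A "good" biological per-link correlation, 100 steps deep:
print(chain_corr(0.95, 100))        # ~0.006, essentially gone
# Replication-grade fidelity per step (~1e-7 error per step):
print(chain_corr(1 - 1e-7, 100))    # ~0.99999, barely degraded
```

So an end-to-end correlation near 1 through a long chain is only possible with DNA-replication-grade links, which is why biology has so few of them.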
A classic example from Pearl’s 2009 book
Thanks, that was fun to work out, and appropriately simple for a humble biologist. In the 2x2 matrix of “corr or not” and “causal linkage or not”, this is the opposite square to the one I’m looking for—causal but not correlated. I agree that such things happen (they happen a lot in biology due to regulatory feedback loops), and I now see that this is indeed a non-faithful network.
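A minimal simulation of that square (my own toy with made-up coefficients, not Pearl’s exact example): X causes Y along two paths chosen to cancel, so Y is causally downstream of X yet uncorrelated with it—an unfaithful network:

```python
import random

random.seed(0)
n = 100_000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    z = x + random.gauss(0, 0.1)    # X -> Z
    y = x - z + random.gauss(0, 1)  # X -> Y (+1) and Z -> Y (-1): the paths cancel
    xs.append(x)
    ys.append(y)

# Pearson correlation by hand, to keep the example dependency-free.
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
var_x = sum((a - mx) ** 2 for a in xs)
var_y = sum((b - my) ** 2 for b in ys)
corr = cov / (var_x * var_y) ** 0.5
print(round(corr, 3))  # ~0.0, despite Y being causally downstream of X
```

The graph really does have an X→Y edge; the parameter values just hide it from correlation, which is exactly what the faithfulness assumption rules out.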
Is there a similar toy example for “corr but no causal linkage”, e.g. “two variables on opposite sides of the program whose value is correlated for no good reason”? I spent a while asking Claude Opus 4.6 for one and it couldn’t come up with any (it came up with a toy where two variables would always be the same constant, but correlation there is undefined so I don’t count it).
Thank you for providing counterexamples! Quite useful, and it convinces me that the broadest version of this is false. And apologies for the slow reply—the linked papers took a while to absorb (as you may guess, I’m not a mathematician).
1. I tried for a while, but couldn’t properly follow this paper. From reading the abstract, going through the figures, reading your summary, and discussing it with Claude (Opus 4.6), I think the core issue here is the same problem as point 2: the measures are not independent. My wife’s phrasing here is that “there’s a causal link between position at time t and position at time t+1” (in this case plus noise; in the case of point 2, plus velocity times the time between measurements).
2. Thank you—slightly embarrassed I didn’t find this one myself. I think the issue here is due to repeated measurement—the moon-earth distance now and at t+1s are obviously connected. I think the missing criterion here is independence—repeatedly measuring the same thing to see its evolution over time breaks it. This requires some narrowing of the core thesis—my proposed added text is: “I’m talking about correlations that survive proper statistical scrutiny—where the p-value is computed correctly for the data structure at hand. A naive Pearson correlation between two autocorrelated time series, or two monotonic trends, can look impressive while meaning nothing, but that’s not a real correlation any more than a loaded die produces a real test of probability. The well-known tools for handling this (differencing, detrending, cointegration tests, correcting for effective sample size) exist precisely because statisticians already understand that autocorrelation inflates apparent correlation. My claim applies to correlations that remain after you’ve done the statistics right.”
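The random-walk version of this is easy to demonstrate: two completely independent random walks often show a large naive Pearson correlation in their levels, while correlating the per-step differences (one simple form of “doing the statistics right”) gives roughly zero. The simulation below is my own sketch:

```python
import random

def walk(n, rng):
    """A simple Gaussian random walk of length n."""
    pos, out = 0.0, []
    for _ in range(n):
        pos += rng.gauss(0, 1)
        out.append(pos)
    return out

def pearson(a, b):
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

spurious_levels = near_zero_diffs = 0
for seed in range(100):
    rng = random.Random(seed)
    a, b = walk(500, rng), walk(500, rng)   # two independent walks
    da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    if abs(pearson(a, b)) > 0.5:            # impressive-looking, meaningless
        spurious_levels += 1
    if abs(pearson(da, db)) < 0.1:          # the honest answer: ~no correlation
        near_zero_diffs += 1

print(spurious_levels, near_zero_diffs)
```

In a run like this, a substantial fraction of the level correlations exceed 0.5 while nearly all of the differenced correlations sit near zero—the spurious-regression effect that differencing is designed to kill.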
I think the claim “correlation of independent measures implies causation” is still interesting and surprising (probably not to this crowd who point out plenty of more rigorous prior work, but to most biologists at least), though a bit less exciting than my original claim. In particular, it’s less exciting because it doesn’t have a bulletproof checklist for “doing the statistics right”, which may or may not be possible to make.
3. This paper I could understand. I’ve worked through all their examples, and I don’t think any are counterexamples to my point. Indeed, causation can exist without correlation (as you say, very common in biology and technology); I’ve never claimed otherwise. The paper shows some examples where correlation is high despite the causal path containing intermediates with lower correlation, which is intriguing, but I believe every correlation in the paper is explained by a causal path linking the two correlated things. As you say, not necessarily a direct causal connection, but still a “relatively short causal chain linking those things”—an indirect causal connection.
My favorite quote from the paper was “the simple maxim that ‘correlation does not imply causation’ having been superceded by methods such as those set out in [9, 14], and in shorter form in [10]”. Good to see that others (Pearl, in addition to Reichenbach) have made something like the point I’m aiming to make here, as well as noting the restriction on time-series data that you brought to my attention in (2): “cited limit attention to systems whose causal connections form a directed acyclic graph, together with certain further restrictions, and also do not consider dynamical systems or time series data.”
So far I don’t think cyclic systems cause any correlations between causally unlinked variables, but the fact that the people doing this formally haven’t solved it makes me hesitant to make claims as an outsider to the field.