The causality looks like something. An obscure common cause is the most obvious (to me) source of the correlation if no one’s put forth a plausible causal relationship yet. I’m not sure what you mean, though, by “causal relationship between X and Y”… do you mean specifically a relationship of the form “X → ..Z.. → Y” / “X ← ..Z.. ← Y”, or do you mean “any causal structure connecting X and Y in any way”?
(Are you interested in some specific X,Y but phrasing it generally so we don’t get distracted? I feel like seeing some examples of the failed tests run by statistically competent scholars would help me know what they haven’t ruled out.)
I’m mostly interested in whether X causes Y vs. whether some Z causes both X and Y.
I didn’t find that clear from your article. A correlation between X and Y tells you no more than that causality is present somewhere. It tells you absolutely nothing about whether X causes Y, Y causes X, Z causes X and Y, how long the causal chains are, or whether it’s a sampling artefact due to common effects of X and Y.
Those options aren’t mutually exclusive...
Or exhaustive. Imperfect sampling can produce sample correlations among variables with no causal connection. (Toy example: X and Y are independent, Z is jointly caused by X and Y and is equal to X+Y, and everyone is unwittingly sampling from a subpopulation with a narrow range of values of Z. Sample X and Y will have a high negative correlation.)
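For anyone who wants to check the toy example numerically, here is a rough simulation sketch. The standard-normal distributions, the sample size, and the band |Z| < 0.1 are my own arbitrary choices for illustration, not anything from the discussion above.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# X and Y are drawn independently (assumed standard normal for illustration).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# Z = X + Y is a common effect. The "unwitting" sampler only ever sees
# units whose Z falls in a narrow band (assumed here: |Z| < 0.1).
selected = [(a, b) for a, b in zip(x, y) if abs(a + b) < 0.1]
sx, sy = zip(*selected)

r_population = pearson(x, y)   # close to zero
r_sample = pearson(sx, sy)     # strongly negative

print(f"whole population: r = {r_population:+.3f}")
print(f"narrow-Z sample:  r = {r_sample:+.3f} ({len(selected)} units)")
```

The narrower the band on Z, the closer the sample correlation gets to -1, since within the band Y is forced to be roughly a constant minus X.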
Could you give a concrete example of such sampling bias?
A real one? Not offhand, not being a statistician, but sampling bias is a standard problem that has to be guarded against in statistical investigations. It can affect not just the sample means of variables, but correlations and indeed every statistic whatsoever.
To flesh out the toy example with an imaginary narrative, suppose X = intelligence, Y = effort, and Z = exam grade. Suppose Z is highly correlated with X+Y. If we divide the population up by exam grade, we may find that in every subpopulation, X and Y are negatively correlated, even while in the whole population, X and Y are uncorrelated.
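The exam-grade narrative can be sketched the same way. Everything specific here is an assumption made up for the illustration: intelligence and effort are independent standard normals, the grade is their sum plus some noise (so it is highly but not perfectly correlated with X+Y), and the population is split into five equal-sized grade bands.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 50_000

intelligence = [rng.gauss(0, 1) for _ in range(n)]  # X
effort = [rng.gauss(0, 1) for _ in range(n)]        # Y, independent of X
# Z: grade = X + Y + noise (noise level 0.5 is an arbitrary assumption).
grade = [a + b + rng.gauss(0, 0.5) for a, b in zip(intelligence, effort)]

# Divide the population into five equal-sized bands by exam grade.
rows = sorted(zip(grade, intelligence, effort))
n_bands = 5
band_size = len(rows) // n_bands
band_r = []
for i in range(n_bands):
    band = rows[i * band_size:(i + 1) * band_size]
    _, bx, by = zip(*band)
    band_r.append(pearson(bx, by))

r_overall = pearson(intelligence, effort)

print(f"whole population: r = {r_overall:+.3f}")
for i, r in enumerate(band_r, start=1):
    print(f"grade band {i}: r = {r:+.3f}")
```

Under these assumptions the overall correlation comes out near zero while every grade band shows a clearly negative one, which is the pattern described above.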