There was a part of the post where I wrote “I might well be screwing up the math here”: I wasn’t sure whether to square something or not, and didn’t bother to sort it out. I think this comment is claiming that I was doing it wrong, maybe? But that person is not very confident either, and I’m not following their reasoning. Still hoping that someone will give a more definitive answer. I would love to correct the post if I’m wrong.
Let’s define three variables:
T is the true phenotype we’d like to measure,
X is a predictor of T (genes, polygenic score, etc), and
Y is the observed test score.
In terms of the original quote: “I thought that if some input [X] explains X% of the variance of some outcome [T], then it explains $r^2 \cdot X\%$ of the variance of a noisy measurement of the outcome [Y].”
So we know something about the relationship between X and T (the percentage of the variance in T that X explains), and something about the relationship between Y and T (the reliability of Y), and we’d like to use this to tell us something about the relationship between X and Y (how much of the variance in Y X explains).
First, we’ll show how we can use the test-retest correlation to estimate reliability. Model the observed phenotype as $Y = T + E$ with error $E$ ($E \perp T$ and $E \perp X$). Given two independent repeats $Y_1 = T + E_1$ and $Y_2 = T + E_2$, with $E_1 \perp E_2$, $E_i \perp T$, and $E_i \perp X$, we have $\operatorname{Cov}(Y_1, Y_2) = \operatorname{Cov}(T + E_1, T + E_2) = \operatorname{Var}(T)$. Also $\operatorname{Var}(Y_1) = \operatorname{Var}(Y_2) = \operatorname{Var}(Y)$. So

$$\operatorname{corr}(Y_1, Y_2) = \frac{\operatorname{Cov}(Y_1, Y_2)}{\sigma_{Y_1} \sigma_{Y_2}} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(Y)}.$$

This ratio $\operatorname{Var}(T)/\operatorname{Var}(Y)$ is the definition of reliability (under the classical test theory model with parallel forms / independent errors), so we can use the test-retest correlation to estimate reliability: $\operatorname{Rel}(Y) \approx 0.55$.
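To make that concrete, here’s a minimal simulation sketch in numpy. The variance split ($\operatorname{Var}(T) = 0.55$, $\operatorname{Var}(E) = 0.45$) is a made-up choice, picked only so that $\operatorname{Rel}(Y) = 0.55$ matches the number above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Assumed variance split: Var(T) = 0.55, Var(E) = 0.45, so Var(Y) = 1.0
# and Rel(Y) = 0.55 / 1.00 = 0.55.
var_T, var_E = 0.55, 0.45
T = rng.normal(0, np.sqrt(var_T), n)
Y1 = T + rng.normal(0, np.sqrt(var_E), n)  # first testing occasion
Y2 = T + rng.normal(0, np.sqrt(var_E), n)  # independent retest

test_retest = np.corrcoef(Y1, Y2)[0, 1]
print(test_retest, var_T / (var_T + var_E))  # both ≈ 0.55
```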
Now, $\operatorname{Cov}(X, Y) = \operatorname{Cov}(X, T + E) = \operatorname{Cov}(X, T) + \operatorname{Cov}(X, E) = \operatorname{Cov}(X, T)$, since $X \perp E$.
From which we can see that

$$\operatorname{corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\operatorname{Cov}(X, T)}{\sigma_X \sigma_Y} = \frac{\operatorname{Cov}(X, T)}{\sigma_X \sigma_T} \cdot \frac{\sigma_T}{\sigma_Y} = \operatorname{corr}(X, T) \sqrt{\frac{\operatorname{Var}(T)}{\operatorname{Var}(Y)}} = \operatorname{corr}(X, T) \sqrt{\operatorname{Rel}(Y)}.$$
Square both sides of that equation to get $\operatorname{corr}(X, Y)^2 = \operatorname{corr}(X, T)^2 \operatorname{Rel}(Y)$. Since we only have a single predictor, $R^2 = \operatorname{corr}(X, Y)^2$, so this is equivalent to $R^2_{X \to Y} = R^2_{X \to T} \operatorname{Rel}(Y)$ (though apparently a similar argument also works to derive the same result with multiple predictors). So the attenuation correction factor is $\operatorname{Rel}(Y)$.
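Here’s a sketch checking that identity end-to-end, extending the same toy setup (again, the particular variances are assumptions, chosen to give $\operatorname{corr}(X, T)^2 = 0.50$ and $\operatorname{Rel}(Y) = 0.55$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Assumed setup: Var(X) = 0.275, Var(T) = 0.55, Var(Y) = 1.0, so that
# corr(X, T)^2 = Var(X) / Var(T) = 0.50 and Rel(Y) = Var(T) / Var(Y) = 0.55.
X = rng.normal(0, np.sqrt(0.275), n)
T = X + rng.normal(0, np.sqrt(0.275), n)
Y = T + rng.normal(0, np.sqrt(0.45), n)

r2_XT = np.corrcoef(X, T)[0, 1] ** 2  # ≈ 0.50
r2_XY = np.corrcoef(X, Y)[0, 1] ** 2  # ≈ 0.275
print(r2_XY, r2_XT * 0.55)            # both ≈ 0.275 = R²(X→T) · Rel(Y)
```

The simulated value matches the $R^2_{X \to Y} \approx 0.275$ figure below.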
If $R^2_{X \to T} = 0.50$ and $\operatorname{Rel}(Y) \approx 0.55$, then $R^2_{X \to Y} \approx 0.50 \times 0.55 = 0.275$, not $0.50 \times 0.55^2 \approx 0.151$.
So, in this case the attenuation factor is just the correlation. When would it have been the squared correlation (like you originally had)? The key was that we started with the test-retest correlation $\operatorname{corr}(Y_1, Y_2)$, and we showed that $\operatorname{Rel}(Y) = \operatorname{corr}(Y_1, Y_2)$. If what we had instead was $\operatorname{corr}(T, Y)$, then $\operatorname{corr}(T, Y) = \sqrt{\operatorname{Var}(T)/\operatorname{Var}(Y)} = \sqrt{\operatorname{Rel}(Y)}$, and the attenuation factor would instead have been $\operatorname{corr}(T, Y)^2$.
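One last sketch to illustrate that distinction (same assumed variances as above): the test-retest correlation already equals $\operatorname{Rel}(Y)$, while the true-score correlation $\operatorname{corr}(T, Y)$ equals $\sqrt{\operatorname{Rel}(Y)}$ and only matches after squaring:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Assumed variances as before: Var(T) = 0.55, Var(E) = 0.45.
T = rng.normal(0, np.sqrt(0.55), n)
E1, E2 = rng.normal(0, np.sqrt(0.45), (2, n))
Y1, Y2 = T + E1, T + E2

print(np.corrcoef(Y1, Y2)[0, 1])      # ≈ 0.55  = Rel(Y)
print(np.corrcoef(T, Y1)[0, 1])       # ≈ 0.742 = √0.55
print(np.corrcoef(T, Y1)[0, 1] ** 2)  # ≈ 0.55  = Rel(Y) again
```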