small identity comments on Heritability: Five Battles

small identity 28 Dec 2025 0:08 UTC
All methodology is from the first section of the appendix of the linked paper. The paper cites pages 81-87 of Genetics and Analysis of Quantitative Traits. I read from chapter 4 up until those pages to understand the method conceptually. Every niceness assumption (for example, “no assortative mating”) is made except for “no shared environment.”

Changing some notation: $r_{M Z} = S + 1$, i.e. we normalize so that the total “variance due to genes” is 1. We assume that the variance due to shared environment is the same for twins and non-twins, $S_{M Z} = S_{D Z} = S$. This is a standard assumption in ACE, and it seems reasonable. $R^{2}$ will represent the “nonlinear part of the effect due to genes,” i.e. that due to epistasis and dominance. $V_{A} = 1 - R^{2}$ is the effect due to alleles, what you called $s$.
Facts:
$2 (r_{M Z} - r_{D Z} - \frac{1}{2}) < R^{2} \leq 4 (r_{M Z} - r_{D Z} - \frac{1}{2})$ always. (1)

When $S = 0$ :
$2 (\frac{1}{2} - \frac{r_{D Z}}{r_{M Z}}) < R^{2} \leq 4 (\frac{1}{2} - \frac{r_{D Z}}{r_{M Z}})$ . (2)

Note that $R^{2} \leq 1$, so the upper bound becomes trivial once the parenthesized quantity in (1) or (2) reaches $0.25$.
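As a quick sketch, the two facts can be written as code. The function names are my own, not from the paper, and the inputs to (1) are assumed to be in the normalized units above (genetic variance equal to 1, so $r_{M Z} = S + 1$). The clamping to $[0, 1]$ is also my addition, reflecting only that $0 \leq R^{2} \leq 1$.

```python
def r2_bounds_difference(r_mz, r_dz):
    """Bounds (1): interval for R^2 from the normalized difference.

    Assumes r_mz and r_dz are already scaled so genetic variance is 1.
    Returns (lower, upper); lower is exclusive, upper is inclusive.
    """
    gap = r_mz - r_dz - 0.5
    lower = max(0.0, 2.0 * gap)   # R^2 cannot be negative
    upper = min(1.0, 4.0 * gap)   # R^2 cannot exceed 1
    return lower, upper


def r2_bounds_ratio(r_mz, r_dz):
    """Bounds (2): interval for R^2 from the ratio, valid only when S = 0."""
    gap = 0.5 - r_dz / r_mz
    lower = max(0.0, 2.0 * gap)
    upper = min(1.0, 4.0 * gap)
    return lower, upper


# Illustrative numbers only (S = 0, so r_MZ = 1 in normalized units).
print(r2_bounds_difference(1.0, 0.4))
print(r2_bounds_ratio(1.0, 0.4))
# At a gap of 0.25 the upper bound hits the trivial cap of 1.
print(r2_bounds_ratio(1.0, 0.25))
```

With $S = 0$ the two versions agree, which is what we expect since (2) is (1) with $r_{M Z} = 1$.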

Explanation:
We can decompose $r_{M Z} = S + V_{A} + R^{2}$. We can decompose this further into $R^{2} = \sum_{i, j \geq 0, (i, j) \neq (1, 0)} V_{A^{i} D^{j}}$ (there is no $(0, 0)$ component, so read $V_{A^{0} D^{0}} = 0$ throughout). That is, we’re taking the nonlinear part and decomposing it into interactions involving alleles across $i$ loci and dominance effects in $j$ loci.
To understand dominance effects, note that a locus can have 0, 1, or 2 instances of an allele. The respective phenotypes resulting from these might not be produced by any linear function on alleles, because not every three points are collinear. The dominance term is the error resulting from a linear regression. If we were haploid, we wouldn’t have to deal with this.
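To make the regression picture concrete, here is a minimal single-locus sketch. It assumes Hardy-Weinberg genotype frequencies for an allele with frequency `p` (my assumption for the illustration), regresses phenotype on allele count, and calls the explained variance additive and the residual variance dominance. The function name and arguments are hypothetical.

```python
def additive_dominance_split(p, g0, g1, g2):
    """Split single-locus genetic variance into additive and dominance parts.

    p          : allele frequency, strictly between 0 and 1
    g0, g1, g2 : phenotype for 0, 1, 2 copies of the allele
    Genotype frequencies are assumed to follow Hardy-Weinberg proportions.
    """
    q = 1.0 - p
    freqs = [q * q, 2.0 * p * q, p * p]
    counts = [0.0, 1.0, 2.0]
    values = [g0, g1, g2]

    mean_x = sum(f * x for f, x in zip(freqs, counts))
    mean_g = sum(f * g for f, g in zip(freqs, values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    var_g = sum(f * (g - mean_g) ** 2 for f, g in zip(freqs, values))
    cov_xg = sum(
        f * (x - mean_x) * (g - mean_g)
        for f, x, g in zip(freqs, counts, values)
    )

    slope = cov_xg / var_x              # weighted least-squares slope
    v_additive = slope ** 2 * var_x     # variance explained by the line
    v_dominance = var_g - v_additive    # residual variance
    return v_additive, v_dominance


# Collinear phenotypes: the line fits exactly, so no dominance variance.
print(additive_dominance_split(0.5, 0.0, 1.0, 2.0))
# Fully recessive trait: the three points are not collinear.
print(additive_dominance_split(0.5, 0.0, 0.0, 1.0))
```

In the recessive example a third of the locus's variance ends up in the residual, even though only one locus is involved.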
So for example, $V_{A^{5} D^{3}}$ refers to phenotype effects that only appear when there is a specific combination of two alleles at three separate loci, and are multilinear in the alleles occurring at five other loci.
Interpreting this in context, $r_{D Z} = S_{D Z} + \frac{1}{2} V_{A} + \sum_{i, j \geq 0, (i, j) \neq (1, 0)} 2^{- i - 2 j} V_{A^{i} D^{j}}$ .
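Now we can justify our conclusions. Every term in the third sum has $i + 2 j \geq 2$, so its coefficient is at most $\frac{1}{4}$ and the sum is at most $\frac{1}{4} R^{2}$. It has no positive lower bound: high-order interactions make the coefficients arbitrarily close to 0. When we have to write it out, we’ll call it $X$.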
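A small sketch of how $X$ behaves, assuming the nonlinear components are given as a dictionary from $(i, j)$ to $V_{A^{i} D^{j}}$ (a representation I made up for illustration):

```python
def dz_nonadditive_term(components):
    """Return X = sum of 2^(-i - 2j) * V_{A^i D^j} over nonlinear components.

    components maps (i, j) -> variance; (1, 0) is the additive term and
    (0, 0) is empty, so neither is allowed here.
    """
    total = 0.0
    for (i, j), variance in components.items():
        if (i, j) in ((0, 0), (1, 0)) or i < 0 or j < 0:
            raise ValueError(f"not a nonlinear component: {(i, j)}")
        total += variance * 2.0 ** (-i - 2 * j)
    return total


r2 = 0.3  # illustrative nonlinear share, not taken from any data

# Pairwise epistasis or single-locus dominance: X sits at the cap R^2 / 4.
print(dz_nonadditive_term({(2, 0): r2}), r2 / 4)
print(dz_nonadditive_term({(0, 1): r2}), r2 / 4)
# A ten-locus interaction: X is nearly zero for the same R^2.
print(dz_nonadditive_term({(10, 0): r2}))
```

So for a fixed $R^{2}$, DZ twins can share anywhere from a quarter of the nonlinear variance down to almost none of it, depending on the interaction order.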
Remember that $R^{2}$ is exactly the proportion of variance due to genes that cannot be captured by a polygenic score, the “phantom heritability.” The paper is concerned with how substantial $R^{2}$ means $V_{A} < 1$, so that if the polygenic score is close to $V_{A}$ people will assume there is missing heritability when in reality the polygenic score is perfect and the heritability is simply nonlinear.

The ACE Estimate:
$2 (r_{M Z} - r_{D Z}) = V_{A} + \sum_{i, j \geq 0, (i, j) \neq (1, 0)} (2 - 2^{- i - 2 j + 1}) V_{A^{i} D^{j}}$ . This disagrees with the figure in the appendix of the paper. I believe they made an arithmetic error, but it is possible I made a conceptual error.
Recalling $V_{A} = 1 - R^{2}$ , $r_{M Z} - r_{D Z} - \frac{1}{2} = \sum_{i, j \geq 0, (i, j) \neq (1, 0)} (\frac{1}{2} - 2^{- i - 2 j}) V_{A^{i} D^{j}}$ . Those constant terms in the sum go as low as $\frac{1}{4}$ and arbitrarily close to $\frac{1}{2}$ , so by taking the bounds and dividing we recover (1).
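Here is a numeric sanity check of the identity and of (1), as a sketch. The component values and $S$ are made up, and the dictionary representation of the $V_{A^{i} D^{j}}$ is my own.

```python
def twin_correlations(S, components):
    """Return (r_MZ, r_DZ) in units where total genetic variance is 1.

    components maps (i, j) -> V_{A^i D^j} for the nonlinear terms only.
    """
    r2 = sum(components.values())   # nonlinear share R^2
    v_a = 1.0 - r2                  # additive share V_A
    x = sum(v * 2.0 ** (-i - 2 * j) for (i, j), v in components.items())
    r_mz = S + v_a + r2
    r_dz = S + 0.5 * v_a + x
    return r_mz, r_dz


def ace_estimate(r_mz, r_dz):
    """Twice the MZ-DZ difference, as in the text."""
    return 2.0 * (r_mz - r_dz)


S = 0.2
components = {(2, 0): 0.2, (0, 1): 0.1, (3, 0): 0.1}   # R^2 = 0.4
r2 = sum(components.values())

r_mz, r_dz = twin_correlations(S, components)
estimate = ace_estimate(r_mz, r_dz)

# Right-hand side of the ACE identity, term by term.
rhs = (1.0 - r2) + sum(
    (2.0 - 2.0 ** (-i - 2 * j + 1)) * v for (i, j), v in components.items()
)
gap = r_mz - r_dz - 0.5

print(estimate, rhs)           # both approximately 1.225
print(2 * gap, r2, 4 * gap)    # R^2 lies between the two bounds of (1)
```

In this example the estimate comes out above 1, i.e. above the total genetic variance in these units.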

The Rule of Thumb:
$\frac{r_{D Z}}{r_{M Z}} = \frac{S + \frac{1}{2} V_{A} + X}{S + V_{A} + R^{2}} = \frac{S}{S + 1} + \frac{1}{2 (S + 1)} (1 - R^{2}) + \frac{1}{S + 1} X$. Recalling the bounds on $X$, we can write $X = \alpha R^{2}$, where $\alpha \in (0, \frac{1}{4}]$. Combining this with the middle $- R^{2}$ term, and using $\frac{S}{S + 1} + \frac{1}{2 (S + 1)} = \frac{1}{2} + \frac{S}{2 (S + 1)}$, we have $\frac{r_{D Z}}{r_{M Z}} = \frac{1}{2} + \frac{S}{2 (S + 1)} - \beta R^{2}$, where $\beta = \frac{1/2 - \alpha}{S + 1} \in [\frac{1}{4 (S + 1)}, \frac{1}{2 (S + 1)})$. Doing the arithmetic,
$R^{2} \in (2 (S + 1) (\frac{1}{2} + \frac{S}{2 (S + 1)} - \frac{r_{D Z}}{r_{M Z}}), 4 (S + 1) (\frac{1}{2} + \frac{S}{2 (S + 1)} - \frac{r_{D Z}}{r_{M Z}})]$. Picking $S = 0$ yields (2).
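As a check on the general-$S$ interval, this sketch works directly from $\frac{r_{D Z}}{r_{M Z}} = \frac{S + \frac{1}{2} V_{A} + X}{S + 1}$, so the constant term is computed as $\frac{S}{S + 1} + \frac{1}{2 (S + 1)}$. The numbers are invented for illustration and the function name is mine.

```python
def r2_interval_from_ratio(ratio, S):
    """Interval for R^2 given r_DZ / r_MZ and shared-environment variance S.

    S is in units where total genetic variance is 1.
    Returns (lower, upper); lower is exclusive, upper is inclusive.
    """
    constant = S / (S + 1.0) + 1.0 / (2.0 * (S + 1.0))  # ratio when R^2 = 0
    gap = constant - ratio
    return 2.0 * (S + 1.0) * gap, 4.0 * (S + 1.0) * gap


# Made-up ground truth: S = 0.2, V_A = 0.6, R^2 = 0.4, X = 0.0875.
S, v_a, r2, x = 0.2, 0.6, 0.4, 0.0875
r_mz = S + v_a + r2
r_dz = S + 0.5 * v_a + x
ratio = r_dz / r_mz

lower, upper = r2_interval_from_ratio(ratio, S)
print(lower, r2, upper)    # the true R^2 falls inside the interval

# Using S = 0 when the true S is 0.2 gives an interval that misses R^2.
print(r2_interval_from_ratio(ratio, 0.0))
```

The second call suggests why the rule of thumb depends on the “no shared environment” assumption: with the same ratio but $S$ wrongly set to 0, the interval here lands far below the true $R^{2}$.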

Comments:

I don’t yet rigorously understand how $R^{2}$ is decomposed into epistasis and dominance. The book gives only an intuition and not a proof. It is very ad hoc.
Edit: As of yesterday, I now understand.