Sorry, I’ve been meaning to make an update on this for weeks now. We’re going to open source all the code we used to generate these graphs and do a full write-up of our methodology.
Kman can comment on some of the more intricate details of our methodology (he’s the one responsible for the graphs), but for now I’ll just say that there are aspects of direct vs indirect effects that we still don’t understand as well as we would like. In particular, there are a few papers showing a negative correlation between direct and indirect effects in a way that is distinct to intelligence (i.e. you don’t see the same kind of negative correlation for educational attainment or height or anything like that). It’s not clear to us at the moment what’s actually causing those effects, or why different papers disagree on the size of their impact.
In the latest version of the IQ gain graph we’ve made three updates:
We fixed a bug where we squared a term that should not have been squared (this resulted in a slight reduction in the effect size estimate).
We now assume only ~82% of the effect alleles are direct, which further reduces the estimated benefit. Our original estimate was based on a finding that direct effects account for ~100% of the heritable variance in IQ when estimated with the LDSC method. Based on the results of the Lee et al Educational Attainment 4 (EA4) study, I think this was too optimistic.
We now assume our predictor can explain more of the variance. This update was made after talking with one of the embryo selection companies and finding that their predictor is much better than the publicly available predictor we were using.
The net result is actually a noticeable increase in the estimated efficacy of editing for IQ: I think the expected gain went from ~50 to ~85 IQ points assuming 500 edits.
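If it helps to see how those adjustments trade off against each other, here’s a toy back-of-the-envelope version. Every number and both scaling rules below are invented for illustration; this is not our actual model, which is what we’ll be open sourcing:

```python
# Toy illustration only: all numbers here are made up for the example.
base_gain = 62.0             # hypothetical IQ-point gain from 500 edits, old model
direct_fraction = 0.82       # update 2: share of effect-allele signal that is direct
r2_old, r2_new = 0.06, 0.12  # hypothetical variance explained, old vs new predictor

# Simplifying assumptions: editing gains scale linearly with the direct fraction
# (each allele's effect attenuates uniformly) and roughly with the square root
# of the predictor's variance explained.
gain = base_gain * direct_fraction * (r2_new / r2_old) ** 0.5
print(round(gain))  # ~72: shows the direction of each adjustment, not the real estimate
```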
It’s a little frustrating to find that we made the two mistakes we did. But oh well; part of the reason to make stuff like this public is so others can point out mistakes in our modeling. In hindsight, I think we should have done the traditional academic thing and run the model by a few statistical geneticists before publishing. We only talked to one, and he didn’t get into enough depth for us to discover the issues we later found.
I am glad that you guys fixed bugs and got stronger estimates.
I suspect you fitted the model using best practices, so the methodology is not my main critique, though I suspect there is insufficient shrinkage in your estimates (and in most other published estimates for polygenic traits and diseases).
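To be concrete about what I mean by shrinkage, here is a minimal empirical-Bayes sketch; the normal prior and the method-of-moments fit are my own illustration, not anything from your model:

```python
import numpy as np

def shrink_effects(beta_hat, se):
    # Empirical-Bayes posterior means under beta_i ~ N(0, tau2),
    # beta_hat_i ~ N(beta_i, se_i^2).
    # Method-of-moments prior variance: E[beta_hat^2 - se^2] = tau2.
    tau2 = max(float(np.mean(beta_hat**2 - se**2)), 0.0)
    # Shrink each estimate toward zero, more aggressively when its se is large.
    return beta_hat * tau2 / (tau2 + se**2)

# Example: many small true effects measured noisily, as in a GWAS.
rng = np.random.default_rng(0)
true_beta = rng.normal(0.0, 0.05, size=10_000)
se = np.full(10_000, 0.10)
beta_hat = true_beta + rng.normal(0.0, se)
shrunk = shrink_effects(beta_hat, se)

# Summing the 500 largest raw estimates badly overstates what is achievable;
# the shrunken sum is much closer to the truth at those same indices.
top = np.argsort(-np.abs(beta_hat))[:500]
print(np.abs(beta_hat[top]).sum(), np.abs(shrunk[top]).sum(), np.abs(true_beta[top]).sum())
```

The winner’s-curse flavor of this matters for editing in particular, since the procedure sums the largest estimated effects.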
It’s the extrapolation from the models that I am skeptical of. There is a big difference between predicting within sample, where by definition ~95% of the data lies between IQ 70 and 130, and assuming the model also predicts correctly when you edit outside this range. Take your upper bound of +85 IQ points with 500 edits: applied to a baseline human with IQ 100, his child would get an IQ of 185, which is so high that only about 60 of the 8 billion people on Earth would be that smart if IQ were actually drawn from a normal distribution with mean 100 and SD 15. And if we got to 195 by starting with an IQ 110 human, he would have a ~90% chance of being the smartest person alive, which I think is unlikely. I find it unlikely because there could be interaction effects, or a misspecified likelihood, that make a huge difference for the ~2% of the data above 130 but almost no difference for the other 98%. So you cannot test which likelihood is correct with conventional likelihood ratio testing, because you care about a region of the data that is essentially unobserved.
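The arithmetic behind those tail claims, assuming nothing beyond IQ ~ Normal(100, 15) and a population of 8 billion:

```python
# Sanity-check the tail numbers, assuming IQ ~ Normal(100, 15) worldwide.
from scipy.stats import norm

POP = 8_000_000_000

# IQ 100 baseline plus the claimed +85 gain gives IQ 185.
print(POP * norm.sf(185, loc=100, scale=15))  # ~60 people expected above 185

# IQ 110 baseline plus +85 gives IQ 195.
print(POP * norm.sf(195, loc=100, scale=15))  # ~1 person expected above 195
```

With roughly one unedited person expected above 195, someone edited to that level would very likely sit at or near the top of the distribution, which is exactly the region where the model has never been tested.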
The second point is the distinction between being causal for the association observed in the data and being causal when intervening on the genome; I suspect more than half of the genes are only causal for the association. I also imagine there are a lot of genes that are indirectly causal for IQ, for example by making you an attentive parent and thus lowering the probability that your kid sleeps in a room with a lot of mold. That would not make the super baby smarter, but it would make the subsequent generation smarter.
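A toy one-locus simulation makes the distinction concrete (the effect sizes are invented): a population regression partly absorbs the parental effect, but editing the child’s genotype only delivers the direct part.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# One biallelic locus, allele frequency 0.5; each child inherits one allele
# from each parent.
mother = rng.binomial(2, 0.5, n)
father = rng.binomial(2, 0.5, n)
child = rng.binomial(1, mother / 2) + rng.binomial(1, father / 2)

b_direct, b_indirect = 0.10, 0.10  # invented per-allele effects
# Parental alleles shape the rearing environment; child alleles act directly.
y = b_direct * child + b_indirect * (mother + father) / 2 + rng.normal(0, 1, n)

# A population GWAS-style regression absorbs half of the parental per-allele
# effect (child and mid-parent genotype are correlated), so it reports ~0.15 ...
print(np.cov(child, y)[0, 1] / np.var(child))
# ... but an edit to the child's genotype moves y by only b_direct = 0.10.
```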
So in theory I think we could probably validate IQ scores of up to 150-170 at most. I had a conversation with the guys from Riot IQ and they think that with larger sample sizes the tests can probably extrapolate out that far.
We do have at least one example of a guy whose height is 7 standard deviations above the mean actually showing up as a really extreme outlier due to additive genetic effects.
The outlier here is Shawn Bradley, a former NBA player. Study here
Granted, Shawn Bradley was chosen for this study because he is a very tall person who does not suffer from the pituitary gland dysfunction that affects many of the tallest players. But that’s actually more analogous to what we’re trying to do with gene editing: increasing additive genetic variance to get outlier predispositions.
I agree this is not enough evidence. I think there are some clever ways we can check how far additivity continues to hold outside the normal range, such as checking the accuracy of predictors at different PGS levels, and maybe some clever stuff in livestock.
This is on our to-do list. We just haven’t had quite enough time to do it yet.
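To sketch what the first of those checks could look like (hypothetical column names and an illustrative binning scheme, not a finished analysis): bin people by polygenic score, fit the phenotype-on-PGS slope within each bin, and see whether the slope stays flat out in the tails, as additivity predicts.

```python
import numpy as np
import pandas as pd

def slope_by_pgs_bin(df: pd.DataFrame, n_bins: int = 10) -> pd.Series:
    # Phenotype-on-PGS regression slope within each PGS quantile bin.
    # Under a purely additive model the slope should be roughly constant
    # across bins; slopes that flatten in the extreme bins would suggest
    # additivity starts breaking down in the tails.
    binned = df.assign(bin=pd.qcut(df["pgs"], n_bins, labels=False))
    return binned.groupby("bin")[["pgs", "phenotype"]].apply(
        lambda g: np.polyfit(g["pgs"], g["phenotype"], 1)[0]
    )
```

In practice you’d also want standard errors per bin (middle quantile bins span a narrow score range, so their within-bin slopes are noisy) and ideally within-family comparisons, but that’s the shape of the test.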
The second point is the distinction between being causal for the association observed in the data and being causal when intervening on the genome; I suspect more than half of the genes are only causal for the association. I also imagine there are a lot of genes that are indirectly causal for IQ, for example by making you an attentive parent and thus lowering the probability that your kid sleeps in a room with a lot of mold. That would not make the super baby smarter, but it would make the subsequent generation smarter.
There are some, but not THAT many. EA4, the largest study of educational attainment to date, estimated the indirect effects for IQ at (I believe) about 18%. We accounted for that in the second version of the model.
It’s possible that’s wrong. There is a frustratingly wide range of estimates for the indirect effect sizes for IQ in the literature. @kman can talk more about this, but I believe some of the studies showing larger indirect effects get such large numbers because they fail to account for the low test-retest reliability of the UK Biobank fluid intelligence test.
I think 0.18 is a reasonable estimate for the proportion of the genetic effect on intelligence that comes from indirect effects. But I’m open to evidence that our estimate is wrong.