The SNP itself is (usually) not causal. Genotyping arrays select SNPs whose genotype is correlated with a region around the SNP; they are said to be in linkage with this region, as the region tends to be inherited together when recombination happens in meiosis. This is a matter of degree, and linkage scores allow thresholds to be set for how indicative a SNP is of the genotype of a given region.
This is taken into account by our models, and is why we see such large gains in editing power from increasing data set sizes: we’re better able to find the causal SNPs. Our editing strategy assumes that we’re largely hitting non-causal SNPs.
In practice, epistatic interactions between QTLs matter for effect sizes, and you cannot naively add up the effect sizes of all the QTLs for a trait and expect the result to reflect the real effect size, even if >50% of the effects are additive.
I’m not aware of any evidence for substantial effects of this sort on quantitative traits such as height. We’re also adding up expected effects, and as long as those estimates are unbiased the errors should cancel out as you do enough edits.
One thing we’re worried about is cases where the haplotypes have the small additive effects rather than individual SNPs, and you get an unpredictable (potentially deleterious) effect if you edit to a rare haplotype even if all SNPs involved are common. Are you aware of any evidence suggesting this would be a problem?
Could you expand on what sense you have ‘taken this into account’ in your models? What are you expecting to achieve by editing non-causal SNPs?
The first paper I linked is about epistatic effects on the additivity of QTLs for a quantitative trait, specifically heading date in rice, so this is evidence for this sort of effect on such a trait.
The general problem is that, without a robust causal understanding of what an edit does, it is very hard to predict what sorts of problems might arise from novel combinations of variants in a haplotype. That’s just the nature of complex systems: a single incorrect base in the wrong place may have no effect or cause a critical cascading failure. You don’t know until you test it or have characterized the system so well you can graph out exactly what is going to happen. Just testing it in humans and seeing what happens is eventually going to hit something detrimental. When you are trying to do enhancement, you tend to need a positive expectation that it will be safe, not just no reason to think it won’t be. Many healthy people would be averse to risking good health for their kid, even at low probability of a bad outcome.
Could you expand on what sense you have ‘taken this into account’ in your models? What are you expecting to achieve by editing non-causal SNPs?
If we have a SNP that we’re 30% sure is causal, we expect to get 30% of its effect conditional on it being causal. Modulo any weird interaction stuff from rare haplotypes, which is a potential concern with this approach.
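Written out as a worked equation (a sketch only: the symbols $p_i$, $\beta_i$ and $\Delta_i$ are introduced here for illustration and are not notation from the thread), let $p_i$ be the probability that SNP $i$ is causal, $\beta_i$ its effect if it is causal, and $\Delta_i$ the realized change from editing it. Assuming an edit to a non-causal SNP contributes exactly zero:

```latex
\mathbb{E}[\Delta_i] = p_i \,\beta_i + (1 - p_i)\cdot 0 = p_i \,\beta_i ,
\qquad
\mathbb{E}\!\left[\sum_{i=1}^{n} \Delta_i\right] = \sum_{i=1}^{n} p_i \,\beta_i .
```

With $p_i = 0.3$ this gives the 30% figure above. The zero term is where the rare-haplotype concern enters: if a non-causal edit can have some other effect, that term is no longer zero.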
The first paper I linked is about epistatic effects on the additivity of QTLs for a quantitative trait, specifically heading date in rice, so this is evidence for this sort of effect on such a trait.
I didn’t read your first comment carefully enough; I’ll take a look at this.
I’m curious about the basis on which you are assigning a probability of causality without a method like Mendelian randomisation, or something that tries to assign a probability of an effect based on interpreting the biology, like a coding of the output of something like SnpEff to an approximate probability of effect.
The logic of getting 30% of its effect based on a 30% chance it’s causal seems like it will be pretty high variance and will only work out over a pretty large number of edits. It also assumes no unexpected effects from edits to SNPs that are non-causal for whatever trait you are targeting but might do something else when edited.
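To put rough numbers on the variance point, here is a minimal sketch under strong assumptions that are mine, not the thread's: each edit is an independent coin flip that delivers its full effect with probability equal to its causal probability and nothing otherwise, the effect sizes are known exactly, and non-causal edits do nothing else. The function name `gain_mean_and_sd` is hypothetical.

```python
import math


def gain_mean_and_sd(pips, effects):
    """Mean and standard deviation of the total gain from a set of edits.

    Assumes edit i independently yields effects[i] with probability pips[i]
    and 0 otherwise (a Bernoulli model; an illustrative assumption).
    """
    mean = sum(p * b for p, b in zip(pips, effects))
    var = sum(p * (1.0 - p) * b * b for p, b in zip(pips, effects))
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Identical edits: 30% chance of being causal, effect of 1 unit if causal.
    for n in (1, 10, 100, 1000):
        mean, sd = gain_mean_and_sd([0.3] * n, [1.0] * n)
        print(f"n={n:5d}  mean={mean:8.2f}  sd={sd:6.2f}  sd/mean={sd / mean:.3f}")
```

Under this model the relative spread (sd divided by mean) shrinks like one over the square root of the number of edits, so the expectation is only a good guide when many edits are made. It says nothing about off-target or off-trait effects, which the model sets to zero by construction.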
I’m curious about the basis on which you are assigning a probability of causality without a method like Mendelian randomisation, or something that tries to assign a probability of an effect based on interpreting the biology, like a coding of the output of something like SnpEff to an approximate probability of effect.
Using finemapping. I.e. assuming a model where nonzero additive effects are sparsely distributed among SNPs, you can do Bayesian math to infer how probable each SNP is to have a nonzero effect and its expected effect size conditional on observed GWAS results. Things like SnpEff can further help by giving you a better prior.
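For readers who want to see the shape of that Bayesian calculation, below is a toy sketch. It is not the model used by the authors of this thread: it assumes exactly one causal SNP per region, a normal prior on the causal effect, and Wakefield-style approximate Bayes factors computed from GWAS summary statistics. The function name `single_causal_finemap`, the default `prior_var=0.04`, and the example numbers are all illustrative assumptions. Real fine-mapping tools additionally model LD between SNPs and allow several causal variants per region.

```python
import math


def single_causal_finemap(betas, ses, prior_var=0.04, priors=None):
    """Toy fine-mapping under a one-causal-variant-per-region assumption.

    betas, ses: marginal GWAS effect estimates and their standard errors.
    prior_var:  prior variance of a causal SNP's effect (assumed value).
    priors:     prior probability that each SNP is the causal one; must be
                positive. Defaults to uniform. An annotation-based prior
                (e.g. derived from variant annotations) could be passed here.
    Returns (posterior causal probabilities, shrunken effects if causal).
    """
    n = len(betas)
    if priors is None:
        priors = [1.0 / n] * n
    log_weights, shrunk = [], []
    for b, se, pi in zip(betas, ses, priors):
        v = se * se
        r = prior_var / (v + prior_var)  # shrinkage factor W / (V + W)
        z = b / se
        # log approximate Bayes factor for "effect is nonzero" vs "zero"
        log_abf = 0.5 * math.log(1.0 - r) + 0.5 * z * z * r
        log_weights.append(math.log(pi) + log_abf)
        shrunk.append(r * b)  # posterior mean effect, given the SNP is causal
    m = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - m) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights], shrunk


if __name__ == "__main__":
    # Three SNPs in one region with made-up summary statistics.
    pips, cond = single_causal_finemap([0.10, 0.09, 0.01], [0.02, 0.02, 0.02])
    for p, c in zip(pips, cond):
        print(f"P(causal)={p:.3f}  effect if causal={c:.4f}  expected={p * c:.4f}")
```

The last column is the "probability times conditional effect" quantity discussed above. Passing a non-uniform `priors` list is where functional annotation could sharpen the result.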
One thing we’re worried about is cases where the haplotypes have the small additive effects rather than individual SNPs, and you get an unpredictable (potentially deleterious) effect if you edit to a rare haplotype even if all SNPs involved are common.
This is a point of uncertainty that bothered me when I was doing a similar analysis a while ago. GWAS data is possibly good enough to estimate causal effects of haplotypes, but that’s not enough information to do single base edits. To have reasonable confidence of getting the predicted effect, it’d be necessary to make all the edits to transform the original haplotype into a different haplotype.
And unlike with distant variants where additive effects dominate, it’d make sense if non-additive effects are strong locally, since the variants are near each other. Whether this is actually true in reality is way beyond my knowledge, though.
Can you comment on your current thoughts on rare haplotypes?
Don’t have much to say on it right now, I really need to do a deep dive into this at some point.
(For people reading this thread who want an intro to finemapping, this lecture is a great place to start for a high-level overview: https://www.youtube.com/watch?v=pglYf7wocSI)