I don’t know what they have in mind, and I agree the first obvious thing to do is just get more data and try linear models. But there’s plenty of reason to expect gains from nonlinear models: broad-sense heritability is higher than narrow-sense heritability; there’s missing heritability (though maybe it ought to be missing given our datasets); and there are theoretical reasons (quite plausibly there’s multiple factors, as straw!tailcalled has described, e.g. in an OR of ANDs circuit; and generally nonlinearities, e.g. U-shaped responses in latents like “how many neurons to grow”).
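To make the OR-of-ANDs point concrete, here’s a minimal numpy sketch (a toy I’m making up, not anything from the discussion above): a trait that fires when (locus 0 AND locus 1) OR (locus 2 AND locus 3), fit with the best possible linear predictor versus a tiny two-layer circuit. The loci, sample sizes, and allele coding are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2000, 6
X = (rng.random((n, m)) < 0.5).astype(float)  # binary alleles, for simplicity

# Hypothetical OR-of-ANDs trait: (locus0 AND locus1) OR (locus2 AND locus3)
y = np.maximum(X[:, 0] * X[:, 1], X[:, 2] * X[:, 3])

# Best linear predictor (with intercept), via least squares
A = np.hstack([X, np.ones((n, 1))])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_mse = np.mean((A @ beta - y) ** 2)

# A tiny circuit fits it exactly: for binary a, b, OR(a, b) = a + b - a*b
circuit = X[:, 0] * X[:, 1] + X[:, 2] * X[:, 3] \
          - X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3]
circuit_mse = np.mean((circuit - y) ** 2)

print(linear_mse, circuit_mse)
```

The linear model is left with irreducible error (the nonlinear Fourier mass of the Boolean function), while the small circuit, with a handful of product terms, matches the trait exactly.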
My guess, without knowing much, is that one of the first sorts of things to try is small circuits. A deep neural net (i.e. a differentiable circuit) is a big circuit: it has many, many hidden nodes (latent variables). A linear model is a tiny circuit: it has, say, one latent (the linear combination), maybe with a nonlinearity applied to that latent. (We’re not counting the input nodes.)
What about small but not tiny circuits? You could have, for example, a sum of ten lognormals, or a product of ten sums. At a guess, maybe this sort of thing:

- captures substantially more of the structure of the trait, and so has in its hypothesis space predictors that are significantly better than any possible linear PGS;
- is still pretty low complexity / doesn’t have vanishing gradients, or something, such that you can realistically learn given the fairly small datasets we have.
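As a sketch of the learnability point (everything here, the sample size, SNP count, and two-factor setup, is a made-up toy): a product of two sums has only twice the parameters of a linear PGS, and because each factor is linear when the other is held fixed, you can fit it by alternating least squares, where each step is an exact minimization and the loss never increases.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small-data regime: 300 individuals, 20 SNPs coded 0/1/2.
n, m = 300, 20
X = rng.integers(0, 3, size=(n, m)).astype(float)

# Ground truth: a product of two sums (so the model class is well-specified).
u = rng.normal(0, 0.2, m)
v = rng.normal(0, 0.2, m)
y = (X @ u + 1.0) * (X @ v + 1.0)

# 2(m+1) = 42 parameters, vs m+1 = 21 for a linear PGS: a small hypothesis space.
Xa = np.hstack([X, np.ones((n, 1))])     # append intercept column
a = rng.normal(0, 0.01, m + 1)
b = np.zeros(m + 1)
b[-1] = 1.0                              # start with the second factor == 1

def loss(a, b):
    return np.mean(((Xa @ a) * (Xa @ b) - y) ** 2)

init = loss(a, b)
# Alternating least squares: with one factor fixed, the model is linear in
# the other, so each half-step is an ordinary least-squares solve.
for _ in range(20):
    s = Xa @ b
    a, *_ = np.linalg.lstsq(Xa * s[:, None], y, rcond=None)
    s = Xa @ a
    b, *_ = np.linalg.lstsq(Xa * s[:, None], y, rcond=None)

print(init, loss(a, b))  # loss is monotonically nonincreasing across solves
```

This is only meant to illustrate that such circuits sit in a low-complexity, tractably-optimizable corner; real traits and real fitting procedures would be messier.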