My first thought was #2, that we overestimate the size of the IQ differences because we can only measure on the observed scale. But this doesn’t seem fully satisfactory. I know that connectivity is very much in vogue and I don’t underestimate its importance, but I have recently been concerned that focusing on connectivity leads us to overlook the importance of neuronal-intrinsic factors. One particular area of interest is synaptic cycling. I think about the importance of neuronal density and then consider how much could be gained by subtle additive genetic effects that lead to improved use/reuse of the same synapses. Without altering neuronal density at all, a 10% improvement in how quickly a synapse can form, a synaptic vesicle can be repurposed, and a neuron can be ready to fire again should be effectively tantamount to a ~10% gain in neuronal density. In other words, the architecture looks the same but performs at a substantially higher throughput.
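To make the arithmetic behind that equivalence explicit, here is a toy sketch. It assumes (my simplification, not an established neuroscience model) that effective throughput is just the product of neuron count and per-neuron firing rate; the function name and the baseline numbers are hypothetical and chosen only for illustration.

```python
def throughput(neuron_count: float, firings_per_second: float) -> float:
    """Toy proxy: total firing events per second across the whole population.

    Assumption: throughput scales multiplicatively with both inputs.
    """
    return neuron_count * firings_per_second


# Hypothetical baseline values, not measurements.
BASE_NEURONS = 1_000_000
BASE_RATE = 100.0

baseline = throughput(BASE_NEURONS, BASE_RATE)

# Same architecture, each neuron cycles 10% faster.
faster_cycling = throughput(BASE_NEURONS, BASE_RATE * 1.10)

# Same cycling speed, 10% more neurons.
denser = throughput(BASE_NEURONS * 1.10, BASE_RATE)

print("gain from faster cycling:", faster_cycling / baseline)
print("gain from higher density:", denser / baseline)
```

Under this multiplicative assumption the two gains are identical, which is all the "tantamount to a ~10% gain in neuronal density" claim needs. If the real relationship is not multiplicative, the equivalence would only hold approximately.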
Thinking about these as changes in hyperparameters is probably the closest analogy from an ML perspective. I should note that my own area of expertise is genetic epidemiology and neuroscience, not ML, so I am less fluent discussing the computational domain than human-adjacent biological structures. At the risk of speaking out of my depth, I offer the following from the perspective of a geneticist/neuroscientist: My intuition (FWIW) is that all human brains are largely running extremely similar models, and that the large IQ differences observed are due to either 1) inter-individual variability in neuronal performance (the cycling aspect I reference above), or 2) the number of parameters that can be quickly called from storage. The former seems analogous to two machines running the same software but with an underlying difference in hardware (e.g., clock rate), while the latter seems more analogous to two machines running the same software but with vastly different amounts of RAM. I can’t decide whether having better functionality at the level of individual neurons is more likely to generate benefit in the “clock rate” or the “RAM” domain. Both seem plausible, and again, my apologies for jettisoning LLM analogies for more historical ones drawn from the PC era. At least I didn’t say some folks were still running vacuum tubes instead of transistors!