In my mental model, we are at least one fundamental breakthrough away from AGI (in keeping with François Chollet’s point that intelligence is about rapid, efficient learning rather than just the application of broad knowledge).
How far we are from a breakthrough that significantly improves on that metric seems very hard to predict.
So it seems really important to ask how much a given level of LLM coding assistance enables researchers to iterate over a broader range of experiments. I don’t have enough insight into the research patterns of AI companies to have a good sense of novel experiments per researcher per month (ERM). My expectation is that an increase in ERM would give us some sense of how to update from the base rate of major conceptual breakthroughs (estimated in your article as 1 per 10 years, at 2010-2020 levels of researcher-hours per year).
To figure out the current odds of a breakthrough per year, I’d want to know how many more researchers worldwide are working on ML now than in the 2010-2020 period. I’d then discount that number on the assumption that many of the marginal additions are less inspired and experienced than the earlier researchers, retread known ground more, and run less well-designed experiments.
Then I’d want to adjust upward for the boost LLM assistance is expected to give to ERM. LLMs may also be helpful enough as research-design assistants to partly offset the drop in experiment quality caused by the influx of marginal researchers (by raising the low end, perhaps pushing some experiments that were below the threshold of relevance into potential relevance).
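To make the bookkeeping concrete, here’s a minimal sketch of that update. It assumes, for simplicity, that the breakthrough rate scales linearly with effective research effort, and every parameter value is an illustrative placeholder rather than an estimate I’d defend:

```python
# Back-of-envelope update to the breakthrough base rate.
# Assumes breakthrough rate scales linearly with effective effort;
# all parameter values below are illustrative placeholders.

BASE_RATE = 1 / 10  # major breakthroughs per year at 2010-2020 effort levels

researcher_multiplier = 3.0  # assumed growth in worldwide ML researcher count
quality_discount = 0.6       # marginal researchers retread ground, run worse experiments
llm_erm_boost = 1.2          # assumed LLM-driven multiplier on experiments/researcher/month

effective_effort = researcher_multiplier * quality_discount * llm_erm_boost
adjusted_rate = BASE_RATE * effective_effort

print(f"adjusted rate: {adjusted_rate:.2f} breakthroughs/year")
print(f"expected wait: {1 / adjusted_rate:.1f} years")
# With these placeholders: ~0.22/year, i.e. a ~4.6-year expected wait,
# down from the 10-year base rate.
```

Nudging those three multipliers across plausible ranges is what lands me in the 4-6 year window below.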
If, over the next 5 years, we see a gradual average improvement to the “ML breakthrough rate”, we should expect the next breakthrough to arrive in more like 4-6 years rather than 8-12.
If a big enough breakthrough in “learning and extrapolation rate from limited data” (aka Chollet-style intelligence) does get found, I think that puts us on a fundamentally different trend line.
I also think there’s a third “brain capability” axis along which we might see a breakthrough, though probably a less impactful one than the Chollet-style axis. I’d call it the “evolutionary programming” style: evolution has managed to shape the genetics of a baby horse such that its fetal brain development is sufficient for it to walk soon after birth with almost no learning. That seems different from both “inefficiently learned knowledge and heuristics” (the current style) and Chollet-style intelligence. This third axis would be something like researchers explicitly engineering a target skill into a model, such that the model needs only a tiny amount of training on the targeted task to become competent.
I want to note a soft lower bound here. If we compare the few-shot learning efficiency of the human brain with current SotA AI learning, both in watts expended and in number of samples, we see a gain of multiple orders of magnitude in the brain’s favor. The brain’s algorithms were found by carbon-based life using a search process that is essentially a random walk plus a satisficing ratchet: evolution. That means there are probably far more learning algorithms and architectures lurking in the wings that evolution never stumbled upon (compare the eye vs. the digital camera). The search that found the brain’s algorithms should have returned a loosely random sample of a much larger set. In contrast, the search methods humans use to find intelligence algorithms are intentional and efficiency-maximizing rather than satisficing; as a result, we see improvement on a timescale of decades rather than millions of years. So as we continue to rapidly sample candidate architectures and algorithms, we should expect an explosion of capability as we search the candidate space, even holding compute fixed.

tl;dr: Low-hanging fruit should be abundant.
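To put loose numbers on that gap, here’s a back-of-envelope comparison. Every figure is a rough public estimate or an outright placeholder, good to maybe an order of magnitude; the point is the size of the gap, not the exact digits:

```python
import math

# Rough sample- and energy-efficiency comparison, human vs. LLM pretraining.
# All figures are order-of-magnitude placeholders, not careful estimates.

human_words = 1e8   # words a person hears growing up (rough estimate)
llm_tokens = 1e13   # tokens in a modern LLM pretraining run (rough estimate)

human_joules = 20 * 3600 * 24 * 365 * 18  # ~20 W brain power over ~18 years
llm_joules = 1.3e9 * 3600                 # ~1.3 GWh training run, in joules

print(f"sample gap: ~10^{round(math.log10(llm_tokens / human_words))}")
print(f"energy gap: ~10^{round(math.log10(llm_joules / human_joules))}")
# With these placeholders the gap is ~5 orders of magnitude in samples
# and ~2-3 in energy, i.e. "multiple orders of magnitude" on either axis.
```

Even if each figure is off by an order of magnitude, the conclusion survives.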
Note that if the “evolutionary programming” axis is real and a breakthrough does occur there, it might mean researchers could “pre-program” a model to be good at rapid learning (i.e., increase its Chollet-style intelligence).
I suspect that something like this process, evolution predisposing a brain to be better at abstract learning, was a key factor in differentiating humans from earlier primates.