There are two specific shoes that are yet to drop, and only one of them is legibly timed; AI-2027 simply takes the premise that both fall in 2027. There’s funding-fueled scaling of frontier AI training systems that will run out (absent AGI) in 2027-2029, which at the model level will be felt in 2028-2030. And there’s test-time adaptation/learning/training that accrues the experience/onboarding/memory of a given AI instance in the form of weight updates, rather than as artifacts outside the model that the model can observe.
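For concreteness, here is a minimal sketch of that distinction in PyTorch; the toy model, shapes, and the reconstruction objective are all illustrative assumptions of mine, not anything from AI-2027 or any lab’s actual method. The point is only the shape of the loop: the instance’s experience lands in its weights, not in a context window or an external store.

```python
import copy
import torch
import torch.nn as nn

# Toy stand-in for a pretrained model (hypothetical shapes).
base_model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))

def adapt_at_test_time(base, instance_inputs, steps=5, lr=1e-3):
    """Clone the model and take a few gradient steps on this specific
    instance's data, so its experience accrues as weight updates rather
    than as artifacts outside the model."""
    adapted = copy.deepcopy(base)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        # Illustrative self-supervised objective (reconstruction); a real
        # system would use something task-appropriate, e.g. next-token
        # prediction on the instance's own trajectory.
        loss = nn.functional.mse_loss(adapted(instance_inputs), instance_inputs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapted

# Each deployed instance carries its onboarding forward in its weights.
instance_data = torch.randn(8, 16)
specialized_instance = adapt_at_test_time(base_model, instance_data)
```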
So the currently-much-faster funding-fueled scaling has an expiration date, and if it doesn’t push capabilities past critical thresholds, the absence of that event is a reason to expect longer timelines. But test-time learning first needs to actually happen before it can be observed to be non-transformative on its own, and as a basic research puzzle it might take an unknown number of years (again, absent AGI from scaling without test-time learning). If everyone is currently working on it and it still fails to materialize by 2028, that’s also some indication that it might take nontrivial time.
On a recent podcast, Dwarkesh Patel says that Sutskever’s SSI is rumored to be working on “test time training” (at 39:25). Another reason to think this “unhobbling” is plausible soon is that it might turn out to be possible to use agentic (tool-using) RLVR to train AIs to prepare datasets for finetuning variants of themselves (not necessarily with RLVR), such that the finetuned variants then do better at particular tasks.
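A hedged sketch of that loop, to make the proposal concrete: the dataset-preparing agent is rewarded by how much its dataset improves a finetuned variant on a verifiably-scored task. Every name and function here is a placeholder of my own; the agentic generation step and the policy-gradient update are stubbed out.

```python
import random
from dataclasses import dataclass

@dataclass
class Model:
    skill: float  # stand-in for model weights

def prepare_dataset(agent: Model, task: str) -> list[str]:
    # In the proposal this is an agentic, tool-using generation step;
    # stubbed here as fixed placeholder examples.
    return [f"{task}-example-{i}" for i in range(4)]

def finetune(base: Model, dataset: list[str]) -> Model:
    # Stand-in for ordinary finetuning on the prepared data
    # (not necessarily RLVR).
    return Model(skill=base.skill + 0.1 * len(dataset))

def verifiable_score(model: Model, task: str) -> float:
    # RLVR presupposes a programmatic verifier; stubbed as a noisy
    # function of the variant's skill.
    return model.skill + random.gauss(0, 0.01)

def rlvr_step(agent: Model, base: Model, task: str) -> float:
    dataset = prepare_dataset(agent, task)
    variant = finetune(base, dataset)
    # Reward: how much the prepared dataset improved the variant.
    reward = verifiable_score(variant, task) - verifiable_score(base, task)
    # A real implementation would now update the agent with a
    # policy-gradient step on this reward; omitted here.
    return reward

print(rlvr_step(Model(1.0), Model(1.0), "competition-math"))
```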
How many people are working on test-time learning? How feasible do you think it is?