We simulate AI progress after the deployment of ASARA (AI Systems for AI R&D Automation).
We assume that half of recent AI progress comes from using more compute in AI development and the other half comes from improved software. (“Software” here refers to AI algorithms, data, fine-tuning, scaffolding, inference-time techniques like o1 — all the sources of AI progress other than additional compute.) We assume compute is constant and only simulate software progress.
We assume that software progress is driven by two inputs: 1) cognitive labour for designing better AI algorithms, and 2) compute for experiments to test new algorithms. Compute for experiments is assumed to be constant. Cognitive labour is proportional to the level of software, reflecting the fact that AI has automated AI research.
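To make this dynamic concrete, here is a minimal numerical sketch of the feedback loop. The Cobb-Douglas combination of the two inputs, the single returns-to-R&D exponent r, the parameter values, and the name simulate_software are all illustrative assumptions for intuition, not the model's actual specification.

```python
def simulate_software(r=2.2, alpha=0.5, compute=1.0, dt=1e-3, steps=5_000):
    """Integrate dS/dt = (labour^alpha * compute^(1-alpha))^r with labour = S.

    Because cognitive labour is proportional to the software level S and
    experiment compute is held fixed, the growth rate scales as S^(alpha*r):
    super-exponential when alpha * r > 1, exponential when alpha * r == 1,
    and sub-exponential otherwise. (All functional forms and parameters here
    are illustrative assumptions.)
    """
    S = 1.0        # software level, normalised to 1 at ASARA deployment
    path = [S]
    for _ in range(steps):
        labour = S  # AI research is automated, so cognitive labour scales with S
        research_input = (labour ** alpha) * (compute ** (1 - alpha))
        S += (research_input ** r) * dt
        path.append(S)
    return path

if __name__ == "__main__":
    path = simulate_software()
    print(f"software level after {len(path) - 1} steps: {path[-1]:.1f}")
```

With these toy parameters (alpha * r = 1.1 > 1), software growth accelerates over time, which is the feedback the simulation is probing: better software supplies more cognitive labour, which produces better software still faster.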
Your definition of software includes all data, which strikes me as an unusual use of the term so I’ll put it in scare quotes.
You say half of recent AI progress came from “software” and half from compute. Then in your diagram, the cognitive labor gained from better AI is going to improve “software.”
To me it seems like a ton of recent AI progress was from using up a data overhang, in the sense of scaling up compute enough to take advantage of an existing wealth of data (the internet, most or all books, etc.)
I don’t see how more AI researchers, automated or not, could find more of this data. The model has their cognitive labor being used to increase “software.” Does the model assume that they are finding or generating more of this data, in addition to doing R&D for new algorithms, or other “software” bucket activities?
Yeah, I think one of the biggest weaknesses of this model, and honestly of most thinking on the intelligence explosion, is not carefully thinking through the data.
During a software intelligence explosion (SIE), AIs will need to generate data themselves, by doing the things that human researchers currently do to generate data. That includes finding new untapped data sources, creating virtual environments, creating supervised fine-tuning (SFT) data themselves by doing tasks with scaffolds, etc.
On the one hand, it seems unlikely they’ll have anything as easy as the internet to work with. On the other hand, internet data is actually very poorly targeted at teaching AIs how to do crucial real-world tasks, so perhaps with abundant cognitive labour you can do much better and make curricula that directly target the skills that most need improving.
Thanks Tom! Appreciate the clear response. This feels like it significantly limits how much I update on the model.