Fun fact: AI-2027 estimates that getting to ASI might take the equivalent of a 100-person team of top human AI research talent working for tens of thousands of years.
I’m curious why ASI would take so much work. What exactly is the R&D labor supposed to be doing each day, that adds up to so much effort? I’m curious how people are thinking about that, if they buy into this kind of picture. Thanks :)
(Calculation details: For example, in October 2027 of the AI-2027 modal scenario, they have “330K superhuman AI researcher copies thinking at 57x human speed”, which is 1.6 million person-years of research in that month alone. And that’s mostly going towards inventing ASI, I think. Did I get that right?)
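(For concreteness, the arithmetic here is just a unit conversion; a minimal sketch, using the numbers as quoted:

```python
# Back-of-envelope check of the figure above (numbers as quoted from the
# AI-2027 October 2027 scenario; this is just unit conversion).
copies = 330_000   # superhuman AI researcher copies
speedup = 57       # thinking speed relative to a human

person_years_per_month = copies * speedup / 12
print(f"{person_years_per_month:,.0f} person-years of research in that month")
# -> 1,567,500, i.e. roughly 1.6 million person-years
```
)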
(My own opinion, stated without justification, is that LLMs are not a paradigm that can scale to ASI, but after some future AI paradigm shift, there will be very very little R&D separating “this type of AI can do anything importantly useful at all” and “full-blown superintelligence”. Like maybe dozens or hundreds of person-years, or whatever, as opposed to millions. More on this in a (hopefully) forthcoming post.)
Whew, a critique that our takeoff should be faster for a change, as opposed to slower.
Fun fact: AI-2027 estimates that getting to ASI might take the equivalent of a 100-person team of top human AI research talent working for tens of thousands of years.
(Calculation details: For example, in October 2027 of the AI-2027 modal scenario, they have “330K superhuman AI researcher copies thinking at 57x human speed”, which is 1.6 million person-years of research in that month alone. And that’s mostly going towards inventing ASI, I think. Did I get that right?)
This depends on how large you think the penalty is for parallelized labor as opposed to serial labor. If 330k parallel researchers are more like the equivalent of 100 researchers at 50x speed than of 100 researchers at 3,300x speed, then it’s more like a team of 100 researchers working for (50*57)/12 ≈ 250 years.
Also, of course, to the extent you think compute will be an important input: during October they still have just a month’s worth of total compute, even though they’re working for 250–25,000 subjective years.
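One way to make the parallelization-penalty point concrete is a toy model in which N parallel copies count as N^alpha serial researchers. The sketch below is purely illustrative; neither the model nor the alpha values come from AI-2027 or from this thread.

```python
# Illustrative toy model only: treat N parallel workers as worth N**alpha
# serial workers, with alpha < 1 encoding the parallelization penalty.
# The alpha values below are made up for illustration.
copies = 330_000    # parallel AI researcher copies
speedup = 57        # thinking speed relative to a human
team = 100          # reference team of top human researchers

def team_years_for_one_month(alpha: float) -> float:
    """Serial-equivalent years of work by a 100-person human team that one
    month of the parallel AI workforce is worth, under penalty exponent alpha."""
    effective_serial_researchers = copies ** alpha
    return (effective_serial_researchers / team) * speedup / 12

for alpha in (1.0, 0.7, 0.5):
    print(f"alpha={alpha}: ~{team_years_for_one_month(alpha):,.0f} team-years")
# alpha=1.0 (no penalty) gives ~15,700 team-years for the month, i.e. the
# "tens of thousands of years" regime; alpha around 0.65-0.7 lands in the
# few-hundred-year range, comparable to the ~250-year figure above.
```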
I’m curious why ASI would take so much work. What exactly is the R&D labor supposed to be doing each day, that adds up to so much effort? I’m curious how people are thinking about that, if they buy into this kind of picture. Thanks :)
I’m imagining a mix: tons of effort invested into optimizing experiment ideas and implementing and interpreting every experiment quickly, plus tons of effort into more conceptual agendas given the compute shortage, some of which bear fruit but also involve lots of “wasted” effort exploring possible routes, and most of which end up needing significant experimentation as well to get working.
(My own opinion, stated without justification, is that LLMs are not a paradigm that can scale to ASI, but after some future AI paradigm shift, there will be very very little R&D separating “this type of AI can do anything importantly useful at all” and “full-blown superintelligence”. Like maybe dozens or hundreds of person-years, or whatever, as opposed to millions. More on this in a (hopefully) forthcoming post.)
I don’t share this intuition regarding the gap between the first importantly useful AI and ASI. If so, that implies extremely fast takeoff, correct? Like on the order of days from AI that can do important things to full-blown superintelligence?
Currently there are hundreds or perhaps low thousands of years of relevant research effort going into frontier AI each year. The gap between importantly useful AI and ASI seems larger than a year of current AI progress (though I’m not >90% confident in that, especially if timelines are <2 years). Then we also need to take into account diminishing returns, compute bottlenecks, and parallelization penalties, so my guess is that the required person-years should be at minimum in the thousands and likely much more. Overall the scenario you’re describing is maybe (roughly) my 95th percentile speed?
I’m curious about your definition for importantly useful AI actually. Under some interpretations I feel like current AI should cross that bar.
I’m uncertain about the LLMs thing but would lean toward pretty large shifts by the time of ASI; I think it’s more likely LLMs scale to superhuman coders than to ASI.
Thanks, that’s very helpful!

If we divide the inventing-ASI task into (A) “thinking about and writing algorithms” versus (B) “testing algorithms”, in the world of today there’s a clean division of labor where the humans do (A) and the computers do (B). But in your imagined October 2027 world, there’s fungibility between how much compute is being used on (A) versus (B). I guess I should interpret your “330K superhuman AI researcher copies thinking at 57x human speed” as what would happen if the compute hypothetically all went towards (A), none towards (B)? And really there’s gonna be some division of compute between (A) and (B), such that the amount of (A) is less than I claimed? …Or how are you thinking about that?
I’m curious about your definition for importantly useful AI actually. Under some interpretations I feel like current AI should cross that bar.
Right, but I’m positing a discontinuity between current AI and the next paradigm, and I was talking about the gap between when AI-of-that-next-paradigm is importantly useful versus when it’s ASI. For example, AI-of-that-next-paradigm might arguably already exist today but be missing key pieces, such that it barely works on toy models in obscure arXiv papers. Or here’s a more concrete example: Take the “RL agent” line of AI research (AlphaZero, MuZero, stuff like that), which is quite different from LLMs (e.g. “training environment” rather than “training data”, and there’s nothing quite like self-supervised pretraining (see here)). This line of research has led to great results on board games and videogames, but it’s more-or-less economically useless, and certainly useless for alignment research, societal resilience, capabilities research, etc. If it turns out that this line of research is actually much closer to how future ASI will work at a nuts-and-bolts level than LLMs are (for the sake of argument), then we have not yet crossed the “AI-of-that-next-paradigm is importantly useful” threshold in my sense.
If it helps, here’s a draft paragraph from that (hopefully) forthcoming post:
Another possible counter-argument from a prosaic-AGI person would be: “Maybe this future paradigm exists, but LLM agents will find it, not humans, so this is really part of that ‘AIs-doing-AI-R&D’ story like I’ve been saying”. I have two responses. First, I disagree with that prediction. Granted, probably LLMs will be a helpful research tool involved in finding the new paradigm, but there have always been helpful research tools, from PyTorch to arXiv to IDEs, and I don’t expect LLMs to be fundamentally different from those other helpful research tools. Second, even if it’s true that LLMs will discover the new paradigm by themselves (or almost by themselves), I’m just not sure I even care. I see the pre-paradigm-shift AI world as a lesser problem, one that LLM-focused AI alignment researchers (i.e. the vast majority of them) are already focusing on. Good luck to them. And I want to talk about what happens in the strange new world that we enter after that paradigm shift.
Next:
If so, that implies extremely fast takeoff, correct? Like on the order of days from AI that can do important things to full-blown superintelligence?
Well, even if you have an ML training plan that will yield ASI, you still need to run it, which isn’t instantaneous. I dunno, it’s something I’m still puzzling over.
…But yeah, many of my views are pretty retro, like a time capsule from AI alignment discourse of 2009. ¯\_(ツ)_/¯
Sorry for the late reply.

If we divide the inventing-ASI task into (A) “thinking about and writing algorithms” versus (B) “testing algorithms”, in the world of today there’s a clean division of labor where the humans do (A) and the computers do (B). But in your imagined October 2027 world, there’s fungibility between how much compute is being used on (A) versus (B). I guess I should interpret your “330K superhuman AI researcher copies thinking at 57x human speed” as what would happen if the compute hypothetically all went towards (A), none towards (B)? And really there’s gonna be some division of compute between (A) and (B), such that the amount of (A) is less than I claimed? …Or how are you thinking about that?
I’m not 100% sure what you mean, but my guess is that you mean (B) to represent the compute used for experiments? We do project a split here and the copies/speed numbers are just for (A). You can see our projections for the split in our compute forecast (we are not confident that they are roughly right).
Re: the rest of your comment, makes sense. Perhaps the place I most disagree is that if LLMs are the thing that discovers the new paradigm, they will probably also be useful for things like automating alignment research, epistemics, etc. Also, if they are misaligned, they could sabotage the research involved in the paradigm shift.
I can somewhat see where you’re coming from about a new method being orders of magnitude more data efficient in RL, but I very strongly bet on transformers being core even after such a paradigm shift. I’m curious whether you think the transformer architecture and text input/output need to go, or whether the new training procedure / architecture fits in with transformers because transformers are just the best information mixing architecture.
My guess is that the main issue with current transformers turns out to be that they don’t have long-term state/memory, and I think this is a pretty critical part of how humans are able to learn on the job as effectively as they do.
The trouble, as I’ve heard it, is that the other approaches which do incorporate long-run state/memory are apparently much harder to train reasonably well than transformers, plus there are first-mover effects.
That does raise my eyebrows a bit, but also, note that we currently have hundreds of top-level researchers at AGI labs tirelessly working day in and day out, and that all that activity results in a… fairly leisurely pace of progress, actually.[1]
Recall that what they’re doing there is blind atheoretical empirical tinkering (tons of parallel experiments most of which are dead ends/eke out scant few bits of useful information). If you take that research paradigm and ramp it up to superhuman levels (without changing the fundamental nature of the work), maybe it really would take this many researcher-years.
And if AI R&D automation is actually achieved on the back of sleepwalking LLMs, that scenario does seem plausible. These superhuman AI researchers wouldn’t actually be generally superhuman researchers, just superhuman at all the tasks in the blind-empirical-tinkering research paradigm. Which has steeply declining returns to more intelligence added.
That said, yeah, if LLMs actually scale to a “lucid” AGI, capable of pivoting to paradigms with better capability returns on intelligent work invested, I expect it to take dramatically less time.
It’s fast if you use past AI progress as the reference class, but is decidedly not fast if you try to estimate “absolute” progress. Like, this isn’t happening, we’ve jumped to near human-baseline and slowed to a crawl at this level. If we assume the human level is the ground and we’re trying to reach the Sun, it in fact might take millennia at this pace.
we’ve jumped to near human-baseline and slowed to a crawl at this level
A possible reason for that might be the fallibility of our benchmarks. It might be the case that for complex tasks, it’s hard for humans to see farther than their nose.
The short version: working out compute-optimal experiments to improve yourself, training to do tasks that unavoidably take a really long time to learn or get data on because real-world experimentation is necessary, combined with a potential hardware bottleneck on robotics that also requires real-world experimentation to overcome.
Another point: to the extent you buy the scaling hypothesis at all, compute bottlenecks will start to bite, and researchers will seek small constant-factor improvements that don’t generalize, which can start a cascade of wrong decisions that could take a very long time to get out of.
(My own opinion, stated without justification, is that LLMs are not a paradigm that can scale to ASI, but after some future AI paradigm shift, there will be very very little R&D separating “this type of AI can do anything importantly useful at all” and “full-blown superintelligence”. Like maybe dozens or hundreds of person-years, or whatever, as opposed to millions. More on this in a (hopefully) forthcoming post.)
I’d like to see that post, and in particular your arguments for why intelligence can be increased so fast, conditional on a new paradigm shift.
(For what it’s worth, I personally think LLMs might not be the last paradigm, because of their current lack of continuous learning/neuroplasticity plus no long-term memory/state. But I don’t expect future paradigms to have an AlphaZero-like trajectory, where things go from zero to wildly superhuman in days/weeks. That said, I do think takeoff is faster if we condition on a new paradigm being required for ASI, so I see the AGI transition as plausibly leaving only months until we get superintelligence, and maybe only 1-2 years before superintelligence starts having very, very large physical impacts through robotics, assuming that new paradigms are developed. So I’m closer to hundreds or thousands of person-years than dozens of person-years.)
The world is complicated (see: I, Pencil). You can be superhuman by only being excellent at a few fields, for example politics, persuasion, military, hacking. That still leaves you potentially vulnerable, even if your opponents are unlikely to succeed; or you could hurt yourself by your ignorance in some field. Or you can be superhuman in the sense of being able to make the pencil from scratch, only better at each step. That would probably take more time.
Are you suggesting that e.g. “R&D Person-Years 463205–463283 go towards ensuring that the AI has mastery of metallurgy, and R&D Person-Years 463283–463307 go towards ensuring that the AI has mastery of injection-molding machinery, and …”?
If no, then I don’t understand what “the world is complicated” has to do with “it takes a million person-years of R&D to build ASI”. Can you explain?
…Or if yes, that kind of picture seems to contradict the facts that:
(1) This seems quite disanalogous to how LLMs are designed today (i.e., LLMs can already answer any textbook question about injection-molding machinery, but no human doing LLM R&D has ever worked specifically on LLM knowledge of injection-molding machinery); and
(2) This seems quite disanalogous to how the human brain was designed (i.e., humans are human-level at injection-molding machinery knowledge and operation, but Evolution designed human brains for the African Savannah, which lacked any injection-molding machinery).
Yes, I meant it that way.

LLMs quickly acquired the capacity to read what humans wrote and paraphrase it. It is not obvious to me (though that may speak more about my ignorance) that it will be similarly easy to acquire deep understanding of everything.

But maybe it will. I don’t know.
Incidentally, is there any meaningful sense in which we can say how many “person-years of thought” LLMs have already done?
We know they can do things in seconds that would take a human minutes. Does that mean those real-time seconds count as “human-minutes” of thought? Etc.
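One crude way to make that accounting concrete is sketched below; every parameter in it is a hypothetical placeholder rather than a measured value, and the point is only to show one possible way of operationalizing the question.

```python
# Toy accounting only: every number here is a hypothetical placeholder, not a
# measurement. This just shows one way "person-years of thought" could be
# operationalized, not an actual figure.
tokens_generated_total = 1e15        # hypothetical: total tokens ever generated
human_equiv_seconds_per_token = 2.0  # hypothetical: human time to produce
                                     # comparable output, per token

seconds_per_year = 3600 * 24 * 365
person_years_of_thought = (
    tokens_generated_total * human_equiv_seconds_per_token / seconds_per_year
)
print(f"~{person_years_of_thought:,.0f} 'person-years of thought'")
# Under these made-up inputs this comes out to tens of millions of
# person-years, which mostly illustrates how sensitive the answer is to the
# conversion factor you pick.
```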