My current best guess median is that we’ll see 6 OOMs of effective compute gains in the first year after full automation of AI R&D, if this occurs in ~2029 using a 1e29 training run and compute is scaled up by a factor of 3.5x[1] over the course of that year[2]. This is around 5 years of progress at the current rate[3].
How big of a deal is 6 OOMs? I think it’s a pretty big deal; I have a draft post discussing how much an OOM gets you (on top of full automation of AI R&D) that I should put out somewhat soon.
Further, my distribution over this is radically uncertain, with a 25th percentile of 2.5 OOMs (2 years of progress) and a 75th percentile of 12 OOMs.
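As a quick sanity check on the “years of progress” framing, here is a minimal sketch that just divides OOMs by the ~1.2 OOMs/year combined hardware+software rate from footnote [3]; nothing here is an independent estimate, it only restates the arithmetic:

```python
# Convert OOMs of effective compute into "years of progress at the current rate".
# Assumes ~1.2 OOMs/year of combined hardware + software progress (footnote [3]).
CURRENT_OOMS_PER_YEAR = 1.2

def ooms_to_years(ooms: float, rate: float = CURRENT_OOMS_PER_YEAR) -> float:
    """Years of business-as-usual progress equivalent to `ooms` OOMs of effective compute."""
    return ooms / rate

for label, ooms in [("25th percentile", 2.5), ("median", 6.0), ("75th percentile", 12.0)]:
    print(f"{label}: {ooms} OOMs ≈ {ooms_to_years(ooms):.1f} years of progress")
# 25th percentile: ~2.1 years, median: ~5.0 years, 75th percentile: ~10.0 years
```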
The short breakdown of the key claims is:
Initial progress will be fast, with perhaps ~15x faster algorithmic progress than humans alone would achieve.
Progress will probably speed up before slowing down, because training smarter AIs that can accelerate progress even further will outpace diminishing returns on software progress.
We’ll be quite far from the limits of software progress (perhaps median 12 OOMs) at the point when we first achieve full automation.
Here is a somewhat summarized and rough version of the argument (stealing heavily from some of Tom Davidson’s forthcoming work):
At the point of full automation, progress will be fast:
Probably you’ll have lots of parallel workers running pretty fast at the point when you have full automation, or shortly after this. This isn’t totally obvious due to inference compute costs, but prices often drop fast.
My guess is you’ll have enough compute that if you use 1/6 of your compute running AIs, you’ll be able to run the equivalent of ~1 million AIs which are roughly as good as the best human research scientists+engineers (taking into account cost reductions from using weaker models for many tasks). This is attempting to account for a reduction in the number of models due to using a bunch of inference compute. You’ll be able to run these AIs at the equivalent of 60x speed (3x from hours, 5x from direct speed, 2x from coordination, and 2x from variable time compute and/or context swapping with a cheaper+faster model). So, that’s ~1 million equivalents at 60x speed, or like 15k parallel copies running at 60x.
Probably the AI company has like ~3k researchers, but when you adjust for quality, this is only as good as like 600 of the top engineers/researchers.
Let’s say marginal returns to parallelism scale with an exponent of roughly 0.55.
Then, the increase in “serial labor equivalents” is roughly (15k / 600)^0.55 * 60 = 350. (Note that most of this is from speed and quality rather than parallel copies!)
Production of algorithmic research comes from both compute and labor, so let’s say Cobb-Douglas with labor^0.5 * compute^0.5. (My guess is that current marginal returns are more like labor^0.6 * compute^0.4, but returns will get worse as you add more labor.) So, we do 350^0.5 ≈ 19x, which roughly matches my 15x speedup median. (The full back-of-envelope is sketched below.)
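To make the arithmetic above easier to check, here is a minimal sketch of the back-of-envelope estimate. All of the inputs (the 60x speed decomposition, the ~15k parallel copies, the ~600 quality-adjusted human researchers, the 0.55 parallelism exponent, and the 0.5 labor share) are just the rough guesses from the bullets above, not measured quantities:

```python
# Back-of-envelope sketch of the post-automation AI R&D speedup estimate.
# All inputs are rough guesses from the bullets above.

speed_multiplier = 3 * 5 * 2 * 2   # hours * direct speed * coordination * variable compute = 60x
# ~1M AI equivalents running at 60x speed works out to ~16.7k parallel copies;
# the text rounds this to 15k.
parallel_copies = 15_000

human_researchers = 600            # ~3k researchers, quality-adjusted to ~600 top people
parallelism_exponent = 0.55        # diminishing marginal returns to parallel labor
labor_share = 0.5                  # Cobb-Douglas exponent on labor (compute held fixed)

# Increase in "serial labor equivalents" relative to the human workforce.
serial_labor = (parallel_copies / human_researchers) ** parallelism_exponent * speed_multiplier

# Cobb-Douglas production: only the labor input is multiplied, so raise it to the labor share.
progress_speedup = serial_labor ** labor_share

print(f"serial labor equivalents: ~{serial_labor:.0f}x")          # ~350x
print(f"algorithmic progress speedup: ~{progress_speedup:.0f}x")  # ~19x
```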
Progress will speed up before slowing down:
See discussion here
I expect progress to slow as you hit limits or run out of progress doable with cheap experiments, but limits are probably pretty high as discussed in the next bullet.
Limits are high:
The human brain is equivalent to roughly 1e24 training flop.
We’re using 1e29 flop to get a bit over human level; 1e29 vs. 1e24 is 5 OOMs, so call it like 4 OOMs of headroom from this once you account for already being a bit above human level.
We can probably get a lot more efficiency, maybe 9 OOMs, at least for efficiency “up” rather than “down”. (By efficiency up rather than down I mean: we can do as well as a 1e33 flop training run using 1e24 real flop, relative to scaling at the point when you hit human efficiency, but we probably can’t train a human-level AI for 1e15 real flop.) A rough tally of this headroom is sketched below.
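Here is a minimal tally of the headroom arithmetic above. The 1e24 brain figure, the ~9 OOMs of further efficiency, and the ~1 OOM discount for already being a bit above human level are all rough guesses from the bullets; the total is only meant to be roughly in line with the ~12 OOM median mentioned earlier:

```python
import math

# Rough tally of headroom to the limits of software progress (all inputs are guesses from above).
current_training_flop = 1e29   # training run at the point of full automation
human_brain_flop = 1e24        # rough "human brain equivalent" training compute

raw_headroom = math.log10(current_training_flop / human_brain_flop)  # 5 OOMs
above_human_discount = 1       # 1e29 already gets "a bit over" human level
brain_headroom = raw_headroom - above_human_discount                 # ~4 OOMs

efficiency_headroom = 9        # further algorithmic efficiency ("efficiency up")

total_headroom = brain_headroom + efficiency_headroom
print(f"~{total_headroom:.0f} OOMs of software headroom")  # ~13, in the ballpark of the ~12 OOM median
```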
At some point, I’ll write a post that makes a better version of this argument and presents a full version of my picture.
I don’t think we’ll see a speed criticality per se; rather, I expect the rate of progress to accelerate up to the point of full automation. But I currently don’t think this makes a huge difference to the bottom line of “progress in the first year after full automation in practice”, as I expect to initially see fast cost decreases and inference time compute can only go so far. I could expand this argument to the extent you have cruxes like “slower takeoff because we’ve already eaten low-hanging fruit with earlier AI acceleration” and “inference compute means you hit full automation much faster”.
3.5x is roughly the rate of bare metal compute scale-up per year.
That is, after the first company fully automates AI R&D internally, if they decide to go as fast as possible and their AIs/employees/others don’t try to sabotage these efforts. And I’m assuming that AI software progress hasn’t substantially slowed down by the time of full automation, though conditioning on a 1e29 training run means that at least compute scaling progress (which is a key driver of software progress) hasn’t slowed down all that much.
I think we see about 1.2 OOMs per year including both hardware and software.
Thanks for these thoughtful predictions. Do you think there’s anything we can do today to prepare for accelerated or automated AI research?
Maybe distracting technicality:
This seems to make the simplifying assumption that the R&D automation is applied to a large fraction of all the compute that was previously driving algorithmic progress, right?
If we imagine that a company only owns 10% of the compute being used to drive algorithmic progress pre-automation (and is only responsible for, say, 30% of its own algorithmic progress, with the rest coming from other labs/academia/open-source), and this company is the only one automating their AI R&D, then the effect on overall progress might be reduced (the 15x multiplier only applies to ~30% of the relevant algorithmic progress).
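One simple (and admittedly crude) way to quantify this dilution: treat overall algorithmic progress as a weighted sum of the lab’s own contribution and everyone else’s, and apply the 15x multiplier only to the lab’s share while the rest continues at 1x. The 30% share and 15x multiplier are just the illustrative numbers from above:

```python
# Crude dilution estimate: the automation speedup only applies to the lab's own share of progress.
own_share = 0.30         # fraction of the lab's algorithmic progress it produces itself (illustrative)
automation_speedup = 15  # multiplier on the lab's own research output

overall_multiplier = own_share * automation_speedup + (1 - own_share) * 1
print(f"overall progress multiplier: ~{overall_multiplier:.1f}x")  # ~5.2x instead of 15x
```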
In practice I would guess that either the leading actor has enough of a lead that they are already responsible for most of their algorithmic progress, or other groups are close behind and will thus automate their own AI R&D around the same time anyway. But I could imagine this dampening the impact of initial AI R&D automation a little bit (and it might make a big difference for questions like “how much would it accelerate a non-frontier lab that stole the model weights and tried to do recursive self-improvement?”).
Yes, I think frontier AI companies are responsible for most of the algorithmic progress. I think it’s unclear how much the leading actor benefits from progress made at other, slightly-behind AI companies, and this could make progress substantially slower. (However, it’s possible the leading AI company would be able to acquire the GPUs from these other companies.)