Sure! Here’s a few thoughts:
Conservative take. Compute will keep growing at its current rate. At that rate, compute contributes ~0.5 years of progress per year, because roughly half of AI progress is driven by compute. So including compute growth increases the “number of years of progress compressed into 1 year” by 0.5. This is a pretty small effect, as it would just move us from (e.g.) 4 years to 4.5 years.
More aggressive take. We should also model the interaction effects. Growing compute will itself speed up software progress. How big is this effect?
If r = 0.5, then (in equilibrium) each doubling of compute drives one doubling of software. So rather than an extra 0.5 years, this is an extra 1 year.
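One way to see the equilibrium claim (a sketch, assuming the standard steady-state identity that the software growth rate g_s satisfies g_s = r·(g_c + g_s), where g_c is the compute growth rate):

```python
# Steady-state identity (an assumption of this sketch):
#   g_s = r * (g_c + g_s)  =>  g_s = r / (1 - r) * g_c
def software_doublings_per_compute_doubling(r):
    """Equilibrium software doublings driven by one compute doubling."""
    assert r < 1, "only meaningful in the sub-critical (r < 1) regime"
    return r / (1 - r)

print(software_doublings_per_compute_doubling(0.5))  # -> 1.0
```

With r = 0.5 the ratio is exactly 1, matching “each doubling of compute drives one doubling of software”; as r approaches 1 the ratio blows up, which is the accelerating regime discussed next.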
If r >= 1 then progress is accelerating anyway, and the extra compute means it will accelerate faster. How much faster? Having 3X the compute might convert to 3X more training compute and 3X more runtime compute. My model implies that 10X more training compute gives you a 4X speed-up (which may be aggressive), so 3X more training compute would be a ~2X speed-up; the additional runtime compute could bump that to 3X. So overall this would make the slower tail-end of the IE happen 2-3X faster, meaning you cram more progress into 1 year before things slow down. You’d need to model it to know how much, but maybe the r = 0.5 estimate from above is about right (if the IE slows down when r = 0.5). So again this points to an extra year.
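To check the arithmetic in that paragraph (a sketch; the power-law form and the 10X → 4X figure are the text’s own assumptions, not established facts):

```python
import math

# Assume speed-up scales as compute**k; the "10X training compute
# -> 4X speed-up" figure from the text pins down the exponent k.
k = math.log(4) / math.log(10)  # ~0.602

def speedup(compute_multiple):
    return compute_multiple ** k

print(round(speedup(10), 2))  # sanity check: recovers 4.0
print(round(speedup(3), 2))   # 3X compute -> ~1.94X, i.e. roughly 2X
```

So under these assumptions 3X training compute gives a ~2X speed-up, consistent with the estimate above before the runtime-compute bump.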
We do simulate this! We have a ‘gradual boost’ model variant in which compute grows exponentially in the background. Compute is ‘meant’ to stop growing when you have ASARA in the model, but you could simply choose a higher “initial boost” and you’ll be effectively simulating compute growing after ASARA. Just pick an above-ASARA capability threshold and choose your initial boost to match that.
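A minimal toy of what that variant does (all names, rates, and dynamics here are illustrative assumptions, not the actual model’s or the tool’s parameters): software progress compounds on itself while background compute grows exponentially, and switching the background growth off shows how much extra progress it was contributing.

```python
# Toy sketch of a 'gradual boost'-style simulation (illustrative only).
GROWTH = 2 ** (1 / 12)  # background compute doubles yearly (monthly steps)
R = 0.5                 # returns to research input
MONTHS = 24

def run(compute_growth):
    software = 1.0
    compute = 1.0
    for _ in range(MONTHS):
        # research input combines compute and software; the progress
        # rate scales as input**R (same power-law assumption as above)
        rate = (compute * software) ** R
        software *= 1 + 0.1 * rate  # 0.1 = arbitrary base monthly rate
        compute *= compute_growth
    return software

with_compute = run(GROWTH)  # compute keeps growing in the background
frozen = run(1.0)           # compute frozen at today's level
print(with_compute > frozen)  # background growth adds extra progress
```

The gap between the two runs is the toy analogue of the “extra year” discussed above; the real model variant lets you probe this properly.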
When I ran simulations of this variant, I was surprised that the results were notably more aggressive once I adjusted other params to reflect the fact that the sim starts before ASARA.
I do think this model is probably the best one: it can simulate a gradual approach to ASARA and compute growth during the SIE. It’s probably worth doing more sensitivity analyses on its behaviour. You can do that in the online tool!
I notice I am confused. I generally think that after full automation of AI R&D, AI progress will be approximately linear in compute, so e.g. 10xing compute would make progress go like 8x faster overall rather than e.g. simply adding 0.5 years of progress per year to the many years of progress per year already being produced. It seems like you disagree? Or maybe I’m just mathing wrong. Exponentials are unintuitive.
Think we agree on that.
My last comment says: “So 3X more training compute would be a 2X speed-up. Could bump to 3X speed-up due to the additional runtime compute. So overall this would make the slower tail-end of the IE happen 2-3X faster.”
I.e. roughly linear.
So this does mean the IE happens faster. I.e. 10 years in 6 months rather than in 12 months.
But I was then commenting on how long it goes on for, where I think the extra compute makes less difference, because once r < 1 things slow down fairly quickly. So you maybe still only get ~11 years in 12 months.
Could use the online tool to figure this out. Just do two runs, and in one of them double the ‘initial speed’. That has a similar effect to doubling compute.