I am skeptical that uplift measurements actually give much tighter confidence intervals.
I think we might already have evidence against longer timelines (e.g. the 2049 timelines you get from a doubling difficulty growth factor of 1.0). Basically, the model needs to pass a backtest against where uplift was in 2025 vs. now: if there’s significant uplift now and there wasn’t before 2025, this implies the automation curve is steep.
Suppose uplift at the start of 2026 was 1.6x as in the AIFM’s median (which would be 37.5% automation if you make my simplifying assumptions). If we also know that automation at the start of 2025 was at most 20%, this means automation is increasing at 0.876 logits per year (equivalently, per ~8x time-horizon factor, or ~45x effective-compute factor). (If the d.d.g.f. is 1.0, automation is logistic in both log TH and log E.) At this slope, we get to 95% AI R&D automation (or in your model, ~100% coding automation) when we hit a time horizon of 14 years, which is far less than your median of 125 years and should give timelines before 2045 or so. Uplift might be increasing even faster than this, in which case timelines will be shorter. We don’t have any hard data on uplift quite yet, but I suspect our guesses at uplift should already make us downweight longer timelines, and in Q2 or Q3 I hope we can prove it.
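A quick sketch of this arithmetic in Python. The uplift-to-automation mapping assumes an Amdahl-style simplification, uplift = 1 / (1 − automation fraction) — this is my guess at the “simplifying assumptions” above, since it reproduces 1.6x → 37.5%. The 20% bound for 2025 and the ~8x time-horizon factor per year of progress are taken from the text.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a fraction p."""
    return math.log(p / (1 - p))

# Amdahl-style simplification (assumed): uplift = 1 / (1 - automation),
# so a 1.6x uplift corresponds to 1 - 1/1.6 = 37.5% automation.
automation_2026 = 1 - 1 / 1.6   # 0.375
automation_2025 = 0.20          # assumed upper bound from the text

# Slope of the logistic in calendar time (2025 -> 2026)
slope_per_year = logit(automation_2026) - logit(automation_2025)
print(f"{slope_per_year:.3f} logits/year")   # ~0.876

# Logits remaining to reach 95% automation
gap = logit(0.95) - logit(automation_2026)

# If one year of progress at this slope also corresponds to a ~8x
# time-horizon factor, the time-horizon multiplier needed to close the gap:
th_factor = 8 ** (gap / slope_per_year)
print(f"~{th_factor:.0f}x time-horizon growth to reach 95% automation")
```

Where 95% lands in calendar time then depends on where you put the current time horizon and how fast it grows; the point of the sketch is just how steep the implied logistic is.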
I would still put some weight on longer timelines, but to me this uncertainty doesn’t live in something like d.d.g.f. My understanding of the AIFM is that uncertainty in all the time horizon parameters cashes out in the effective compute required for uplift. In this frame, the remaining uncertainty lives in whether the logistic curve in log E holds—whether ease of automation in the future is similar to ease of early-stage automation we’re already observing.
It might be nice to indicate how these outputs relate to your all-things-considered views. To me your explicit model seems to be implausibly confident in 99% automation before 2040.
Yeah, my all-things-considered views are definitely more uncertain. I don’t have well-considered probabilities—I’d probably need to think about it more and perhaps build more simple models that apply in long-timelines cases.
Suppose uplift at the start of 2026 was 1.6x as in the AIFM’s median
Where are you getting this 1.6 number?
With respect to the rest of your comment, it feels to me like we have so little evidence about current uplift and what trend it follows (e.g. whether a logistic % automation curve, and its translation to uplift, is a reasonable functional form). I’m not sure how strongly we disagree, though. I’m much more skeptical of the claim that uplift can give much tighter confidence intervals than of the claim that it can give similar or slightly better ones. Again, this could change if we had much better data in a year or two.
I got it from your website:

Our median value for the coding uplift of present-day AIs at AGI companies is that having the AIs is like having 1.6 times as many software engineers (and all the staff necessary to coordinate them effectively).
As for the rest, seems reasonable. I think you can’t get around the uncertainty by modeling uplift as some more complicated function of the coding automation fraction as in the AIFM: you’re still assuming that’s logistic, we can’t measure it any better than uplift, and we’re still uncertain how the two are related. So we really do need better data.
I think you can’t get around the uncertainty by modeling uplift as some more complicated function of the coding automation fraction as in the AIFM: you’re still assuming that’s logistic, we can’t measure it any better than uplift, and we’re still uncertain how the two are related. So we really do need better data.
But in the AIFM the coding automation logistic is there to predict the dynamics regarding how much coding automation speeds progress pre-AC. It doesn’t have to do with setting the effective compute requirement for AC. I might be misunderstanding something, sorry if so.
Re: the 1.6 number: oh, that should actually be 1.8, sorry. I think it didn’t get updated after a last-minute change to the parameter value; I will fix that soon. Also, that’s the parallel uplift. In our model, the serial multiplier/uplift is sqrt(parallel uplift).
In my model it’s parallel uplift too. Effective labor (human+AI) still goes through the diminishing returns power to get to serial uplift, which I estimate as between roughly 0.1 and 0.3.
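To make the two conventions concrete, here is a minimal sketch. It takes the corrected 1.8x parallel uplift as input; the sqrt rule is the AIFM convention mentioned above (equivalent to an exponent of 0.5 on effective labor), and the 0.1–0.3 range is the diminishing-returns estimate from the previous comment.

```python
import math

parallel_uplift = 1.8  # AIFM median parallel coding uplift (corrected value)

# AIFM convention: serial multiplier is the square root of parallel uplift,
# i.e. an exponent of 0.5 on effective labor.
serial_aifm = math.sqrt(parallel_uplift)
print(f"sqrt rule: {serial_aifm:.3f}x")

# Alternative: pass effective labor (human+AI) through a diminishing-returns
# power in the roughly 0.1-0.3 range estimated above.
for phi in (0.1, 0.3):
    print(f"phi={phi}: {parallel_uplift ** phi:.3f}x")
```

Note how much the choice matters: the same 1.8x parallel uplift becomes a ~1.34x serial multiplier under the sqrt rule but only ~1.06–1.19x under the lower exponent range.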