Daniel Kokotajlo comments on Thomas Kwa’s Shortform

Daniel Kokotajlo 16 Jan 2026 14:56 UTC
LW: 14 AF: 4
1
I basically agree with everything you say here and wish we had a better way to try to ground AGI timelines forecasts. Do you recommend any other method? E.g. extrapolating revenue? Just thinking through arguments about whether the current paradigm will work, and then using intuition to make the final call? We discuss some methods that appeal to us here.
> This parameter, “Doubling Difficulty Growth Factor”, can change the date of the first Automated Coder AI between 2028 and 2050.
Note that we allow it to go subexponential, so it can actually push the date arbitrarily far into the future if you really want it to. Also, I don't know what's happening with Eli's parameters, but with my parameter settings, setting the doubling difficulty growth factor to 1 (i.e. a pure exponential trend, neither superexponential nor subexponential) gets to AC in 2035. (Though I don't think we should put much weight on this number, as it depends on other parameters that are also subjective and important, such as the horizon length that corresponds to AC, which people disagree about a lot.)
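
To make the role of this parameter concrete, here is a minimal toy sketch. It is not the AI Futures Project model. It assumes that each successive time-horizon doubling takes `growth_factor` times as long as the previous one, so that 1 gives a pure exponential trend, values below 1 a superexponential one, and values above 1 a subexponential one. The function name and all inputs (current horizon, target horizon, first doubling time) are hypothetical placeholders.

```python
import math


def years_to_reach(target_horizon_hours, current_horizon_hours,
                   first_doubling_years, growth_factor):
    """Years until the time horizon reaches the target in a toy doubling model.

    Assumption (illustrative only): the k-th doubling takes
    first_doubling_years * growth_factor**k years.
    """
    doublings_needed = math.log2(target_horizon_hours / current_horizon_hours)
    if doublings_needed <= 0:
        return 0.0

    elapsed = 0.0
    doubling_time = first_doubling_years
    remaining = doublings_needed
    while remaining > 0:
        # A final partial doubling is charged proportionally (an approximation).
        step = min(1.0, remaining)
        elapsed += step * doubling_time
        remaining -= step
        doubling_time *= growth_factor
    return elapsed


if __name__ == "__main__":
    # Hypothetical inputs: 10 doublings needed, first doubling takes 0.5 years.
    for g in (0.9, 1.0, 1.1):
        years = years_to_reach(1024.0, 1.0, 0.5, g)
        print(f"growth_factor={g}: {years:.2f} years")
```

Even in this toy version, small changes in the growth factor move the arrival date by years, which is why the result is so sensitive to it and to the assumed target horizon.
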
- Thomas Kwa 19 Jan 2026 23:29 UTC
  LW: 4 AF: 3
  0
  The simple model I mentioned on Slack (still WIP, hopefully to be written up this week) tracks capability directly in terms of labor speedup and extrapolates that. Of course, for a more serious timelines forecast you have to ground it in some data.
  Here’s what I said to Eli on Slack; I don’t really have more thoughts since then:
  we can get f_2026 [uplift fraction in 2026] from:
    - transcripts of realistic Cursor usage + success judge + difficulty judge calibrated on tasks of known lengths
    - an uplift study
    - asking lab people about their current uplift (since parallel uplift and 1/(1-f) are equivalent in the simple model)
  v [velocity of automation as capabilities improve] can be obtained by:
    - guessing the distribution of tasks, using time horizon, maybe using a correction factor for real vs benchmark time horizon
    - multiple uplift studies over time
    - comparing older models to newer ones, or having them try things people use 4.5 Opus for
    - listing how many things get automated each year
  - Daniel Kokotajlo 20 Jan 2026 19:27 UTC
    LW: 2 AF: 2
    0
    Nice. Yeah, I'm also excited about coding uplift as a key metric to track. It would probably make time horizons obsolete (or at least constitute a significantly stronger source of evidence than time horizons). We at AIFP don’t have the capacity to estimate the trend in uplift over time (I mean, we can do small-N polls of frontier AI company employees...) but we hope someone does.
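
The equivalence mentioned upthread between parallel uplift and 1/(1-f) can be sketched as a simple conversion. This is only an illustration under an Amdahl's-law-style assumption (a fraction f of the work is fully automated at negligible cost, and the remaining 1-f is done at normal speed). The write-up of the simple model is still WIP, so its actual definitions may differ, and the function names here are hypothetical.

```python
def uplift_from_fraction(f):
    """Speedup implied by automating a fraction f of the work (assumed 0 <= f < 1)."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must satisfy 0 <= f < 1")
    return 1.0 / (1.0 - f)


def fraction_from_uplift(uplift):
    """Inverse conversion: the automated fraction implied by a reported uplift >= 1."""
    if uplift < 1.0:
        raise ValueError("uplift must be at least 1")
    return 1.0 - 1.0 / uplift


if __name__ == "__main__":
    # Hypothetical example: a reported 2x uplift corresponds to f = 0.5.
    print(fraction_from_uplift(2.0))
    print(uplift_from_fraction(0.5))
```

Under this assumption, an uplift number gathered by polling or from an uplift study can be converted into f and back, so the two kinds of evidence can be compared on one scale.
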