Thomas Kwa comments on Thomas Kwa’s Shortform

Thomas Kwa 6 Jan 2026 9:49 UTC
2 points
(in the usual units, this means that the plot of log(2025-FLOP per FLOP) vs log(researcher-hours) is a straight line with slope $r$.) A plot that curves downward or “hits a wall” seems like evidence against this model’s applicability to the data.
Note that there are no log-log plots in the data. The plots are performance vs LoC and log(performance) vs LoC, and the same for stars. I don’t think we’re at an absolute ceiling, since two more improvements came out in the past week; they’ve just gotten smaller and taken more code to implement.
I need to think about this claim that algorithmic progress is 10x/year. It feels like some assumptions are violated, given how inconsistent the answers from the data seem to be; maybe there’s a prospective vs retrospective difference. Or do you think progress has just sped up in the past couple of years?
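
For concreteness, here is a minimal sketch of the check the quoted model implies: fit a line to log(efficiency) vs log(researcher-hours) and see whether the local slope stays roughly constant. Everything in it is an assumption for illustration. The numbers are synthetic (not the data discussed here), the exponent 0.5 and the saturating curve are arbitrary, and `fit_loglog_slope` and `segment_slopes` are hypothetical helper names.

```python
import math


def fit_loglog_slope(hours, efficiency):
    """Least-squares slope and intercept of log(efficiency) vs log(hours)."""
    xs = [math.log(h) for h in hours]
    ys = [math.log(e) for e in efficiency]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def segment_slopes(hours, efficiency):
    """Log-log slope between each pair of consecutive points."""
    slopes = []
    for i in range(len(hours) - 1):
        dx = math.log(hours[i + 1]) - math.log(hours[i])
        dy = math.log(efficiency[i + 1]) - math.log(efficiency[i])
        slopes.append(dy / dx)
    return slopes


# Synthetic inputs (assumed, not taken from any real dataset).
hours = [10.0, 100.0, 1000.0, 10000.0]

# Case 1: a pure power law, efficiency = hours ** 0.5.
power_law = [h ** 0.5 for h in hours]
power_slope, _ = fit_loglog_slope(hours, power_law)
power_segments = segment_slopes(hours, power_law)

# Case 2: a curve that saturates ("hits a wall"), efficiency = h / (h + 100).
wall = [h / (h + 100.0) for h in hours]
wall_segments = segment_slopes(hours, wall)

print("power-law fitted slope:", power_slope)
print("power-law segment slopes:", power_segments)
print("saturating segment slopes:", wall_segments)
```

Under the power law, the segment slopes all match the fitted slope. Under the saturating curve, they shrink as hours grow, which is what a plot that "curves downward" looks like in log-log space. This only applies once the data are on log-log axes, which the plots here are not.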
- bhalstead 7 Jan 2026 5:09 UTC
  1 point
  Progress probably has sped up in the past couple of years. And training compute scaling has, if anything, slowed down (it hasn’t accelerated, anyway). So yes, I think “software progress” probably has sped up in the past couple of years.
  
  I haven’t looked into whether you can see the algorithmic progress speedup in the ECI data using the methodology I was describing. The data would be very sparse if you e.g. tried to restrict to pre-2024 models for greater alignment with the Algorithmic Progress in Language Models paper, which is where the 3x per year number comes from.
  
  Also, that 3x per year number is only measuring pre-training improvements. Post-training (1) didn’t really exist before 2022 and (2) was notably accelerated in 2024 by the introduction of RLVR. I wouldn’t be confident in whether pre-training algorithmic progress alone is much faster than 3x per year today. (As rumor would have it, there’s substantial divergence between the different AGI companies on the rate of pre-training progress.)