A major plausible class of worlds in which we don’t get superhuman coders by the end of 2026 is the class where the METR trend continues at roughly the same slope it had in 2025, or only a slightly steeper one. Right?
Yes, but 2025 saw two trends, Claude 3.5 Sonnet to o3 and o3 to GPT-5.1-Codex-Max, with different doubling times. IIRC the earlier trend would have superhuman coders appear by 2028, while the later trend (which was arguably invalidated by Claude 4.5 Opus and its ~5h time horizon; see, however, two comments pointing out that the METR benchmark is no longer as trustworthy as it once was, and my potential explanation of the abnormally high 50%/80% time horizon ratio) had them arrive in 2030 or outright hit a wall[1] before becoming superhuman.
As for the OP’s idea that coding agents are used to improve coding agents until the SC is reached, this could be unlikely because they don’t improve the underlying LLM. I remember the now-obsolete benchmarks-and-gaps model, which required the SCs not just to saturate RE-bench but to learn to actually do long tasks and handle complex codebases, which in turn requires either a big attention span of the LLM itself or careful summarisation of each method’s specification, formatting, the other methods’ names, etc.
P.S. The latter scenario would be particularly difficult to predict as it might involve the time horizon in the METR sense behaving like $\frac{e^{ct}}{e^{ct_\infty} - e^{ct}}$. In this case the horizon would grow ~exponentially until the very last couple of doublings.
Or the models’ reasoning could become neuralese, with consequences as disastrous as the lack of a Safer-1 on which to test alignment.
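To illustrate the functional form from the P.S. above, here is a minimal numerical sketch (the constants c and t_inf are made up for illustration, not fitted to METR data): the curve tracks a plain exponential almost until t_inf and only departs from it in the final doublings, which is what would make the break hard to see coming.

```python
import numpy as np

# Illustrative constants only (not fitted to METR data): c is the underlying
# exponential rate, t_inf the date at which the horizon would diverge.
c, t_inf = 0.14, 48.0   # rate per month, months from now

def horizon(t):
    """Horizon of the form e^{ct} / (e^{c*t_inf} - e^{ct})."""
    return np.exp(c * t) / (np.exp(c * t_inf) - np.exp(c * t))

for t in range(0, 48, 6):
    exact = horizon(t)
    approx = np.exp(c * (t - t_inf))   # the plain-exponential regime
    print(f"t={t:2d} mo   horizon={exact:10.5f}   plain exponential={approx:10.5f}")
```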
When looking for trend breaks in time series, it’s unwise to rely on eyeballing when the Quandt likelihood ratio test, a.k.a. the sup-Wald test, has existed for 65 years (google it or ask an LLM to explain it in layman’s terms).
I pulled the METR data and asked Gemini 3 Flash to vibecode the test, and there is a statistically significant break (peak F-statistic = 7.79, corresponding to a p-value of about 0.03) at Claude 3.5 Sonnet, from an ~8-month to a ~5-month doubling time, but not after it.
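For anyone who wants to reproduce this rather than eyeball it, here is a minimal sketch of the computation as I understand it (my own variable names, not METR’s data schema or Gemini’s code): regress log2 of the time horizon on release date, compute a Chow-style F-statistic at every candidate break in the trimmed interior of the sample, and take the maximum. The peak statistic should be compared against Andrews-style sup-F critical values, not an ordinary F table.

```python
import numpy as np

def sup_wald(dates, log2_horizon, trim=0.15):
    """Quandt likelihood ratio (sup-Wald) test: fit log2(horizon) ~ a + b*date,
    compute a Chow-style F-statistic at every candidate break in the trimmed
    interior of the sample, and return the largest one with its location."""
    x, y = np.asarray(dates, float), np.asarray(log2_horizon, float)
    n, k = len(y), 2                     # k = params per regime (intercept, slope)

    def rss(xs, ys):
        X = np.column_stack([np.ones_like(xs), xs])
        beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
        resid = ys - X @ beta
        return resid @ resid

    rss_pooled = rss(x, y)
    lo, hi = int(n * trim), int(n * (1 - trim))
    best_f, best_i = -np.inf, None
    for i in range(lo, hi):              # candidate break after observation i
        rss_split = rss(x[:i], y[:i]) + rss(x[i:], y[i:])
        f = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
        if f > best_f:
            best_f, best_i = f, i
    # Compare best_f to sup-F critical values (Andrews 1993), not ordinary F tables.
    return best_f, best_i
```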
Using @ryan_greenblatt’s updated 5-month doubling time: we reach the 1-month horizon from AI 2027 in ~5 doublings (Jan 2028) at 50% reliability, and in ~8 doublings (Apr 2029) at 80% reliability. If I understand correctly, your model uses 80% reliability while also requiring the models to be 30x cheaper and faster than humans. It does seem like, if the trend holds, by mid-2029 the models wouldn’t be much more expensive or slower. But I agree that if a lab tried to demonstrate a “superhuman coder” on METR by the end of next year using expensive scaffolding / test-time compute (similar to o1 on ARC-AGI last year), it would probably exceed 30x human cost, even if it were already 30x faster.
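For concreteness, the arithmetic behind those dates is roughly the following back-of-the-envelope (the starting horizons of ~5 h at 50% reliability and ~36 min at 80%, as well as the late-2025 start date, are my own illustrative assumptions; the target is AI 2027’s 167-hour work-month):

```python
import math
from datetime import date, timedelta

# Back-of-the-envelope reproduction of the doubling arithmetic.
# Assumed (for illustration only): current 50%-reliability horizon ~5 h,
# current 80%-reliability horizon ~36 min, start date late 2025,
# target = 167 h (one work-month as in AI 2027), doubling time = 5 months.
DOUBLING_MONTHS = 5
WORK_MONTH_H = 167
START = date(2025, 12, 1)

def months_to_reach(current_h, target_h=WORK_MONTH_H, doubling=DOUBLING_MONTHS):
    doublings = math.log2(target_h / current_h)
    return doublings, doublings * doubling

for label, current in [("50% reliability", 5.0), ("80% reliability", 0.6)]:
    d, m = months_to_reach(current)
    eta = START + timedelta(days=30.44 * m)   # average month length in days
    print(f"{label}: ~{d:.1f} doublings, ~{m:.0f} months, ETA ~{eta:%b %Y}")
```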
The thing METR is measuring seems slightly different from “superhuman coder”. My understanding is that they’re dropping an AI into an unfamiliar codebase and telling it to do something with no context or design help, so this is partially software architecture, partially coding. On pure coding tasks, Claude Code is clearly superhuman already.
I spent a few hours over the last few days collaborating with Claude on design docs and some general instructions, then having it go through massive todo lists fully autonomously[1]. This is weeks of coding and it did it in a few hours (mostly slowed down by me getting around to giving it more work).
This is the first time I’ve had it do tasks of this scale so I’m not doing anything special, just having it propose a design, telling it which parts I want done differently, then having it make a todo list and execute it.
Example prompt:
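Can you go through @TODO.md, delegating each task to opus subagents and ensuring that they understand all of the necessary context and implement the task, check it off, and commit it, then move onto the next task until the list is done?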