mishka comments on Analyzing A Critique Of The AI 2027 Timeline Forecasts

mishka 24 Jun 2025 22:26 UTC
6 points
Thanks for writing this!

Of course, we are still missing METR-style evaluations measuring the ability to complete long tasks for all recent systems (Claude 4, newer versions of Gemini-2.5, and, importantly, for agentic frameworks such as OpenAI Codex, Claude Code and similar systems).

When we obtain those evaluations, we’ll have a better understanding of the shape of the curve: whether the doubling period keeps shrinking and, if so, how rapidly it shrinks…