Me too. METR has yet to reveal anything about models other than Claude Sonnet 4.5 and GPT-5.1 Codex Max, aside from the evidence extracted by Jurkovic. (You didn’t mention GPT-5.1 Codex Max. Claude Sonnet 4.5 was never SOTA to begin with and could be unusable for the graph, whereas the GPT-5.1 Codex Max result led someone to add the data point to the AI-2027 graph and Kokotajlo to notice the likely return of the 7-month doubling trend.) But I doubt that “this kind of extensive work can hardly keep up with the release of new models providing new data”, since updating the parameters would likely require mere days, if not minutes, of thinking per data point. See, e.g., Greenblatt’s quick take about the GPT-5-related forecast and my two comments there, or my post on a worrisome trend which could have been invalidated by new models.
You’re right. Sonnet 4.5 was impressive at launch, but the focus of AI 2027 is on coding-oriented models.