Introducing three notable classes of model sizes (Sonnet, Opus, above-Opus) is possibly a consequence of Anthropic needing to feed datacenters built around three different classes of servers during the Claude 5 lifecycle: the smaller 8-chip Nvidia servers (H100/H200/B200), rack-scale Trainium 2, and TPUv7, each able to efficiently serve larger models than the previous one. Meanwhile, OpenAI until very recently was stuck with mostly the 8-chip Nvidia servers and so had to use smaller models (they couldn’t serve their own Opus-class model efficiently), and only now are they getting enough GB200/GB300 Oberon racks to offer an Opus-class flagship model soon. The Blackwell Oberon racks are better than Trainium 2, though, so all else equal there’s some advantage to what OpenAI will be able to serve compared to Opus 4. And based on GPT-5.4 (which is likely in Sonnet’s weight class), OpenAI might currently be better than Anthropic at RLVRing capabilities into models of a given size, so OpenAI’s new Opus-class model might end up notably better than Opus 4. But by that time, or a bit later, Opus 5 will be released, so even if these considerations are on point, it’s still unclear which lab wins in the Opus weight class during late 2026.
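To make the serving-efficiency point concrete, here’s a minimal sketch of aggregate HBM per scale-up domain, the set of chips joined by a fast interconnect within which a model’s weights and KV cache have to fit for efficient inference. The per-chip memory figures are approximate public numbers, and the chip counts per domain are my own illustrative assumptions rather than part of the argument above.

```python
# Rough aggregate HBM per scale-up domain. Per-chip HBM figures are approximate
# public numbers; chip counts per domain are illustrative assumptions.
domains = {
    "8x H200 (Nvidia server)":   (8,   141),  # (chips, GB of HBM per chip)
    "8x B200 (Nvidia server)":   (8,   192),
    "Trn2 UltraServer":          (64,   96),  # rack-scale Trainium 2
    "GB200 NVL72 (Oberon rack)": (72,  192),
    "TPUv7 slice (256 chips)":   (256, 192),  # TPU slices can scale far larger
}

for name, (chips, hbm_gb) in domains.items():
    total_tb = chips * hbm_gb / 1024
    print(f"{name:28s} ~{total_tb:5.1f} TB HBM")
```

On these rough numbers, each successive class of hardware gives a much larger memory pool to spread a model over, which is the sense in which it can serve a larger weight class efficiently.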
Based on hardware considerations, I expect the per-token price of Anthropic’s above-Opus class model to start out high, maybe 4x the price of Opus 5 (which itself probably won’t change much from Opus 4’s pricing), because they’d initially need to serve it on suboptimal hardware, and then to come down to maybe 2x the price of Opus 5 by the end of the year once the TPUv7 datacenters go online. This is what happened with Opus 4 over 2025, as Trainium 2 datacenters came online later in the year.
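To make the multiples concrete, here’s a minimal sketch assuming Opus 5 inherits Opus 4’s published rates of $15/$75 per million input/output tokens; the resulting dollar figures are purely illustrative of the 4x to 2x trajectory guessed above, not a prediction of actual list prices.

```python
# Illustrative price trajectory for the above-Opus tier. Assumes Opus 5 keeps
# Opus 4's published rates; the 4x and 2x multiples are the guesses from the text.
opus_price = {"input": 15.0, "output": 75.0}  # $ per million tokens

for label, multiple in [("above-Opus at launch", 4), ("above-Opus after TPUv7 ramp", 2)]:
    prices = {k: v * multiple for k, v in opus_price.items()}
    print(f"{label}: ${prices['input']:.0f} in / ${prices['output']:.0f} out per M tokens")
```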
If the above-Opus model represents the next level of pretraining compared to Opus 4 and Gemini 3 Pro, there’s potential for novel observations about what that does to the texture of capabilities. It’s the kind of thing that will predictably scale further soon without requiring algorithmic breakthroughs, and it’s not even clear that pure scaling of RLVR should be expected to deliver more phase changes in capabilities in the near future than pretraining does (even if the expected number of phase changes is less than one for either, until 2032 or so).