Update with Sonnet 4.5?
No published benchmark that I’m aware of. The Anthropic employee who runs the stream has switched it to Sonnet 4.5, but it’s actually doing worse than Opus 4.1, which got permanently stuck in the early mid-game like every previous Claude model.