Cole Wyeth comments on So how well is Claude playing Pokémon?

Cole Wyeth 4 Oct 2025 22:01 UTC
4 points
0
Update with Sonnet 4.5?
- Julian Bradshaw 5 Oct 2025 6:19 UTC
  4 points
  0
  No published benchmark that I’m aware of. The Anthropic employee who streams has updated their stream to use Sonnet 4.5, but it’s actually doing worse than Opus 4.1, which got permanently stuck in the early mid-game, like every previous Claude model.