MrCheeze comments on Is Gemini now better than Claude at Pokémon?

MrCheeze 21 Apr 2025 20:54 UTC
2 points
0
(Gemini did actually write much of the Gemini_Plays_Pokemon scaffolding, but only in the sense of doing what David told it to do, not designing and testing it.)
I think you’re right that an LLM coding its own scaffolding is probably more achievable than one playing the game like a human, but I don’t think current models can do it. Watching the streams, the models don’t seem to understand their own flaws, although admittedly they haven’t been prompted to focus on this.
- Ozyrus 22 Apr 2025 7:33 UTC
  1 point
  0
  Not being able to do it right now is perfectly fine; it still warrants setting it up to see exactly when they will start to be able to do it.