(Gemini did actually write much of the Gemini_Plays_Pokemon scaffolding, but only in the sense of doing what David told it to do, not designing and testing it.)
I think you’re right that an LLM coding its own scaffolding is probably more achievable than one playing the game like a human, but I don’t think current models can do it. Watching the streams, the models don’t seem to understand their own flaws, although admittedly they haven’t been prompted to focus on this.
Not being able to do it right now is perfectly fine; it still warrants setting it up to see exactly when they start being able to do it.