Yes, they both started with the same harness, but there’s room for each model to customize its own setup, so I’m not sure how much they have diverged over time. I have 4x speedup as probably an upper bound, but I was only counting since the final 2.5 stable release in June, which might be too short a window. Gemini 2.5 has 6 badges now compared to yesterday, so it’s probably too early to assume 4x is certain. But if it were 4x every 8 months, then it should be able to match average human playtime by early 2027.
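To make the extrapolation explicit, here is a rough sketch of the arithmetic, assuming a constant multiplicative speedup per fixed period. The function name and the 16x gap in the example are hypothetical placeholders, not measured numbers; plug in whatever current ratio of model playtime to human playtime you believe.

```python
import math

def months_to_close_gap(gap_ratio: float, speedup: float = 4.0,
                        period_months: float = 8.0) -> float:
    """Months until a playtime gap closes under a constant speedup rate.

    gap_ratio: current model playtime divided by average human playtime.
    speedup: factor by which the model gets faster every period_months.
    """
    if gap_ratio <= 1.0:
        return 0.0  # already at or below human playtime
    # Solve speedup ** (t / period_months) == gap_ratio for t.
    return period_months * math.log(gap_ratio) / math.log(speedup)

# Hypothetical example: a 16x gap is two 4x steps, so about 16 months.
print(round(months_to_close_gap(16.0), 1))
```

If the true rate is lower than 4x, the date moves out quickly: at 2x every 8 months the same hypothetical gap would take twice as long.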
From the Gemini_Plays_Pokemon Twitch channel:
“v2 centers on a smaller, flexible toolset (Notepad, Map Markers, code execution, on‑the‑fly custom agents) so Gemini can build exactly what it needs when it needs it.”
“The AI has access to a set of built-in tools to interact with the game and its own internal state:
notepad_edit: Modifies an internal notepad, allowing the AI to write down strategies, discoveries, and long-term plans.
run_code: Executes Python code in a secure, sandboxed environment for complex calculations or logic that is difficult to perform with reasoning alone.
define_map_marker / delete_map_marker: Adds or removes markers on the internal map to remember important locations like defeated trainers, item locations, or puzzle elements.
stun_npc: Temporarily freezes an NPC in place, which is useful for interacting with them.
select_battle_option: Provides a structured way to choose actions during a battle, such as selecting a move or using an item.
Custom Tools & Agents
The most powerful feature of the system is its ability to self-improve by creating its own tools and specialized agents. You can view the live Notepad and custom tools/agents tracker on GitHub.
Custom Tools (define_tool / delete_tool): If the AI identifies a repetitive or complex data-processing task, it can write and save its own Python scripts as new tools. For example, instead of relying on a pre-built pathfinder, it can write its own pathfinding tool from scratch to navigate complex areas like the spinner mazes in the Team Rocket Hideout.
Custom Agents (define_agent / delete_agent): For complex reasoning tasks, the AI can define new, specialized instances of itself without any distracting context. These agents are given a unique system prompt and purpose, allowing them to excel at specific challenges like solving puzzles or developing high-level battle strategies.”
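For a sense of what a self-written pathfinding tool for a spinner maze might look like, here is a minimal breadth-first search sketch. This is not Gemini's actual tool, and the tile model is a simplifying assumption of mine: arrow tiles (`^ v < >`) push the player in that direction until a wall or a stop tile (`o`), `#` is a wall, and `.` is plain floor. All names here are hypothetical.

```python
from collections import deque

DIRS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
SPIN = {"^": "U", "v": "D", "<": "L", ">": "R"}  # assumed spinner tiles

def passable(grid, r, c):
    return 0 <= r < len(grid) and 0 <= c < len(grid[r]) and grid[r][c] != "#"

def settle(grid, r, c):
    """Follow spinners from (r, c); return the resting cell, or None on a loop."""
    seen, heading = set(), None
    while True:
        tile = grid[r][c]
        if tile in SPIN:
            heading = SPIN[tile]          # spinner sets or changes the slide direction
        elif tile == "o" or heading is None:
            return (r, c)                 # stop tile, or not sliding at all
        if (r, c, heading) in seen:
            return None                   # endless spinner loop
        seen.add((r, c, heading))
        nr, nc = r + DIRS[heading][0], c + DIRS[heading][1]
        if not passable(grid, nr, nc):
            return (r, c)                 # slide ends against a wall
        r, c = nr, nc

def find_path(grid, start, goal):
    """Return the shortest list of button presses from start to goal, or None."""
    queue = deque([start])
    came_from = {start: None}             # cell -> (previous cell, button pressed)
    while queue:
        cell = queue.popleft()
        if cell == goal:
            moves = []
            while came_from[cell] is not None:
                cell, button = came_from[cell]
                moves.append(button)
            return moves[::-1]
        for button, (dr, dc) in DIRS.items():
            nr, nc = cell[0] + dr, cell[1] + dc
            if not passable(grid, nr, nc):
                continue
            rest = settle(grid, nr, nc)
            if rest is not None and rest not in came_from:
                came_from[rest] = (cell, button)
                queue.append(rest)
    return None

maze = [
    "#######",
    "#.>.o.#",
    "#.#.#.#",
    "#...#.#",
    "#######",
]
print(find_path(maze, (3, 1), (1, 5)))  # -> ['U', 'U', 'R', 'R']
```

The search treats each button press as one step and each resting cell as one state, so a long forced slide still counts as a single move.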