Grok 4 was able to guess my rule of “three rational numbers.” Haven’t tested out other models yet.
https://grok.com/share/c2hhcmQtMw%3D%3D_748b1b41-eda9-4619-868e-5bb4cb022d50
EDIT: Claude Opus 4 is also able to guess the rule on the first attempt.
https://claude.ai/share/4dcd8fcf-4fcb-4d48-a18f-70c56a9c4be7
GPT-5 still loses at tic-tac-toe in the typical way, but GPT-5-thinking does much better: it blocks the initial fork. I tested it by opening another fork rather than playing for the optimal draw, and it beat me. Its chain of thought (CoT) before the final move seems very discordant, though. Chat below.
https://chatgpt.com/share/68999afc-5378-8004-a9f0-588c7e2a183d
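For anyone unsure what I mean by a "fork": it is a move that creates two winning threats at once, so the opponent can only block one of them. Here is a rough Python sketch of how one could detect fork moves on a 3x3 board. The board encoding (a 9-character string with "." for empty cells) and the names `LINES`, `threats`, and `fork_moves` are my own choices for illustration, not anything taken from the chats above.

```python
# The 8 winning lines on a 3x3 board, with cells indexed 0..8 row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

EMPTY = "."  # assumed marker for an empty cell


def threats(board, player):
    """Return the set of empty cells where `player` would complete a line."""
    cells = set()
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(EMPTY) == 1:
            cells.add(line[marks.index(EMPTY)])
    return cells


def fork_moves(board, player):
    """Return the empty cells where a move by `player` creates 2+ threats."""
    forks = []
    for i, cell in enumerate(board):
        if cell != EMPTY:
            continue
        trial = board[:i] + player + board[i + 1:]
        if len(threats(trial, player)) >= 2:
            forks.append(i)
    return forks


# X holds two opposite corners, O holds the center.
print(fork_moves("X...O...X", "X"))
```

In that example position, X threatens to fork from either remaining corner, which is the kind of setup a model has to notice and defuse before it is too late.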
EDIT: I probably didn’t play optimally, but I let GPT-5-thinking go first in 4x4x4 tic-tac-toe and it beat me.
https://chatgpt.com/share/689ab471-eca0-8004-ac34-b47b3af48c36