I would assume this is because wasting time (which is to the detriment of your opponent, and which he cannot control) in the first example is a not instrumental to achieving your goal. It is merely a side-effect. “Thou shall not profit from wasting time”.
If playing optimally involves making decisions that make the game go longer (such as waiting to draw additional countermagic or whatever), so be it.
That said, I’m surprised Zvi said “match wp” here—I assume this is an oversight on his part. He should just have written “game wp”.
It seems that this is what we are doing. Does not solving a difficult problem involve thinking about what kind of things you are good at, for instance? I find myself contemplating my own strengths and weaknesses and reactions (self-mdoelling) a lot while solving agency tasks, and that is how we are training llms, in the “touching of the horns”-phase (ie. RL).
Furthermore, I’ve met too many people who think of AIs like the Owners in the story do to view this as propganda.