Suppose we compare that whole function with Mr. Neumann’s function, and compare how good are the probable moves you’d make versus him making. On most chess positions, Mr. Neumann’s move would probably be better. [...] That’s the detailed complicated actually-true underlying reality that explains why the Elo system works to make excellent predictions about who beats who at chess.
This explanation is bogus. (Obviously, the conclusion that Elo scores are practically meaningful is correct, but that’s not an excuse.)
Mr. Humman could locally-validly reply that Tessa is begging the question by assuming that there’s a fact of the matter as to one move being “better” than another in a position. Whether a move is “good” depends on what the opponent does. Why can’t there be a rock-paper-scissors–like structure, where in some position, 12. …Ne4 is good against positional players and bad against tactical players?
Earlier, Tessa does appeal to player comparisons being “mostly transitive most of the time”—but only as something that “didn’t have to be true in real life”, which seems to contradict the claim that some moves in a position are better on the objective merits of the position, rather than merely with respect to the tendencies of some given population of players.
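To make the transitivity point concrete, here is a minimal sketch of the standard Elo expected-score formula. The three ratings are hypothetical numbers chosen only for illustration. Because each player is summarized by a single number, the model's predictions are transitive by construction: it cannot represent a rock-paper-scissors cycle, so its predictive success is evidence that real chess matchups are mostly ordered along one dimension.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win = 1, draw = 0.5) for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, for illustration only.
ratings = {"A": 1800.0, "B": 1600.0, "C": 1400.0}


def favored(x: str, y: str) -> bool:
    """True if the Elo model expects x to outscore y."""
    return elo_expected_score(ratings[x], ratings[y]) > 0.5


# Predictions depend only on the rating difference, so if A is favored over B
# and B over C, then A is necessarily favored over C.
print(favored("A", "B"), favored("B", "C"), favored("A", "C"))
```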
The actual detailed complicated actually-true underlying reality is that by virtue of being a finite zero-sum game, chess fulfills the conditions of the minimax theorem, which implies that there exists an inexploitable strategy. You can have rock-paper-scissors–like cycles among particular strategies, but the minimax strategy does no worse than any of them.
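As a toy illustration (using literal rock-paper-scissors as a stand-in, not chess), the sketch below checks that each pure strategy can be exploited by some opponent, while the minimax strategy, here the uniform mix, guarantees at least the value of the game against anything. The helper names are my own. One caveat on the analogy: chess is a perfect-information game, so its minimax strategy needs no randomization; the mixing is only needed for simultaneous-move games like this one.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# PAYOFF[i][j]: payoff to the row player for MOVES[i] against MOVES[j].
PAYOFF = (
    (0, -1, 1),   # rock loses to paper, beats scissors
    (1, 0, -1),   # paper beats rock, loses to scissors
    (-1, 1, 0),   # scissors loses to rock, beats paper
)


def expected_payoff(row_mix, col_mix):
    """Expected payoff to the row player when both sides play mixed strategies."""
    return sum(
        row_mix[i] * col_mix[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


def pure(i):
    """Mixed-strategy vector that always plays MOVES[i]."""
    return tuple(Fraction(1) if k == i else Fraction(0) for k in range(3))


def worst_case(row_mix):
    """Payoff against the most damaging reply. Checking pure replies suffices,
    because expected payoff is linear in the opponent's mix."""
    return min(expected_payoff(row_mix, pure(j)) for j in range(3))


uniform = (Fraction(1, 3),) * 3

for i, move in enumerate(MOVES):
    print(move, "worst case:", worst_case(pure(i)))
print("uniform mix worst case:", worst_case(uniform))
```

Every pure strategy has a worst case of -1, and the uniform mix has a worst case of 0, which is the value of the game: the cycle among particular strategies coexists with a strategy that none of them can exploit.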
The implications for real-world non-perfect play are subtler. As a start, Czarnecki et al. 2020 (of DeepMind) suggest that “Real World Games Look Like Spinning Tops”: there’s a transitive “skill” dimension along which higher-skilled strategies beat lower-skilled ones, but at any given skill level, there’s a non-transitive rock-paper-scissors–like plethora of strategies, which explains how players of equal skill can nevertheless have distinctive styles. The size of the non-transitive dimension thins out as skill increases (away from the “base” of the top; see the figures in the paper).
This picture seems to suggest that rather than being total nonsense, the problem with Humman’s worldview is in his attribution of it to the “top tier”. Non-transitivity is real and significant in human life—but gradually less so as we approach the limit of optimality.
This story would have benefited from being edited by a chess player. I think one of the better players in even a “medium-small town” with “a thriving chess club as one of its central civic institutions” would know more about the game than the author seems to. (The chess writing seemed off to me, and I am significantly worse than a serious club player.)
“I thought at first it was a mistake, for you to castle so early” is a weird thing for Humman to say. Castling early is standard default beginner advice. Even if there was some unusual feature of the opening that made it a bad choice in this game, you wouldn’t use the word “so” in that sentence.
It’s weird for Assi to describe Humman’s play as using “particular tactics”, and then to (insincerely) compliment him for “doing well at one-move lookahead” and not “unforcedly throwing away material right on your next move”. Tactics are short sequences of moves that work together to achieve a goal. (An example I keep falling for in bullet games with the Englund gambit accepted (1. d4 e5?! 2. dxe5) opening: Black’s dark-square bishop is on d6, White’s queen is still on d1, the d-file is open due to accepting the Englund gambit, and Black castles queenside to put a rook on d8. Black sacrifices the bishop with Bh2+, revealing a discovered attack of the Black rook on the White queen, which White can’t do anything about because they have to use their move to deal with the check.) If a player is at the level of using “particular tactics”, an IM who wants to compliment them for social reasons shouldn’t find it difficult (to the point of giving up after “a dozen seconds of” “twist[ing] his brain around”) to find something concrete and nice to say that’s less patronizing than “at least you’re not hanging pieces.”
(Also, the Ethiopian isn’t a real opening; a cutesy fake detail like that feels out of place mixed in with real details like IMs needing an Elo of 2400, and I’d expect a club player to have heard of simuls.)
Do these flaws matter, given that the story isn’t really about chess? I argue that they do matter, because a story that is about the folly of misperceiving how high skill ladders go should take basic care to get the details right concerning the skill ladder of its notional real-world example. (An earlier draft of this comment continued, “particularly in 2025 when basic care is so cheap. In the story, Tessa has no qualms about using LLMs to fill in domain knowledge gaps; why doesn’t Yudkowsky?”, but when I checked, Claude Sonnet 4.5 didn’t anticipate my criticism.)