In a post on Solomonoff Induction (and also in this wiki entry), Yudkowsky describes Shannon’s minimax algorithm for searching the entire chess game-tree as an example of conceptual progress. Previously, Edgar Allan Poe had argued that it was impossible in principle for a machine to play chess well. With Shannon’s algorithm, it became possible in principle, just computationally infeasible.
However, even “principled” algorithms like minimax search don’t take into account the possibility that your opponent knows things you don’t know (as I’ve previously discussed here). And so there are many elements of good chess strategy for bounded agents that they can’t account for:
When your opponent plays a move that seems surprisingly bad, update that they’re probably seeing some tactics that you’re missing.
When your opponent plays a move that seems surprisingly good, update that they’re probably more skilled than you thought.
Try to play tricky (“sharp”) lines when you’re behind, in the hope that your opponent makes a mistake.
Try to play solidly when you’re ahead, even if it narrows your lead.
Identify your opponent’s playing style, and try to exploit it.
Some of these considerations have been discussed by Yudkowsky under the heading of “Vingean uncertainty” (though as per my post linked above I’ll talk about the more general concept of Knightian uncertainty). And I’m sure that they’ve been acknowledged elsewhere as well—though I suspect that their importance is underrated because we don’t have good technical language for describing them. To build relevant intuitions, try playing a game against LeelaPieceOdds, a chess engine designed to beat humans even when it starts down a piece (or several). Playing it is very frustrating because you constantly have the sense that you should be winning, but you don’t know how to actually make use of your advantage. I’m interested in figuring out a theory of how to do so, as a step towards building a more general theory of how to behave under Knightian uncertainty.
Here’s one approach. For each position, estimate the value of that position conditional on reaching it in your game (which I’ll call the Knightian value of the position). That is, suppose you’re playing white, and you’re evaluating a possible move W1 and a possible response B1. Then your evaluation of the position after B1 should not only incorporate your “naive” estimate of how good the position is, but also account for the fact that your opponent has deliberately chosen to play B1 knowing that it’d lead to that state.
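As a toy illustration of that conditioning: suppose each of the opponent’s replies might hide a tactic we missed, and suppose the opponent picks moves softmax-greedily in their true values. Then the Knightian value of the reply they actually played is its posterior expected value given that they chose it. The softmax opponent model, the tactic bonus, and all the numbers below are assumptions for illustration, not anything established in the post.

```python
import itertools
import math

def knightian_value(naive, chosen, p_tactic=0.1, bonus=0.8, beta=8.0):
    """Posterior expected value (to the opponent) of the reply they chose.

    naive:  dict move -> our naive estimate of that reply's value to them
    chosen: the move the opponent actually played
    Model (assumed): each move independently hides a missed tactic with
    prior p_tactic, worth `bonus`; the opponent selects moves with
    probability proportional to exp(beta * true_value).
    """
    moves = list(naive)
    numerator = 0.0  # sum over hypotheses of P(hyp) * P(chosen | hyp) * v_chosen
    evidence = 0.0   # sum over hypotheses of P(hyp) * P(chosen | hyp)
    for hyp in itertools.product([0, 1], repeat=len(moves)):
        prior = 1.0
        vals = {}
        for move, has_tactic in zip(moves, hyp):
            prior *= p_tactic if has_tactic else (1 - p_tactic)
            vals[move] = naive[move] + (bonus if has_tactic else 0.0)
        z = sum(math.exp(beta * v) for v in vals.values())
        p_choose = math.exp(beta * vals[chosen]) / z
        evidence += prior * p_choose
        numerator += prior * p_choose * vals[chosen]
    return numerator / evidence
```

On a position where the opponent plays a move we naively rate at 0.1 while better options were available, the posterior value comes out far above 0.1: the surprising choice is itself strong evidence of a tactic we missed.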
According to Claude, this is analogous to a simple version of the Glosten-Milgrom model of bid-ask spreads on stock markets: Ask = E[Value | Buy order arrives], and Bid = E[Value | Sell order arrives]. How can you calculate those values? I’m not sure how financiers actually do it, but my immediate intuition is that you’d need a model of the opportunity cost of not investing in other things, and a model of how well-informed and “rational” other traders are, and then you’d need to somehow combine them to get an overall update.
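That analogy can be made concrete with the minimal two-value version of the Glosten-Milgrom model: a fraction of traders are perfectly informed, the rest trade at random, and the market maker quotes conditional expectations. The function name and parameters below are my own choices for a sketch.

```python
def glosten_milgrom_quotes(v_low, v_high, p_high, mu):
    """Toy Glosten-Milgrom quotes. The asset is worth v_high with prior
    probability p_high, else v_low. A fraction mu of traders are informed
    (buy iff value is high, sell iff low); the rest buy/sell 50/50.
    Returns (bid, ask) = (E[V | sell order], E[V | buy order])."""
    p_buy_given_high = mu + (1 - mu) / 2
    p_buy_given_low = (1 - mu) / 2
    p_buy = p_high * p_buy_given_high + (1 - p_high) * p_buy_given_low
    p_high_given_buy = p_high * p_buy_given_high / p_buy
    ask = p_high_given_buy * v_high + (1 - p_high_given_buy) * v_low

    p_sell_given_high = 1 - p_buy_given_high
    p_sell_given_low = 1 - p_buy_given_low
    p_sell = p_high * p_sell_given_high + (1 - p_high) * p_sell_given_low
    p_high_given_sell = p_high * p_sell_given_high / p_sell
    bid = p_high_given_sell * v_high + (1 - p_high_given_sell) * v_low
    return bid, ask
```

The bid-ask spread here is purely an adverse-selection effect: with no informed traders (mu = 0), an order carries no information and bid equals ask.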
The chess case is simpler in the sense that there’s only one adversary. But unlike in finance, there’s no “market rate” of opportunity cost. Instead, the Knightian value of every move will depend sensitively on how good the alternative moves available were. E.g. if I play a seemingly-pointless queen sacrifice when I have many other good moves available, then you should strongly suspect that it’s a trap. Whereas if I do the same in a position that is otherwise lost, you shouldn’t update nearly as much.
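One way to make that dependence concrete: a Bayes update on “this sacrifice is a trap”, where the probability of the opponent blundering into a genuinely bad sacrifice shrinks as their best alternative improves. The blunder model and constants are assumptions for illustration.

```python
import math

def trap_posterior(naive_sac, best_alt, p_trap=0.05, beta=6.0):
    """P(the sacrifice is a trap | opponent played it).

    If it's a trap, a competent opponent plays it with probability ~1.
    If it isn't, they only play it by mistake, and mistakes are assumed
    to be exponentially rarer the better their best alternative was.
    """
    p_play_if_trap = 0.95
    p_play_if_not = math.exp(-beta * max(best_alt - naive_sac, 0.0))
    num = p_trap * p_play_if_trap
    return num / (num + (1 - p_trap) * p_play_if_not)
```

With the sacrifice naively worth 0.1, the trap posterior is much higher when the best alternative was 0.6 (many good moves forgone) than when it was 0.15 (an otherwise lost position), matching the queen-sacrifice intuition above.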
Unfortunately, this makes things much more complicated. In order to estimate the “naive” value of a move, we need to look at all the moves downstream of it. But estimating the Knightian value also requires evaluating the naive value of all the moves upstream of it. Basically, every Knightian estimate depends on every other part of the game-tree. Now, we might still be able to find an algorithm which does this given enough compute—just like Shannon took chess from unsolved in principle to solved given enormously implausible amounts of compute. But this seems like it would defeat the point of updating on our opponent’s intentions at all—because knowing the entire game-tree should “screen off” the relevance of their intentions. (This feels kinda analogous to how fully understanding physics, biology, sociology, etc should screen off anthropic reasoning about how many alien civilizations exist.)
—
Here’s a more promising approach. There are some positions where my estimates of their value won’t change much conditional on the game reaching them. If I can see a clear mate in 1, then it’s far more likely that my opponent made a bunch of mistakes to get to this point, than that they have some clever plan I’m not seeing (though sometimes I think I have a mate in 1 when I actually don’t).
Intuitively speaking, the most sensible strategy seems to be to “bootstrap” from parts of the game-tree I can evaluate robustly, to then come up with more robust estimates of other parts of the game-tree. In order to figure out how to do that, we’ll need a more precise definition of robustness, and also some way of tying it to my estimate of my opponent’s skill level (though perhaps we could even define skill in terms of ability to win from positions that I’d thought were robustly losing? Worth thinking about.)
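As one crude stab at that definition (purely illustrative, including the tie to skill): call an estimate robust to the extent that it barely moves across a family of opponent models, e.g. evaluations computed against opponents of different skill levels.

```python
def robustness(value_by_skill):
    """One candidate definition of robustness: an evaluation is robust if
    it barely changes across a family of opponent models.

    value_by_skill: list of value estimates for the same position, one per
    modeled opponent skill level.
    Returns (mean estimate, robustness score in (0, 1]): score 1 means the
    estimate is identical under every model; wider spread means less robust.
    """
    mean = sum(value_by_skill) / len(value_by_skill)
    spread = max(value_by_skill) - min(value_by_skill)
    return mean, 1.0 / (1.0 + spread)
```

A mate-in-1 position evaluates near 1 against every opponent model, so it scores as highly robust; a sharp middlegame whose evaluation swings with opponent skill scores much lower, and bootstrapping would start from the former.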
I don’t currently know how to do this, but I want to gesture at three ideas that might be relevant. Firstly, there’s work on imprecise probabilities—for example replacing point estimates (like “55% chance of winning from this position”) with “credal sets” (like “45-65% chance of winning from this position”). You can then replace the credal set in turn with a “plausibility function” that describes how plausible each point estimate is (with estimates outside the credal set having 0 plausibility). I suspect that there’s some relationship between plausibility and robustness in the sense I defined above—though I’m not very familiar with the literature, and so I’m mainly going off some conversations I had with davidad.
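One sketch of how credal sets could plug into search (the representation is my assumption, not anything from those conversations): propagate [lo, hi] intervals of win probability instead of point values, taking coordinate-wise max at my nodes and coordinate-wise min at the opponent’s.

```python
def interval_minimax(node, maximizing):
    """Minimax over credal-set evaluations: each leaf carries an interval
    (lo, hi) of plausible win probabilities instead of a point estimate.

    node: either a (lo, hi) tuple (leaf) or a list of child nodes.
    For independent intervals, the max of a set of intervals is
    (max of lows, max of highs), and symmetrically for min.
    """
    if isinstance(node, tuple):
        return node
    children = [interval_minimax(child, not maximizing) for child in node]
    if maximizing:
        return (max(lo for lo, _ in children), max(hi for _, hi in children))
    return (min(lo for lo, _ in children), min(hi for _, hi in children))
```

Note how a decision rule like maximin would then prefer a tight interval such as (0.45, 0.5) over a wider (0.1, 0.6) line despite a lower midpoint, which is one way to formalize the “worse-looking but more solid” choice discussed above.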
Secondly, Peter Schmidt-Nelson is trying to use symmetry considerations to demonstrate that there’s no winning strategy for black in chess. I don’t think the project itself is directly relevant to the ideas I’ve been describing here, but it feels related in a philosophical sense.
Thirdly, although I’ve been talking about the “value” of a position as if it’s a well-defined concept, it mostly isn’t. Stockfish’s value calculations are grounded in the likelihood of it winning from that position when playing itself. But there’s no clear way to translate from that to the likelihood of winning against one’s actual opponent, which is what we’re interested in. I won’t discuss this further here, but trying to pin down how to estimate a position’s value in that sense seems potentially fruitful.
—
To finish, I want to highlight an interesting analogy. Personally speaking, I’m much more interested in playing Go than in playing chess. And because Go has many, many more possible board positions than chess, the considerations above are much more salient. But more than that, they seem to play out on two different levels: both within the game-tree, and on the board itself.
Specifically, when you’re playing Go, the board typically ends up divided into territories which are loosely or securely controlled by one player or the other player. When a territory is loosely controlled, it’s often possible to “invade” it and prevent the other player from scoring points within it. The more securely a territory is controlled, however, the less likely an invasion is to succeed (and the more the invading player will weaken their other nearby territories even if they do succeed).
When a territory is small, you can try to decide whether it can be invaded by evaluating many possible sequences of moves. However, for larger territories this is prohibitively difficult. Instead, you need to use your intuition to evaluate how secure the territory is overall, perhaps by evaluating a couple of the scariest possible invasions. It’s also possible to make “probing moves” which your opponents should respond to differently depending on how secure they think their territory is.
I hypothesize that a sufficiently good theory of how to search through the chess game-tree will also tell us about, not just how to search through the Go game-tree, but also what strategies to use on the Go board itself. It’ll need to incorporate concepts like “this region of the board/game-tree is robustly my territory” and “this region is high in expected value but not very robust” and “here’s the boundary between one region and another”. The interactions between different regions of the board are more complex and nuanced than between different regions of the game-tree, but some of the same principles will likely still apply. E.g. since players have limited time, building trust in one region of the game-tree trades off against building trust in another (just as defending one region of the Go board trades off against defending another). To exploit that, you can make “probing moves” which force them to decide if they actually trust their estimate of a sharp line of play, or if they’ll retreat to a line which seems worse but more solid for them. And so on.
This all feels like a set of hints towards a potential way of quantifying Knightian uncertainty in terms of boundaries and how permeable they are to adversaries—which might then be extended to much larger-scale “games” like internal conflicts within minds or even human geopolitical conflicts. One piece of evidence that this agenda is paying off would be if we could create a “good old-fashioned AI” Go engine which can play at a superhuman level without using any neural networks—though I expect that it’s worth trying to build a significantly better theoretical understanding before embarking on that project. (Edit: maybe worth starting off with something like Connect 4 for the sake of simplicity.)
However, I think that even “principled” algorithms like minimax search are still incomplete, because they don’t take into account the possibility that your opponent knows things you don’t know
It seems to me like you’re trying to solve a different problem. Unbounded minimax should handle all of this (in the sense that it won’t be an obstacle). Unless you are talking about bounded approximations.
Yeah, “incomplete” is the wrong word here. I edited shortly after posting to say instead “there are many elements of good chess strategy for bounded agents that they can’t account for”.
Stockfish’s value calculations are grounded in the likelihood of it winning from that position when playing itself.
Pedantically, it’s grounded in the likelihood of LeelaZero winning from that position when playing itself. Stockfish NNUE is trained on LZ data.
Previously the evaluations were completely human-crafted, grounded in repeated empirical testing of Stockfish and picking what works better. I suppose they’re still grounded in the latter.
You seem to be locating two things here:
the Knightian uncertainty of local positions within a game (“given this exact configuration, how much does my outcome depend on unknown opponent type, unknown payoff structure, unknown model of the world?”)
and
the Knightian uncertainty of some games relative to other games (“how much does the rule set, state space, opponent ontology, or payoff structure mutate as play continues?”)
I suspect that the two are related, and that when we calculate the Knightian uncertainty of local positions, we will need to factor in our uncertainty with regards to the type of game we’re playing.
For example: Tic-tac-toe has zero Knightian uncertainty: the state space is fully enumerated, optimal play is trivial, every competent player forces a draw—there’s no tactical complexity to amplify mistakes.
Chess has slightly more meta-game permeability. Players cannot change the board, cannot add new pieces, cannot redefine legal moves, cannot declare “this now counts as checkmate”, cannot invent a new win condition.
Go has the same fixed-rule structure, but the state space explodes. The game-tree is too large to fully reason about, so bounded reasoning dominates.
In poker/markets, rules are mostly fixed, but information asymmetry is now part of the actual game.
In diplomacy/geopolitics, rules are soft, not hard. Players can redefine objectives and also alter the payoff structure. The game itself can change while you are playing it.
Art has even more meta-game permeability (and therefore a higher game_uncertainty coefficient): artists invent new art forms, collapse genres, redefine what counts as art, etc. Every innovation becomes part of the game for everyone else. (Publishers in 1914-1939 to James Joyce: “This isn’t a novel.” James Joyce: “Fuck you yes it is”)
Romance is maybe the most Knightian domain humans ever engage in. Signals are ambiguous, there’s no stable payoff function, opponents mutate as agents (psychologically and physically) over time, one person’s move can redefine the entire domain, even the goal of the game is unclear, every interaction rewrites the other person’s internal model, the frame itself dissolves and reconstitutes with every gesture. There’s no discernible game tree – half the game is inventing what the game even is. Pickup artists understand this.
If the domain itself is unstable, every local position inherits that instability.
Tic-tac-toe gives no uncertainty even in the weirdest position.
Romance gives Knightian uncertainty even in the most trivial position.
So, we must model the domain-level Knightian coefficient first.
Maybe we can model the game as a decision process in domain G
Game G has
State space S
Action space A
Rule space R
Transition model T
Payoff function U
Opponent skill Theta
And we don’t know what type of opponent we face, or what payoff function is actually in play. So we have ambiguity sets within R, U, and Theta.
In Tic-tac-toe, all these sets would be singletons (zero ambiguity). In romance, none of them are.
We want to define:
Local ambiguity:
a Knightian-ness measure of G, evaluated at my local position under some distribution of opponent types and possible moves.
“How much does the outcome of this position vary depending on unknowns?”
Rule entropy
We also care about the fact that some domains let you change the rules while playing. Maybe we want to define a rule-entropy over the rule space.
“How many different rule sets are live, or could become live?”
Meta-move ratio
“How often do people take actions that change the rules instead of acting inside them?”
In tic-tac-toe: M(G) = 0 (no one can legally change anything).
In romance: M(G) is very large (people constantly renegotiate their desires, boundaries, even the “goal” of the relationship).
Knightian_Uncertainty(G) might be a combination of how ambiguous local positions are (given the current rule set + types), how uncertain the rules are, and how often agents actually modify those rules during play.
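The pieces above can be stubbed out in code just to make the shape concrete. Every name and the combination rule below are placeholders, not a proposed formalism.

```python
import math
from dataclasses import dataclass

@dataclass
class Domain:
    """A game/domain G with ambiguity sets over its uncertain components.
    (S, A, T are omitted here; only the ambiguous parts matter for this toy.)"""
    rule_sets: set        # ambiguity set over R: which rule sets are live
    payoffs: set          # ambiguity set over U: which payoff functions are live
    opponent_types: set   # ambiguity set over Theta
    meta_move_ratio: float  # M(G): fraction of moves that change the rules

def knightian_coefficient(g: Domain) -> float:
    """Placeholder combination: more live rule sets / payoffs / opponent
    types, and more rule-changing moves, mean a more Knightian domain.
    Zero exactly when every set is a singleton and no meta-moves occur."""
    ambiguity = math.log(len(g.rule_sets) * len(g.payoffs) * len(g.opponent_types))
    return ambiguity + g.meta_move_ratio
```

Under this stub, tic-tac-toe (all singleton sets, M(G) = 0) scores exactly zero, and any domain with multiple live rule sets, payoffs, or opponent types scores strictly higher.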
This is all very confusing and exciting and messy. Next, I suggest playing Connect-3, Connect-4, Connect-5, and Connect-6, and trying to understand how cranking n up cranks up board complexity, ambiguity over the opponent’s goals, the number of plausible futures, and local and global Knightian-ness.
Thirdly, although I’ve been talking about the “value” of a position as if it’s a well-defined concept, it mostly isn’t. Stockfish’s value calculations are grounded in the likelihood of it winning from that position when playing itself. But there’s no clear way to translate from that to the likelihood of winning against one’s actual opponent, which is what we’re interested in. I won’t discuss this further here, but trying to pin down how to estimate a position’s value in that sense seems potentially fruitful.
I’ll take a shot. If E_Alice(mvs, Bob) is the expected return (1 for win, 0.5 for draw and 0 for loss) for Alice given that she’s playing Bob, she knows Bob’s source code, and the moves mvs have been played so far, then the value of a position for Alice is ∑_{p ∈ S} E_Alice(mvs, p) · P(p | mvs), where S is the set of programs that Alice’s opponent is drawn from.
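In code, that definition is a posterior-weighted average, with P(p | mvs) expanded via Bayes from a prior over programs and the likelihood of each program producing the observed moves (that expansion is my addition, not the commenter’s; all names are illustrative).

```python
def position_value(expected_return, prior, likelihood):
    """Value of a position for Alice:
        sum over candidate opponent programs p of
        E_Alice(mvs, p) * P(p | mvs),
    with P(p | mvs) proportional to prior(p) * P(mvs | p).

    All three arguments are dicts keyed by program id:
      expected_return[p] = E_Alice(mvs, p)
      prior[p]           = P(p) before seeing any moves
      likelihood[p]      = P(mvs | p), the chance p plays the observed moves
    """
    posterior = {p: prior[p] * likelihood[p] for p in prior}
    z = sum(posterior.values())
    return sum(expected_return[p] * posterior[p] / z for p in posterior)
```

For example, if the observed moves are far more likely to come from a strong program (against which Alice scores 0.2) than a weak one (against which she scores 0.9), the position’s value is pulled toward the strong-opponent outcome.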