Expected Creative Surprises

Imagine that I’m playing chess against a smarter opponent. If I could predict exactly where my opponent would move on each turn, I would automatically be at least as good a chess player as my opponent. I could just ask myself where my opponent would move, if they were in my shoes; and then make the same move myself. (In fact, to predict my opponent’s exact moves, I would need to be superhuman—I would need to predict my opponent’s exact mental processes, including their limitations and their errors. It would become a problem of psychology, rather than chess.)

So predicting an exact move is not possible, but neither is it true that I have no information about my opponent’s moves.

Personally, I am a very weak chess player—I play an average of maybe two games per year. But even if I’m playing against former world champion Garry Kasparov, there are certain things I can predict about his next move. When the game starts, I can guess that the move P-K4 is more likely than P-KN4. I can guess that if Kasparov has a move that would allow me to checkmate him on my next turn, he will not make that move.

Much less reliably, I can guess that Kasparov will not make a move that exposes his queen to capture—but here, I could be greatly surprised; there could be a rationale for a queen sacrifice which I have not seen.

And finally, of course, I can guess that Kasparov will win the game...

Supposing that Kasparov is playing black, I can guess that the final position of the chess board will lie within the class of positions that are wins for black. I cannot predict specific features of the board in detail, but I can narrow things down relative to the class of all possible ending positions.

If I play chess against a superior opponent, and I don’t know for certain where my opponent will move, I can still endeavor to produce a probability distribution that is well-calibrated—in the sense that, over the course of many games, legal moves that I label with a probability of “ten percent” are made by the opponent around 1 time in 10.
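
To make “well-calibrated” concrete, one can bucket predictions by their stated probability and check that, within each bucket, the predicted events happen about as often as the stated probability says. Here is a minimal sketch in Python; the function name and the track record below are purely illustrative.

```python
from collections import defaultdict

def calibration_table(predictions):
    """predictions: a list of (stated_probability, came_true) pairs.

    Groups predictions by stated probability and reports, for each group,
    the stated probability versus the observed frequency of the predicted
    event. For a well-calibrated forecaster, the two numbers roughly match.
    """
    buckets = defaultdict(list)
    for stated, came_true in predictions:
        buckets[round(stated, 2)].append(came_true)
    rows = []
    for stated in sorted(buckets):
        outcomes = buckets[stated]
        rows.append((stated, sum(outcomes) / len(outcomes), len(outcomes)))
    return rows

# Hypothetical track record: moves labeled "10%" should happen about 1 time in 10.
history = [(0.1, False)] * 9 + [(0.1, True)] + [(0.9, True)] * 9 + [(0.9, False)]
for stated, observed, n in calibration_table(history):
    print(f"stated {stated:.2f} -> observed {observed:.2f} over {n} predictions")
```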

You might ask: Is producing a well-calibrated distribution over Kasparov’s moves beyond my abilities as an inferior chess player?

But there is a trivial way to produce a well-calibrated probability distribution—just use the maximum-entropy distribution representing a state of total ignorance. If my opponent has 37 legal moves, I can assign a probability of 1/37 to each move. This makes me perfectly calibrated: I assigned 37 different moves a probability of 1 in 37, and exactly one of those moves will happen; so I applied the label “1 in 37” to 37 different events, and exactly one of those events will occur.
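
A sketch of that trivial strategy, with placeholder move names standing in for real legal moves:

```python
def max_entropy_distribution(legal_moves):
    """Total ignorance: assign every legal move the same probability."""
    p = 1.0 / len(legal_moves)
    return {move: p for move in legal_moves}

dist = max_entropy_distribution([f"move_{i}" for i in range(37)])
# The label "1 in 37" is applied to 37 different moves, and exactly one of
# those moves will actually be played, so the label is vindicated 1 time in 37.
print(dist["move_0"])  # 1/37, about 0.027
```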

Total ignorance is not very useful, even if you confess it honestly. So the question then becomes whether I can do better than maximum entropy. Let’s say that you and I both answer a quiz with ten yes-or-no questions. You assign probabilities of 90% to your answers, and get one answer wrong. I assign probabilities of 80% to my answers, and get two answers wrong. We are both perfectly calibrated, but you exhibited better discrimination—your answers more strongly distinguished truth from falsehood.
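
A proper scoring rule, such as the Brier score, makes this difference measurable: it rewards discrimination on top of calibration. A quick worked check of the quiz example above (a sketch, not any standard forecasting library):

```python
def brier_score(stated_probs, outcomes):
    """Mean squared difference between the stated probability that one's
    answer is correct and the actual result (1 if correct, 0 if wrong).
    Lower is better; it rewards discrimination as well as calibration."""
    return sum((p - o) ** 2 for p, o in zip(stated_probs, outcomes)) / len(outcomes)

# You: 90% on every answer, 9 of 10 correct.
your_score = brier_score([0.9] * 10, [1] * 9 + [0])
# Me: 80% on every answer, 8 of 10 correct.
my_score = brier_score([0.8] * 10, [1] * 8 + [0] * 2)

print(round(your_score, 2))  # 0.09
print(round(my_score, 2))    # 0.16 -- worse, despite equally perfect calibration
```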

Suppose that someone shows me an arbitrary chess position, and asks me: “What move would Kasparov make if he played black, starting from this position?” Since I’m not nearly as good a chess player as Kasparov, I can only weakly guess his move, and I’ll assign a non-extreme probability distribution over his possible moves. In principle I can do this for any legal chess position, though my guesses might approach maximum entropy—still, I would at least assign a lower probability to what I guessed were obviously wasteful or suicidal moves.

If you put me in a box and feed me chess positions and get probability distributions back out, then you would have—theoretically speaking—a system that produces Yudkowsky’s guess for Kasparov’s move in any chess position. We shall suppose (though it may be unlikely) that my predictions are well-calibrated, if not overwhelmingly discriminating.

Now suppose we turn “Yudkowsky’s prediction of Kasparov’s move” into an actual chess opponent, by having a computer randomly select moves with exactly the probabilities I assigned. We’ll call this system RYK, which stands for “Randomized Yudkowsky-Kasparov”, though it should really be “Random Selection from Yudkowsky’s Probability Distribution over Kasparov’s Move.”
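
Here is a minimal sketch of that randomization step, assuming the “Yudkowsky in a box” predictor is available as a function that maps a position to a probability distribution over moves; `predict_kasparov`, `toy_predictor`, and the toy probabilities are purely illustrative.

```python
import random

def ryk_move(position, predict_kasparov):
    """Randomized Yudkowsky-Kasparov: query the predictor for a probability
    distribution over Kasparov's possible moves in this position, then pick
    one move at random, weighted by exactly those probabilities."""
    distribution = predict_kasparov(position)
    moves = list(distribution)
    weights = [distribution[m] for m in moves]
    return random.choices(moves, weights=weights, k=1)[0]

# Toy single-position predictor, standing in for the real "box".
def toy_predictor(position):
    return {"P-K4": 0.40, "P-Q4": 0.35, "N-KB3": 0.20, "P-KN4": 0.05}

print(ryk_move("opening position", toy_predictor))
```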

Will RYK be as good a player as Kasparov? Of course not. Sometimes the RYK system will randomly make dreadful moves which the real-life Kasparov would never make—such as starting the game with P-KN4. I assign such moves a low probability, but sometimes the computer makes them anyway, by sheer random chance. The real Kasparov also sometimes makes moves to which I assigned a low probability, but only when the move has a better rationale than I realized—the astonishing, unanticipated queen sacrifice.

Randomized Yudkowsky-Kasparov is definitely no smarter than Yudkowsky, because RYK draws on no more chess skill than I myself possess—I build all the probability distributions myself, using only my own abilities. Actually, RYK is a far worse player than Yudkowsky. Playing for myself, I would always make the best move I could see; RYK only occasionally makes that move, since I won’t be very confident that Kasparov would choose exactly the same move I would.

Now suppose that I myself play a game of chess against the RYK system.

RYK has the odd property that, on each and every turn, my probabilistic prediction for RYK’s move is exactly the same prediction I would make if I were playing against Kasparov himself.

Nonetheless, I can easily beat RYK, where the real Kasparov would crush me like a bug.

The creative unpredictability of intelligence is not like the noisy unpredictability of a random number generator. When I play against a smarter player, I can’t predict exactly where my opponent will move against me. But I can predict the end result of my smarter opponent’s moves, which is a win for the other player. When I see the randomized opponent make a move that I assigned a tiny probability, I chuckle and rub my hands, because I think the opponent has randomly made a dreadful move and now I can win. When a superior opponent surprises me by making a move to which I assigned a tiny probability, I groan because I think the other player saw something I didn’t, and now I’m about to be swept off the board. Even though it’s exactly the same probability distribution! I can be exactly as uncertain about the actions, and yet draw very different conclusions about the eventual outcome.

(This situation is possible because I am not logically omniscient; I do not explicitly represent a joint probability distribution over entire games.)

When I play against a superior player, I can’t predict exactly where my opponent will move against me. If I could predict that, I would necessarily be at least that good at chess myself. But I can predict the consequence of the unknown move, which is a win for the other player; and the more the player’s actual action surprises me, the more confident I become of this final outcome.

The unpredictability of intelligence is a very special and unusual kind of surprise, which is not at all like noise or randomness. There is a weird balance between the unpredictability of actions and the predictability of outcomes.