Learning as you play: anthropic shadow in deadly games

Epistemic status: I’m not a mathematician, but this is simple stuff and I think I double checked all my calculations well enough. Criticism is welcome!

Some time ago a post came out claiming that “anthropic shadow is reflectively inconsistent”. I want to argue the exact opposite thesis: anthropic shadow is a trivially true phenomenon; in fact, it is just a special case of an effect that applies to a wide class of possible games, and it is fundamentally unavoidable.

To put forward a formal thesis:

Given a game that possesses the following characteristics:

  • there is some aspect of the rules that is unknown to the player, and cannot be deduced by observing other games (either because there are no other games, or because the rules are fully randomized every time);

  • said unknown aspect of the rules affects in some way the player’s ability to gather information about the game by playing, e.g. by abruptly ending it, or by obfuscating outcomes;

then the average player will not be able to play optimally based only on in-game observations.

It is pretty trivial stuff if one thinks about it; obviously, if the unknown rules mess with our ability to learn them, that reflexive quality fundamentally rigs the game. The most obvious examples of this are what I’ll call “deadly games”—namely, games in which one outcome abruptly ends the game and thus makes all your accumulated experience useless. This property of such games, however, stems from their self-reflexive nature, and it has nothing to do with death per se. If instead of dying you are kicked out of the game, the result is the same. “Anthropic undeath” doesn’t fix this either: the fundamental problem is that you start the game ignorant of some crucial information that you need to play optimally, and you can’t learn that information by playing, because any event that would give you valuable bits to update your beliefs reflexively messes with your ability to observe it. Therefore, your sampling of a game trajectory from inside the game will always be incomplete, and if information can’t be carried from one game to another, this will be a chronic, unfixable problem. You can’t play at your best because the deck is stacked, you don’t know how, and the dealer reshuffles new cards in every time.

A game of Chinese Roulette

Let’s come up with a game that we’ll call Chinese Roulette. This game is a less hardcore version of Russian Roulette: you have a six-chamber revolver, of which an unknown number of chambers $b$ are loaded. At every turn, the drum is spun and you have to fire a round; the name of the game comes from the fact that you don’t point the gun at your own head, but at an antique Chinese vase of value $V$. Every time you pull the trigger and the chamber is empty you get a reward $R$; if you do fire a bullet, though, the vase is irreparably destroyed and the game ends. At any point before pulling the trigger, however, you can choose to quit the game and take the vase as a consolation prize.

The optimal strategy for the game is simple enough. Individual rounds are entirely uncorrelated (remember, the drum is spun again before every round), so the only strategies that make sense are “keep firing for as long as possible” or “quit immediately”. If you keep firing, your expected winnings are:

$$W_{\text{play}} = R\sum_{k=1}^{\infty}\left(\frac{6-b}{6}\right)^{k} = \frac{6-b}{b}\,R.$$

If you quit, instead, you immediately get $V$. In other words, playing is only advantageous if

$$\frac{6-b}{b}\,R > V.$$

Knowing the actual number of bullets in the gun, $b$, is essential to knowing the correct strategy; without that information, we are always at risk of playing sub-optimally. If we start with a full ignorance distribution and thus assume that $b$ could take any allowed value ($1 \le b \le 5$, so that the game is neither trivially safe nor hopeless) with equal likelihood, our average expectation will be

$$\bar{W}_{\text{play}} = \frac{R}{5}\sum_{b=1}^{5}\frac{6-b}{b} = 1.74\,R,$$

and thus we should never play if $V > 1.74\,R$. But suppose that instead $R$ is high enough that we do choose to play—do we ever get a chance, by playing, to get a more accurate guess at $b$ and if necessary correct our mistake? As it turns out, no.
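For concreteness, here’s a quick numerical sketch of the above in Python (the function names are mine, and the range $1 \le b \le 5$ is the “allowed values” assumption made above):

```python
from fractions import Fraction

def expected_winnings(b, R=1):
    """Expected total reward from firing until the vase breaks,
    with b of the 6 chambers loaded: W = R * (6 - b) / b."""
    return Fraction(R) * (6 - b) / b

# Expected winnings for each allowed bullet count...
for b in range(1, 6):
    print(b, expected_winnings(b))

# ...and averaged over a flat ("full ignorance") prior on b:
avg = sum(expected_winnings(b) for b in range(1, 6)) / 5
print(avg, float(avg))  # 87/50 = 1.74, so never play if V > 1.74 R
```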

The Bayesian approach to this is obviously to form a prior belief distribution on the possible values of $b$ and then update it, as we play, with each observed sequence of outcomes to find our posterior:

$$P(b \mid \text{seq}) = \frac{P(\text{seq} \mid b)\,P(b)}{\sum_{b'} P(\text{seq} \mid b')\,P(b')}.$$

Unfortunately, if we label the possible outcomes of each pull of the trigger with either $E$ (for Empty) or $L$ (for Loaded, resulting in a shot and the vase being destroyed), the only possible sequences we can observe while still playing are $E$, $EE$, $EEE$… you get the drift. There’s no chance to see anything else; if we did, the game would be over. And if at, say, the fifth turn, there is no alternative to seeing $E$, that means that the probability of observing it, conditioned on us still being in play, is $1$, and entirely independent of $b$. No information can be derived to update our prior. Whatever our ignorance at the beginning of the game, it will never change—until it’s too late for it to be of any use. This is essentially the same as any typical anthropic shadow situation; we either never observe failure, or we observe it but have no way of leveraging that information to our benefit any more. While inside the game it’s impossible to gain information about the game; when outside the game, any information we gained is useless. Based on the values of the rewards:

  • if $V > 5R$, we can be sure that quitting is optimal;

  • if $V < R/5$, we can be sure that playing is optimal;

  • for $R/5 \le V \le 5R$, our strategy will depend entirely on our prior, and if it’s wrong, there’s no way to correct it.
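To see the “no information” argument above in action, here’s a small Monte Carlo sketch (illustrative code, with my own function names): whatever the true $b$ is, the only sequences that ever reach the eyes of a player who is still in the game are the all-$E$ ones, so the distribution of observations among survivors is identical for every $b$.

```python
import random
from collections import Counter

def observed_by_survivors(b, turns=4, trials=200_000):
    """Distribution of trigger-pull sequences seen by players still in
    play after `turns` rounds, with b of the 6 chambers loaded."""
    seen = Counter()
    for _ in range(trials):
        seq = "".join("E" if random.randrange(6) >= b else "L"
                      for _ in range(turns))
        if "L" not in seq:  # any L ends the game: survivors never see it
            seen[seq] += 1
    total = sum(seen.values())
    return {s: n / total for s, n in seen.items()}

# The filter is the anthropic shadow: survivors of a nearly-empty gun
# and of a nearly-full one have seen exactly the same thing.
print(observed_by_survivors(b=1))  # {'EEEE': 1.0}
print(observed_by_survivors(b=4))  # {'EEEE': 1.0}
```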

This frustrating situation is the purest form of anthropic shadow. But there are games in which things aren’t quite so black and white.

The only winning move is not to play—sometimes

Let’s play a more interesting version of Chinese Roulette. In this one, we fix the number of bullets in the revolver to $b = 3$, so that there’s a 50% chance of finding a loaded chamber at every turn. Bullets are replaced every turn, and they’re drawn from a bag which contains an unknown fraction of blanks; let’s call the fraction of real bullets in the bag $r$. So when we fire there are in general three possible outcomes:

  • there’s a probability of $1/2$ that the chamber is empty, and we get our reward $R$;

  • there’s a probability of $(1-r)/2$ that the gun fires a blank; we don’t get a reward, but the game continues;

  • there’s a probability of $r/2$ that the gun fires a real bullet. In that case, the game ends.

Obviously, $0 \le r \le 1$. Then the reward for playing is

$$W_{\text{play}} = \frac{R/2}{r/2} = \frac{R}{r},$$

since every turn pays out $R/2$ in expectation and the game ends with probability $r/2$.

Which means that playing is advantageous if

$$\frac{R}{r} > V, \qquad \text{i.e.} \qquad r < \frac{R}{V}.$$

Makes sense: the more blanks there are, the safer the game is, and the higher the value of the vase would have to be to justify quitting. The player can then try to form a belief on the value of $r$, updating it as the game goes, and decide turn by turn whether to continue playing or quit while still ahead. We still assume total ignorance, so at the start,

$$P(r) = 1, \qquad 0 \le r \le 1.$$
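As a sanity check on the $W_{\text{play}} = R/r$ formula, here’s a minimal simulation (names and parameters are illustrative) of a player who never quits:

```python
import random

def play_out(r, R=1.0):
    """Simulate one game of the blank-bag variant, never quitting:
    empty chamber (reward R) w.p. 1/2, real bullet (game over) w.p.
    r/2, blank (nothing happens) otherwise."""
    total = 0.0
    while True:
        u = random.random()
        if u < 0.5:
            total += R      # empty chamber
        elif u < 0.5 + r / 2:
            return total    # real bullet: the vase is gone

for r in (0.2, 0.5, 1.0):
    est = sum(play_out(r) for _ in range(100_000)) / 100_000
    print(r, est, "vs R/r =", 1 / r)
```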

Since the outcome of a non-blank loaded chamber is unobservable, there are two possible events we can observe:

  • an empty chamber, $E$, with probability $1/2$;

  • a loaded chamber with a blank, $L$, with probability $(1-r)/2$.

Then the probability of a given sequence with $\ell$ loaded chamber events in $n$ total turns, conditioned on the fact that we’re still playing at all (which, for a given $r$, has probability $(1-r/2)^n$), is

$$P(\text{seq} \mid r) = \frac{(1-r)^{\ell}/2^{n}}{(1-r/2)^{n}} = \frac{(1-r)^{\ell}}{(2-r)^{n}}.$$

We can thus calculate our posterior by performing an integral:

$$P(r \mid \text{seq}) = \frac{(1-r)^{\ell}\,(2-r)^{-n}}{\int_0^1 (1-r')^{\ell}\,(2-r')^{-n}\,dr'},$$

calculated by using (after substituting $u = 1 - r'$)

$$\int_0^1 \frac{u^{\ell}}{(1+u)^{n}}\,du = \frac{{}_2F_1(n,\,\ell+1;\,\ell+2;\,-1)}{\ell+1},$$

in which we’ve made use of ${}_2F_1$, Gauss’ hypergeometric function (by the way, check them out—hypergeometric functions are fun! At least if you’re the kind of nerd who thinks words like “hypergeometric function” and “fun” can belong in the same sentence, like I am). So the posterior is:

$$P(r \mid \text{seq}) = \frac{\ell+1}{{}_2F_1(n,\,\ell+1;\,\ell+2;\,-1)}\,\frac{(1-r)^{\ell}}{(2-r)^{n}}.$$

And it doesn’t take much to figure out that the expectation value for $r$ is

$$\langle r \rangle = 1 - \frac{(\ell+1)\,{}_2F_1(n,\,\ell+2;\,\ell+3;\,-1)}{(\ell+2)\,{}_2F_1(n,\,\ell+1;\,\ell+2;\,-1)}.$$
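If you want to play with these numbers yourself, here’s a short Python sketch of that posterior mean, using SciPy’s implementation of ${}_2F_1$ and cross-checking the closed form against direct numerical integration (the helper name expected_r is mine):

```python
from scipy.special import hyp2f1
from scipy.integrate import quad

def expected_r(n, l):
    """Posterior mean of r after surviving n turns, l of which showed
    a loaded chamber firing a blank: posterior ~ (1-r)^l / (2-r)^n."""
    return 1 - (l + 1) * hyp2f1(n, l + 2, l + 3, -1) / (
               (l + 2) * hyp2f1(n, l + 1, l + 2, -1))

# Cross-check against brute-force integration of the posterior:
n, l = 10, 3
post = lambda r: (1 - r) ** l / (2 - r) ** n
num = quad(lambda r: r * post(r), 0, 1)[0]
den = quad(post, 0, 1)[0]
print(expected_r(n, l), num / den)  # the two should agree

# All-E survivors grow sure the bullets are real (<r> -> 1), while a
# 1:1 mix of E and L suggests the bag is all blanks (<r> -> 0):
for m in (2, 4, 8, 16, 32):
    print(m, expected_r(m, 0), expected_r(m, m // 2))
```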

To see how that function evolves, here is an example of a few paths:

Expectation value of $r$ based on series of events with approximately fixed ratios (for example, the 25% $\ell/n$ series will have a repeated pattern).

Notice how sequences in which few or no loaded chambers are observed quickly converge to a very low guess on the fraction of blanks $1-r$, and vice versa; seeing very few loaded chambers with blanks (near misses) while still in play is bad news, because it suggests we’re playing under anthropic shadow, and that most loaded chambers have real bullets. Seeing an almost perfect 1:1 ratio between loaded and empty chambers, however, is reassuring, and suggests that maybe there are no real bullets at all.

Working out exact formulas for the statistics of this game is hard, but fortunately it’s not too difficult to treat it numerically in polynomial time. The game can be described as a Markov process; after $n$ turns, each surviving player will be in one of the $n+1$ states spanned by $\ell = 0, 1, \dots, n$.

The transition rules are semi-probabilistic, but can easily be split into two stages. The first stage of a turn is the trigger pull:

  • $(n,\,\ell) \to (n+1,\,\ell)$ with probability $1/2$ (empty chamber);

  • $(n,\,\ell) \to (n+1,\,\ell+1)$ with probability $(1-r)/2$ (loaded chamber, blank);

  • $(n,\,\ell) \to$ eliminated, with probability $r/2$ (loaded chamber, real bullet).

Then the still-playing players go through an additional step:

  • if their expected reward for playing, $R/\langle r \rangle$, has dropped below the value of the vase $V$, they quit and take the consolation prize.

One can then keep an appropriate tally of probabilities and rewards for each state, and grow the array of states as turns pass—until the total probability mass of players still in play falls below a certain threshold and we can declare the game solved. Using this method, I’ve produced this map of average relative rewards (average reward divided by the optimal reward for the given combination of $V/R$ and fraction of blanks) for different games.
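For illustration, here is a minimal sketch of what such a solver might look like for a single game configuration. It is not the exact code behind the maps, and it assumes (among other details of my bookkeeping) that eliminated players keep the per-turn rewards they had already collected, losing only the vase; expected_r is the posterior mean from the previous section.

```python
import numpy as np
from scipy.special import hyp2f1

def expected_r(n, l):
    """Posterior mean of r for a survivor of n turns with l blanks seen."""
    return 1 - (l + 1) * hyp2f1(n, l + 2, l + 3, -1) / (
               (l + 2) * hyp2f1(n, l + 1, l + 2, -1))

def average_reward(r, V, R=1.0, eps=1e-9):
    """Average winnings of <r>-estimating players in a game whose true
    real-bullet fraction is r (true r > 0, so the loop terminates)."""
    if R / 0.5 <= V:              # flat prior: <r> = 1/2, quit before turn one
        return V
    prob = np.array([1.0])        # prob[l]: mass of survivors with l blanks seen
    rew = np.array([0.0])         # rew[l]: rewards accumulated by that mass
    banked, n = 0.0, 0
    while prob.sum() > eps:
        n += 1
        new_p, new_w = np.zeros(n + 1), np.zeros(n + 1)
        new_p[:-1] += prob / 2                  # empty chamber: collect R
        new_w[:-1] += rew / 2 + (prob / 2) * R
        new_p[1:] += prob * (1 - r) / 2         # blank: l -> l + 1
        new_w[1:] += rew * (1 - r) / 2
        banked += (rew * (r / 2)).sum()         # real bullet: vase gone
        for l in range(n + 1):                  # quitting step
            if new_p[l] > 0 and R / expected_r(n, l) <= V:
                banked += new_w[l] + new_p[l] * V
                new_p[l], new_w[l] = 0.0, 0.0
        prob, rew = new_p, new_w
    return banked

print(average_reward(r=0.5, V=1.5))
```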

A color map of relative rewards for players using the given strategy, for different reward ratios $V/R$ and percentages of blanks. The white dashed line represents the edge between the region where quitting is optimal (below) and the one where playing is optimal (above).

The map shows the anthropic shadow effect very well. The white dashed line marks $V = R/r$, the border between the region in which quitting is optimal (below) and the one in which playing is (above). Low values of $V/R$ make quitting not worth much, and encourage playing; thus the area above is very dark red, but the one below veers towards blue. This is the region in which people keep playing despite the game being very unfair (a low or zero percentage of blanks); it is the most typical example of anthropic shadow. Most players are eliminated quickly; the ones that keep going are unable to learn enough to realize how badly the odds are stacked against them before their luck runs out. In the end, almost no one gets to do the wise thing and quit while they’re ahead.

The same problem in reverse happens in the top right corner. Here the consolation prize is high, and quitting is tempting; it takes only a slightly unlucky streak to decide to go for it. However, this is an area in which the game is very lenient, and ultimately it might be worth it to keep going for a while longer instead, so players lose out of excessive prudence. If it were their life on the line instead of a precious vase, though, at least they’d get to keep their head!

One thing that jumps to the eye is the “banded” look of the graph. There’s a particularly sharp transition around $V/R \approx 1.8$. I initially thought this might be a bug, but it turns out to be a natural feature of the system. Its origin is clear enough if we look at the map of how many players quit:

Percent of players who quit before losing, for the same games as the map above.

The banded structure is even more evident here. The reason for it is simple: updates on beliefs are performed discretely, at each turn. There is only a discrete set of possible values that the expectation $\langle r \rangle$ can take, and especially at the beginning, when $n$ and $\ell$ are small, the value can swing wildly in a single turn. The bands reflect points at which such a likely early swing, involving large fractions of the players’ population, crosses the boundary between playing and quitting. As it turns out, for $1.79 \lesssim V/R < 2$, all the players who get an empty chamber quit after the very first turn!
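That threshold is easy to check from the closed-form posterior mean: after a single empty chamber ($n = 1$, $\ell = 0$) the expectation works out to $\langle r \rangle = 2 - 1/\ln 2 \approx 0.557$, so such a player quits as soon as $V/R$ exceeds $1/\langle r \rangle \approx 1.794$ (while anyone with $V/R \ge 2$ never started playing at all). A two-line check:

```python
import math

# Posterior mean of r after one surviving turn with one empty chamber,
# and the V/R threshold at which such a player quits on the spot:
r1 = 2 - 1 / math.log(2)
print(r1, 1 / r1)  # ~0.557, ~1.794
```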

A matter of viewpoints

As I was working on this post, though, I suddenly realized something. I made my players use the strategy of estimating $\langle r \rangle$, and then using it to predict their expected winnings, so that they quit if:

$$\frac{R}{\langle r \rangle} < V.$$

But wait, someone might say! Wouldn’t the correct way to do this be to estimate your expected winnings directly? So, essentially, quit if:

$$\langle W \rangle = \int_0^1 \frac{R}{r}\,P(r \mid \text{seq})\,dr < V.$$

Well, it’s not so simple.

Mathematically, the integral above is a pain; it doesn’t converge. This is because as $r \to 0$, $R/r \to \infty$; if there’s no chance of breaking the vase, you can keep playing and winning forever. It’s also not solvable in closed form, as far as I can tell. This could be obviated by putting an upper limit (lower than $1$, but very close) on the fraction of blanks $1-r$, and then integrating numerically. It’s annoying, and it probably makes the computation longer and slightly less precise, but it’s doable.
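To make this concrete, here’s one way the cutoff version might look numerically (blank_cap and the helper name are my own illustrative choices):

```python
from scipy.integrate import quad

def expected_winnings(n, l, R=1.0, blank_cap=1 - 1e-6):
    """<R/r> under the posterior ~ (1-r)^l / (2-r)^n, integrating only
    over r >= 1 - blank_cap to tame the divergence at r = 0."""
    r_min = 1 - blank_cap
    post = lambda r: (1 - r) ** l / (2 - r) ** n
    num = quad(lambda r: (R / r) * post(r), r_min, 1, limit=200)[0]
    den = quad(post, r_min, 1, limit=200)[0]
    return num / den

# The estimate is dominated by the cutoff: the closer the cap on the
# blank fraction gets to 1, the more the sliver of "r ~ 0, play
# forever" outcomes inflates it.
for cap in (0.99, 0.9999, 1 - 1e-8):
    print(cap, expected_winnings(n=4, l=2, blank_cap=cap))
```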

Tinkering a bit with it, I realized that (unsurprisingly, given its going-to-infinity tendencies) estimating the reward in this way always tends to give higher values than the method I’ve used. This means it leads to a policy that incentivizes risk, and playing over quitting[1]. Qualitatively, I expect it would result in the map from the previous section getting a lot more blue below the white line and a lot more red above it.

And then it hit me that this isn’t just a mathematical conundrum—it’s a philosophical one. One kind of player tries to assess the probability of the events independently of how beneficial they are, and only later calculates (conservatively) how much they can get out of them. The other immediately weighs the expected benefits, to the point that the tiniest sliver of a chance of near-infinite winnings overshadows everything else and incentivizes playing. Why, I can’t think of any debate surrounding potentially existential risks going on right now that this sort of dichotomy applies to. Can you?

Conclusions

I’ve tried to demystify the concept of anthropic shadow, separating it from actual discussions of life and death. The anthropic shadow effect is just an example of a common artifact in a certain type of game: one in which we don’t know all the rules, and discovering them through play is hindered by the fact that different outcomes affect the length of the game. In these conditions, depending on the nature of the game, we’re pretty much doomed to play sub-optimally.

There are a few ways to escape, or at least defend ourselves from, the trap. Leveraging at least part of the information we gather is possible, granted that we know some of the probabilities involved in the game in absolute terms (as our players knew the chance of finding a loaded chamber). Having a better-than-complete-ignorance prior helps too. In both cases, we need information from a source other than playing the game, some knowledge to ground ourselves. Simple statistical observation won’t cut it; we need mechanistic knowledge that lets us predict at least some of the probabilities from first principles, so that they are independent of the phenomenon that hides some outcomes from our view.

This doesn’t solve our problems. But now you know. And knowing, it turns out, is an as-of-yet unspecified fraction of the battle.

  1. ^

    Actually, this only happens if the upper limit on $1-r$ is very close to $1$; if the cutoff is appreciably lower, then it’s another story. But that’s a completely different game too.