Conditioning on Observers

Response to: Beauty quips, “I’d shut up and multiply!”

Related to: The Presumptuous Philosopher’s Presumptuous Friend, The Absent-Minded Driver, Sleeping Beauty gets counterfactually mugged

This is somewhat introductory. Observers play a vital role in the classic anthropic thought experiments, most notably the Sleeping Beauty and Presumptuous Philosopher gedankens. Specifically, it is remarkably common to condition simply on the existence of an observer, in spite of the continuity problems this raises. The confusion appears to stem from the distinction between the probability of there being an observer and the expected number of observers, with the former, unlike the latter, not being a linear function of the problem’s structure.

There is a related difference between the expected gain of a problem and the expected gain per decision, which has been exploited in more complex counterfactual mugging scenarios. As in the case of the 1/2 or 1/3 confusion, the issue is the number of decisions that are expected to be made, and recasting problems so that there is at most one decision provides a clear intuition pump.

Sleeping Beauty

In the classic Sleeping Beauty problem, experimenters flip a fair coin on Sunday and sedate you. If the coin came up heads, they wake you on just the following Monday; if tails, they wake you on both the following Monday and Tuesday, inducing amnesia in between. Each time you are woken, you are asked for your credence that the coin came up heads.

The standard answers to this question are 1/2 and 1/3. For convenience, let W be the event of being woken, H the event that the coin flip came up heads, and T the event that it came up tails. The basic logic for the 1/2 argument is that:

P(H) = P(T) = 1/2 and P(W|H) = P(W|T) = P(W) = 1, so by Bayes’ rule P(H|W) = 1/2

The obvious issue to take with this approach is one of continuity. The assessment is independent of the number of times you are woken in each branch, which implies that every branch with a non-zero number of observers has its posterior probability equal to its prior probability. Clearly the subjective probability of a zero-observer branch is zero, so this implies a discontinuity in the decision theory. Whilst not fatal in and of itself, it is surprising. There is apparent secondary confusion over the number of observations in the Sleeping Beauty problem, for example:

If we want to replicate the situation 1000 times, we shouldn’t end up with 1500 observations. The correct way to replicate the awakening decision is to use the probability tree I included above. You’d end up with expected cell counts of 500, 250, 250, instead of 500, 500, 500.

Under these numbers, the 1000 observations made would have required 500 heads and 250 tails, as each tail produces an observation on both Monday and Tuesday. This is not the behaviour of a fair coin. Further consideration of the problem shows that the number of observations enters exactly where the naive conditioning on W was applied. Hence in 900 observations there would be 300 heads and 300 tails, with 600 observations following a tail and 300 following a head. To make this rigorous, let Monday and Tuesday be the events of being woken on Monday and on Tuesday respectively. Then:

P(H|Monday) = 1/2, P(Monday|W) = 2/3 (P(Monday|W) = 2·P(Tuesday|W), as Monday occurs regardless of the coin flip)

P(H|W) = P(H ∩ Monday|W) + P(H ∩ Tuesday|W) (Total Probability)

= P(H|Monday ∩ W)·P(Monday|W) + 0 (as P(Tuesday|H) = 0)

= P(H|Monday)·P(Monday|W) = 1/3 (as Monday ∩ W = Monday)

This would appear to support the view of updating on existence. The question of why this holds is immediate to answer: the only day on which the probability of heads is non-zero is Monday, and given an awakening it is not guaranteed to be Monday. This should not be confused with the correct observation that there is always exactly one awakening on Monday. The trouble arises because “awakening” is not an event which occurs at most once in each branch. Indeed, taking the 1/3 answer and working back to find P(W) yields P(W) = 3/2, which is a strong indication that it is not the probability of W that matters, but the expected number of instances of W, E(#W). A direct simulation, sketched below, makes both points concrete.
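Here is a minimal Monte Carlo sketch in Python (the run count and names are arbitrary choices, not part of the original problem) that samples awakenings directly:

```python
import random

N = 100_000        # arbitrary number of runs of the experiment
heads_obs = 0      # awakenings following a heads flip
monday_obs = 0     # awakenings occurring on a Monday
total_obs = 0      # all awakenings, i.e. instances of W

for _ in range(N):
    if random.random() < 0.5:   # heads: woken on Monday only
        heads_obs += 1
        monday_obs += 1
        total_obs += 1
    else:                       # tails: woken on Monday and Tuesday
        monday_obs += 1
        total_obs += 2

print("E(#W)       :", total_obs / N)            # ~1.5, i.e. 3/2
print("P(Monday|W) :", monday_obs / total_obs)   # ~0.667, i.e. 2/3
print("P(H|W)      :", heads_obs / total_obs)    # ~0.333, i.e. 1/3
```

As intuition pumps, we can consider some related problems.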

Sleeping Twins

This experiment features Omega. It announces that it will place you and an identical copy of you in identical rooms, sedated. It will then flip a fair coin. If the coin comes up heads, it will wake one of you at random. If it comes up tails, it will wake both of you. It will then ask you for your credence that the coin came up heads.

You wake up in a nondescript room. What is your credence?

It is clear from the structure of this problem that it is almost identical to the Sleeping Beauty problem. It is also clear that your subjective probability of being woken is 1/2 if the coin comes up heads and 1 if it comes up tails, so conditioning on the fact that you have been woken, the coin came up heads with probability 1/3. Why is this so different from the Sleeping Beauty problem? The fundamental difference is that in the Sleeping Twins problem you are woken at most once, and possibly not at all, whereas in the Sleeping Beauty problem you are woken once or many times. On the other hand, the number of observer moments on each branch of the experiment is equal to that of the Sleeping Beauty problem, so it is odd that the manner in which these observations are achieved should matter. Clearly information flow is not possible, as provided for by amnesia in the original problem.
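A minimal sketch of the twins protocol, under the assumption that we track one fixed copy of the pair (the run count is again arbitrary):

```python
import random

N = 100_000      # arbitrary number of trials
woken = 0        # trials in which the tracked copy is woken
woken_heads = 0  # of those, trials in which the coin was heads

for _ in range(N):
    heads = random.random() < 0.5
    if heads:
        # Heads: one of the two copies is woken at random,
        # so the tracked copy wakes with probability 1/2.
        if random.random() < 0.5:
            woken += 1
            woken_heads += 1
    else:
        # Tails: both copies are woken.
        woken += 1

print("P(H|woken):", woken_heads / woken)   # ~0.333, i.e. 1/3
```

Let us drive this further.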

Probabilistic Sleeping Beauty

We return to the experimenters and a new protocol. The experimenters fix a constant k in {1, 2, …, 20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on day k. If the coin comes up heads and the D20 comes up k, they will wake you on day 1. In either case they will ask you for your credence that the coin came up heads.

You wake up. What is your credence?

In this problem, the multiple distinct copies of you have been removed, at the cost of an explicit randomiser. It is clear that the structure of the problem is independent of the specific value of the constant k. It is also clear that, updating on being woken, the probability that the coin came up heads is 1/21 regardless of k. This is troubling for the 1/2 answer, however, as playing this game with a single die roll and all possible values of k simultaneously recovers the Sleeping Beauty problem (modulo induced amnesia, and with twenty tails awakenings rather than two). Again, having reduced the expected number of observations to lie in [0,1], intuition and calculation agree in implying a reduced chance for the heads branch conditioned on being woken.
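The same style of check applies; a sketch with an arbitrary choice of k:

```python
import random

N = 1_000_000        # arbitrary number of trials
k = 7                # any fixed constant in {1, ..., 20}; the answer is the same
woken = 0
woken_heads = 0

for _ in range(N):
    heads = random.random() < 0.5
    d20 = random.randint(1, 20)
    if not heads:                # tails: woken on day k regardless of the die
        woken += 1
    elif d20 == k:               # heads: woken (on day 1) only if the die shows k
        woken += 1
        woken_heads += 1

print("P(H|woken):", woken_heads / woken)   # ~0.0476, i.e. 1/21
```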

This further suggests that the misunderstanding in Sleeping Beauty is one of naively looking at P(W|H) and P(W|T), when the relevant quantities are the expected numbers of wakings: E(#W|H) = 1 and E(#W|T) = 2.

The Apparent Solution

If we allow conditioning on the number of observers, we correctly calculate probabilities in the Sleeping Twins and Probabilistic Sleeping Beauty problems. It is correctly noted that a “single paying” bet in Sleeping Beauty is accepted at odds of 2; this follows naturally under the following decision schema: “if it is your last day awake the decision is binding, otherwise it is not”. Let L be the event of being on the last day awake. Then:

P(L|W ∩ T) = 1/2, P(L|W ∩ H) = 1, and the bet pays k for a cost of 1

E(Gains|Taking the bet) = (k-1)·P(L|W ∩ H)·P(H|W) - P(L|W ∩ T)·P(T|W) = (k-1)·P(H|W) - P(T|W)/2

For odds of 2 to be the break-even point requires (2-1)·P(H|W) - P(T|W)/2 = 0, i.e. 2·P(H|W) = P(T|W), which contradicts the 1/2 solution. The 1/3 solution, on the other hand, works as expected. Trivially, the same result holds if the choice of binding decision is randomised. In general, if a decision is made by a collective of additional observers in identical states to you, then the existence of the additional observers does not change the overall payoffs. This can be modelled either by splitting payoffs between all decision makers in a group making identical decisions, or equivalently by calculating as if there is a 1/N chance that you dictate the decision for everyone, given N identical instances of you (“evenly distributed dictators”). To do otherwise leads to fallacious expected gains, as exploited in Sleeping Beauty gets counterfactually mugged. Of course, if the gains are linear in the number of observers, then this cancels with the division of responsibility and the observer count can be neglected, as in accepting 1/3 bets per observer in Sleeping Beauty.
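The break-even claim can be checked numerically. A minimal sketch, evaluating the expected-gain formula above under both candidate values of P(H|W) (the function name is illustrative):

```python
# Expected gain from the "last day binding" bet paying k at a cost of 1,
# using P(L|W ∩ H) = 1 and P(L|W ∩ T) = 1/2 from above.
def expected_gain(k, p_heads):
    p_tails = 1 - p_heads
    return (k - 1) * 1.0 * p_heads - 0.5 * p_tails

for k in (1.5, 2.0, 2.5):
    print(k, expected_gain(k, 1 / 3), expected_gain(k, 1 / 2))
# Under P(H|W) = 1/3 the bet breaks even exactly at odds of 2, as observed;
# under P(H|W) = 1/2 it would instead break even at odds of 1.5.
```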

The Absent-Minded Driver

If we consider the problem of The Absent-Minded Driver, we are faced with another scenario in which, depending on the decisions made, there are varying numbers of observer moments. This allows an apparent time inconsistency to appear, much as in Sleeping Beauty. The problem is as follows:

You are a mildly amnesiac driver on a motorway. You notice approaching junctions but recall nothing. There are 2 junctions. If you turn off at the first, you gain nothing. If you turn off at the second, you gain 4. If you continue past the second, you gain 1.

Analysis of the problem shows that if p is the probability of going forward (constant, care of the amnesia), the expected payout is p[p+4(1-p)], maximised at p = 2/3. However, once on the road and approaching a junction, let the probability that you are approaching the first be α. The expected gain is then claimed to be α·p[p+4(1-p)] + (1-α)·[p+4(1-p)], which is not maximised at 2/3 unless α = 1. It can immediately be noticed that, given p, α = 1/(p+1). However, this is still not correct.
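A coarse grid search (function names and grid resolution are illustrative) exhibits the apparent inconsistency numerically: the planning payoff peaks at p = 2/3, while the naive expression with α = 1/(p+1) substituted peaks elsewhere.

```python
def planning(p):
    # Planning payoff p[p + 4(1-p)]
    return p * (p + 4 * (1 - p))

def naive(p):
    # Naive per-junction expression with alpha = 1/(p+1) substituted
    alpha = 1 / (p + 1)
    return alpha * planning(p) + (1 - alpha) * (p + 4 * (1 - p))

grid = [i / 1000 for i in range(1001)]
print(max(grid, key=planning))   # 0.667, i.e. p = 2/3
print(max(grid, key=naive))      # ~0.53, a different optimum, hence the puzzle
```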

Instead, we can observe that all non-zero payouts are the result of two decisions, one at the first junction and one at the second. Let A be the state of being at the first junction, and B the state of being at the second. We observe that:

E(Gains due to one decision|A) = 1·(1-p)·0 + (1/2)·p·[p+4(1-p)]

E(Gains due to one decision|B) = (1/2)·[p+4(1-p)]

P(A|W) = 1/(p+1), P(B|W) = p/(p+1), E(#A) = 1, E(#B) = p (#A and #B independent of everything else)

Hence the expected gain per decision:

E(Gains due to one decision|W) = [1·(1-p)·0 + (1/2)·p·[p+4(1-p)]]/(p+1) + (1/2)·[p+4(1-p)]·p/(p+1) = [p+4(1-p)]·p/(p+1)

But, as has already been observed, in this case the number of decisions made depends on p, and thus

E(Gains|W) = [p+4(1-p)]·p, which is the correct metric, obtained by multiplying the per-decision gain by the expected number of decisions, 1+p. Observe also that the expected gains attributable to the decisions at each junction are equal: E(Gains|A) = E(Gains|B) = p[p+4(1-p)]/2.

As a result, there is no temporal inconsistency in this problem; the approach of counting up over all observer moments, and splitting outcomes due to a set of decisions across the relevant decisions, is consistent.
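The bookkeeping can be verified mechanically. A minimal sketch (names illustrative) confirming that the per-decision gain, multiplied by the expected number of decisions 1+p, recovers the planning payoff for any p:

```python
def per_decision_gain(p):
    gain_a = 1 * (1 - p) * 0 + 0.5 * p * (p + 4 * (1 - p))   # decision at A
    gain_b = 0.5 * (p + 4 * (1 - p))                         # decision at B
    # Weight by P(A|W) = 1/(p+1) and P(B|W) = p/(p+1)
    return gain_a * (1 / (p + 1)) + gain_b * (p / (p + 1))

for p in (0.3, 0.5, 2 / 3, 0.9):
    total = per_decision_gain(p) * (1 + p)   # E(#decisions) = 1 + p
    assert abs(total - p * (p + 4 * (1 - p))) < 1e-12
    print(p, total)   # equals p[p + 4(1-p)], maximised at p = 2/3
```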

Sleeping Beauty gets Counterfactually Mugged

In this problem, the Sleeping Beauty problem is combined with a counterfactual mugging. If Omega flips heads, it simulates you, and if you would give it $100 it gives you $260. If it flips tails, it asks you for $100; if you give the money, it induces amnesia and asks again the next day. If, on the other hand, it flips tails and you refuse to give it money, it gives you $50.

Hence precommitting to give the money nets $30 on average, whilst precommitting not to nets $25 on average. However, since you make exactly 1 decision on either branch if you refuse, whilst you make 3 decisions every two plays if you give Omega money, per decision you make $25 from refusing and $20 from giving (obtained via spreading gains over identical instances of you). Hence correct play depends on whether Omega ensures you a consistent number of decisions or of plays of the whole scenario. Given a fixed number of plays of the complete scenario, we thus have to remember to account for the increased number of decisions made in one branch of possible play. In this sense it is identical to the Absent-Minded Driver, in that the number of decisions is a function of your early decisions, and so must be brought in as a factor in the expected gains.

Alternately, from a more timeless view, we can note that your decisions in the system are perfectly correlated; there is thus a single decision made by you, to give money or not. A decision to give money nets $30 on average, whilst a decision not to nets only $25; the fact that the former is spread across multiple correlated decision points is irrelevant. Equivalently, conditional on choosing to give money there is a 1/2 chance of there being a second decision, so the expected gain per play is 1.5 × $20 = $30 rather than $20.
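The dollar arithmetic of both accountings is easily verified; a minimal sketch, assuming the payoffs stated above:

```python
# Per-play expected values for the two precommitments.
give_per_play = 0.5 * 260 + 0.5 * (-100 - 100)   # heads: reward; tails: pay twice
refuse_per_play = 0.5 * 0 + 0.5 * 50             # heads: nothing; tails: $50
print(give_per_play, refuse_per_play)            # 30.0 vs 25.0 per play

# Per-decision accounting: giving makes 1.5 decisions per play on average.
give_decisions = 0.5 * 1 + 0.5 * 2
print(give_per_play / give_decisions,            # 20.0 per decision for giving
      refuse_per_play / 1.0)                     # 25.0 per decision for refusing
```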

Conclusion

The approach of updating on the number of observer moments is comparable to UDT and other timeless approaches to decision theory; it does not care how the observers come to be, be it a single amnesiac patient over a long period or a series of parallel copies or simulations. All that matters is that they are forced to make decisions.

In cases where a number of decisions are discarded, splitting the payouts over the decisions, or equivalently remembering the need for your decision not to be ignored, yields sane answers. This can also be considered as spreading a single pertinent decision out over some larger number of irrelevant choices.

Correlated decisions are not so easy; care must be taken when the number of decisions is dependent on behaviour.

In short, the 1/3 answer to Sleeping Beauty would appear to be fundamentally correct. Defences of the 1/2 answer run into the problem that expected numbers of observer moments lie outside [0,1] and thus are not probabilities; this is the underlying danger. Use of anthropic or self-indication probabilities yields sane answers in the problems considered, and can cogently answer the typical questions designed to elicit a non-anthropic intuition.