Sleeping Beauty Not Resolved

Ksvanhorn recently suggested that Radford Neal provides the solution to the Sleeping Beauty problem and that the current solutions are wrong. The consensus seems to be that the maths checks out, yet there are strong suspicions that something fishy is going on without people being able to fully articulate the issues.

My position is that Neal/Ksvanhorn make some good critiques of existing solutions, but their solution falls short. Specifically, I’m not going to dispute Neal’s calculation, just that the probability he calculates completely misses anything that this problem may reasonably be trying to get at. Part of this is related to the classic problem, “If I have two sons and at least one of them is born on a Tuesday, what is the chance that both are born on the same day”. This leads me to strongly disfavour the way Neal extends the word probability to cover these cases, though of course I can’t actually say that he is wrong in any objective sense since words don’t have objective meanings.

A key part of his argument can be paraphrased as a suggestion to Shut up and multiply since verbal arguments about probability have a strong tendency to be misleading. I’ve run into these issues with the slipperiness of words myself, but at the same time, we need verbal arguments to decide why it is that we should construct the formalism a particular way. Further, my interpretation of Shut up and multiply has always been that we shouldn’t engage in moral grandstanding by substituting emotions for logic. I didn’t take it to mean that when presented with a mathematical model that seems to produce sketchy results we should accept it unquestioningly, without taking the time to understand the assumptions behind it or its intended purpose. Indeed, we’ve been told to shut up and multiply for Sleeping Beauty before, and that came to a different conclusion.

What is Neal actually doing

Unfortunately, Ksvanhorn’s post jumps straight into the maths and doesn’t provide any explanation of what is going on. This makes it somewhat harder to critique, as it means that you don’t just need the ability to follow the maths, but also the ability to figure out the actual motivation behind all of this.

Neal wants us to condition on all information, including the apparently random experiences that Sleeping Beauty will undergo before they answer the interview question. This information seems irrelevant, but Neal argues that if it were irrelevant, it wouldn’t affect the calculation. If, contrary to expectations, it actually does, then Neal would suggest that we were wrong about its irrelevance. On the other hand, I would suggest that this is a massive red flag that suggests that we don’t actually know what it is that we are calculating, as we will see in a moment.

Let S refer to experiencing a particular sequence of sensations, starting with waking up and ending with being interviewed. Neal’s strategy is to calculate the probability of S given heads and the probability of S given tails. If we are woken twice, we only need to observe S on at least one of the days for it to count and if we observe S on both days, it still only counts once. Neal uses Bayes’ Rule on the intermediate probabilities to discover the probability of heads vs. tails. Notice how incredibly simple this process was to describe in words. This is one of those situations where presenting the formalisms without an intuitive explanation of what is happening makes it much harder to understand.

Calculations

I aim to show that intermediate probabilities are mostly irrelevant. In order to do so, we will assume that there are three bits of information after awakening and before the interview (this implies 8 possibilities). Let’s suppose Sleeping Beauty awakes and then observes the sequence 111. Neal notes that in the heads case, the chance of Sleeping Beauty observing this at least once is 1/8. In the tails case, assuming independence, we get a probability of 1/8 + 1/8 - 1/64 = 15/64 (or almost 1/4). As the number of possibilities approaches infinity, the ratio of the two probabilities approaches 1:2, which leads to slightly more than a 1/3 chance of heads after we perform the Bayesian update (see Ksvanhorn’s post for the maths). If we ensure that the experience stream of the second awakening never matches that of the first, we get a 2/8 chance of observing 111 in one of the two streams, which eventually leads to exactly a 1/3 chance. On the other hand, if the experience stream is always the same on both awakenings, we get a 1/8 chance of observing 111. This provides a ratio of 1:1, which leads to a 50% chance of heads.
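The three cases above can be checked with a few lines of exact arithmetic. This is only a sketch of the calculation as I have described it, not Neal’s or Ksvanhorn’s own code; the function names and the equal prior on heads and tails are my assumptions.

```python
from fractions import Fraction


def posterior_heads(p_s_heads, p_s_tails, prior_heads=Fraction(1, 2)):
    """Bayes' rule for P(heads | S observed at least once)."""
    numerator = prior_heads * p_s_heads
    return numerator / (numerator + (1 - prior_heads) * p_s_tails)


def scenarios(n):
    """Posterior for heads with n equally likely experience streams.

    Heads means one awakening, tails means two. The three keys are the
    three assumptions about how the two tails awakenings relate.
    """
    p = Fraction(1, n)
    return {
        # Streams drawn independently: P(at least once) = p + p - p*p.
        "independent": posterior_heads(p, 2 * p - p * p),
        # Second stream never repeats the first: P(at least once) = 2p.
        "never_match": posterior_heads(p, 2 * p),
        # Second stream always repeats the first: P(at least once) = p.
        "always_same": posterior_heads(p, p),
    }


for name, value in scenarios(8).items():
    print(name, value)
```

With three bits (n = 8) the independent case comes out a little above 1/3, and it moves towards 1/3 as n grows, while the other two cases give exactly 1/3 and 1/2 for any n of at least 2.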

All this maths is correct, but why do we care about these odds? It is indeed true that if you had pre-committed at the start to guess if and only if you experienced the sequence 111, then the odds of the coin being heads would be as above. This would be also true if you made the same commitment for 000 instead; or 100; or any sequence.

However, let’s suppose you picked two sequences, 000 and 001, and pre-committed to guess if you saw either of those sequences. Then the probability of guessing if tails occurs and the observations are independent would become: 1/4 + 1/4 - 1/16 = 7/16. This would lead the heads-to-tails probability ratio to become 4:7. Now, the other two probabilities (always different, always the same) remain the same, but the point is that the probability of heads depends on the number of sequences you pre-commit to guess. If you pre-committed to guess regardless of the sequence, then the probability becomes 1/2.
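To make the dependence on the pre-commitment explicit, here is a small sketch for the independent case. The function name and the parameters k (number of sequences you commit to) and n (number of possible sequences) are mine, and equal priors on heads and tails are assumed so that they cancel.

```python
from fractions import Fraction


def posterior_heads_precommit(k, n):
    """P(heads | at least one awakening shows one of k chosen sequences).

    Assumes n equally likely sequences, independent streams on the two
    tails awakenings, and a fair coin.
    """
    p = Fraction(k, n)
    p_guess_heads = p                 # a single awakening
    p_guess_tails = 1 - (1 - p) ** 2  # at least one of two awakenings
    return p_guess_heads / (p_guess_heads + p_guess_tails)


for k in (1, 2, 4, 8):
    print(k, posterior_heads_precommit(k, 8))
```

Committing to two of the eight sequences gives odds of 4:7, that is a posterior of 4/11, and committing to all eight gives 1/2, so the answer slides between roughly 1/3 and 1/2 purely as a function of how many sequences you chose in advance.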

Moving back to the original problem, suppose you wake up and observe 111. Why do we care about the odds of heads if you had pre-committed to only answering on observing 111, given that you didn’t pre-commit to this at all? Further, there’s no reason why you couldn’t, for example, decide in advance to ignore the last bit and pre-commit to guess if the first two were 11. Why must you pre-commit utilising all of the available randomness? Being able to manipulate your effective odds in this way by making such pre-commitments is a neat trick, but it doesn’t directly answer the question asked. Sure, Ksvanhorn was able to massage this probability to produce the correct betting decisions, but both the halfer and thirder solutions can achieve this much more easily.

Updating on a random bit of information

@travisrm89 wrote:

How can receiving a random bit cause Beauty to update her probability, as in the case where Beauty is an AI? If Beauty already knows that she will update her probability no matter what bit she receives, then shouldn’t she already update her probability before receiving the bit?

Ksvanhorn responds by pointing out that this assumes that the probabilities add to one, while we are considering the probability of observing a particular sequence at least once, so these probabilities overlap.

This doesn’t really clarify what is going on, but I think that we can clarify this by first looking at the following classical probability problem:

A man has two sons. What is the chance that both of them are born on the same day if at least one of them is born on a Tuesday?

(Clarifying in response to comments: the Tuesday problem is ambiguous and the answer is either 1/13 or 1/7 depending on interpretation. I’m not disputing this.)

Most people expect the answer to be 1/7, but the usual answer is that 13/49 of the possibilities have at least one born on a Tuesday and 1/49 have both born on Tuesday, so the chance is 1/13. Notice that if we had been told, for example, that one of them was born on a Wednesday we would have updated to 1/13 as well. So our odds can always update in the same way on a random piece of information if the possibilities referred to aren’t exclusive, as Ksvanhorn claims.
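The "at least one" reading can be verified by brute-force enumeration of the 49 equally likely pairs of birth days. This is just an illustrative sketch; the function name and the numbering of days from 0 to 6 are arbitrary choices of mine.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)  # which number stands for Tuesday does not matter


def same_day_given_at_least_one(day):
    """P(both sons born on the same day | at least one born on `day`)."""
    pairs = [(a, b) for a, b in product(DAYS, repeat=2) if day in (a, b)]
    both = [pair for pair in pairs if pair[0] == pair[1]]
    return Fraction(len(both), len(pairs))


# The update is the same whichever day we are told about.
print([same_day_given_at_least_one(day) for day in DAYS])
```

Every day gives 1/13, which is the sense in which we can update the same way on any of the seven possible statements.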

However, consider the following similar problem:

A man has two sons. We ask one of them at random which day they were born and they tell us Tuesday. What is the chance that they are both born on the same day?

Here the answer is 1/7 as we’ve been given no information about when the other child was born. When Sleeping Beauty wakes up and observes a sequence, they are learning that this sequence occurs on a random day out of those days when they are awake. This probability is 1/n where n is the number of possibilities. This is distinct from learning that the sequence occurs in at least one wakeup, just like learning that a random child is born on a Tuesday is different from learning that at least one child was born on a Tuesday. So Ksvanhorn has calculated the wrong thing.
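The "ask a random son" reading can be enumerated in the same way, this time over which son was asked as well as the two birth days. Again this is only a sketch under my own naming, assuming the son is picked with equal probability and reports his day truthfully.

```python
from fractions import Fraction
from itertools import product


def same_day_given_random_son_says(day, days=7):
    """P(both born on the same day | a randomly chosen son reports `day`)."""
    matching = total = 0
    # Outcomes: (day of son 0, day of son 1, which son we asked), all equally likely.
    for a, b, asked in product(range(days), range(days), range(2)):
        if (a, b)[asked] != day:
            continue  # the son we asked did not say `day`
        total += 1
        matching += (a == b)
    return Fraction(matching, total)


print(same_day_given_random_son_says(1))
```

Here the pair where both sons share the reported day is counted twice (once for each son we might have asked), which is what takes the answer from 1/13 to 1/7.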

What does this mean?

Perhaps this still indicates a limitation on the thirders’ attempts to define a notion of subjective probability? If we define probability in terms of bets, then this effect is mostly irrelevant. It only occurs when multiple guesses are collapsed down to one guess, but how often will a situation involving completely isolated situations be scored in a combined manner?

On the other hand, what does this mean for the halfers’ notion of probability where we normalise multiple guesses? Well, it shows that we can manipulate the effective probability of heads vs. tails via only guessing in particular circumstances; however, the effect is purely a result of controlling how many times we guess correctly on both days so that they only count once. Further, there are many situations where we make the correct guess on Monday, then refuse to guess on Tuesday or vice versa.

These kinds of situations fit quite awkwardly into probability theory and it seems much more logical to consider handling them in the decision theory instead.

More on the 1/3 solution

Neal is correct to point out that epistemic probability theory doesn’t contain a concept of “now”, so we either need to eliminate it (such as by using indexicals) or utilise an extension of standard probability theory. Neal is correct that most 1/3 answers skip over this work and that this work is necessary for a formal proof. I can imagine constructing a “consciousness-state centred” probability, which handles things like repeated awakenings or duplicates. I won’t attempt to do so in this post, but I believe that such a theory is worth pursuing.

Of course, finding a useful theory of probability that covers such situations wouldn’t mean that the answer would objectively be 1/3, just that there is a notion of probability where this is the answer.

Neal is also right to point out that instead of updating on new information, the thirders are tossing out one model and utilising a new model. However, if we constructed a consciousness-state centred probability it would be reasonable to update based on a change of which consciousness-states are considered possibilities.

More on the 1/2 solution

Standard probability theory doesn’t handle being asked multiple times (it doesn’t even handle indexicals). One of the easiest ways to support this is to normalise multiple queries. For example, if we ask you twice whether you see a cat and you expect to see one 1.2 times on average, we can normalise the probability of seeing a cat to 0.6 per query by multiplying by 0.5. If we run the Sleeping Beauty problem twice, you should expect to see one heads awakening with a weighting of 1 and two tails awakenings with weightings of 0.5 each. This provides a 50% chance of heads and a 50% chance of tails for one flip. Obviously, it would be a bit of work proving that certain standard theorems still hold, but this is a much more logical way to extend probability theory than the manner proposed.
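As a rough sketch of the normalisation I have in mind (not a worked-out theory), each awakening can be given a weight equal to the prior of its coin outcome divided by the number of awakenings under that outcome. The function names and the dictionary layout are purely illustrative.

```python
from fractions import Fraction


def normalised_probability(expected_count, queries):
    """Spread an expected count evenly over the number of queries."""
    return expected_count / queries


def halfer_weights():
    """Weight of each (coin, awakening) pair when repeated queries are normalised."""
    awakenings = {"heads": ["Monday"], "tails": ["Monday", "Tuesday"]}
    prior = Fraction(1, 2)  # fair coin
    return {
        (coin, day): prior / len(days)
        for coin, days in awakenings.items()
        for day in days
    }


weights = halfer_weights()
p_heads = sum(w for (coin, _), w in weights.items() if coin == "heads")
print(normalised_probability(Fraction(6, 5), 2), p_heads)
```

The cat example comes out as 3/5 per query, and the weights for the heads awakening sum to 1/2, matching the halfer answer; counting every awakening equally instead would give the thirder’s 1/3.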

But beyond this, if we want to pre-commit to guess on observing particular sequences of experiences, the logical choice is to pre-commit to guess on all such sequences. This then leads to the answer of a 1/2 chance of heads if we follow Neal and collapse multiple matches into one.

Again, none of this is objective, but it all comes down to how we choose to extend classical probability theory.

Is betting a red herring?

As good as If a tree falls on Sleeping Beauty is as an article, I agree with Neal that if we merely look at bets, we haven’t reached the root of the issue. When people propose using a particular betting scheme, that scheme didn’t come out of nowhere. The betting scheme was crafted to satisfy certain properties or axioms. These axioms are the root of the issue. Here, the conflict is between counting repeated queries only once or counting them separately. Once we’ve chosen which one of these we want to include with our other axioms, the betting scheme (or rather the set of consistent betting schemes) follows. So Ksvanhorn is correct that current solutions on Less Wrong haven’t dotted all of their i’s and crossed all of their t’s. Whether this matters depends on how much you care about certainty. Again, I won’t attempt to pursue this approach in this post.

Conclusion

We’ve seen that behind all of the maths, Neal is actually performing quite a simple operation and it has very little relation to anything that we are interested in. On the other hand, the critiques of current solutions are worth taking to heart. They don’t imply that these solutions are necessarily wrong, just that they aren’t formal proofs. Overall, I believe that both 1/2 and 1/3 are valid answers depending on exactly what the question is, although I have not embarked on the quest of establishing a formal footing in this post.