So what does Bayes’ theorem tell us about the Sleeping Beauty case?
It says that P(B|AC) = P(B|C) * P(A|BC)/P(A|C). In this case C is Sleeping Beauty’s information before she wakes up, which of course conditions all of the probabilities. A is the “anthropic information” of waking up and learning that what used to be “AND” things are now mutually exclusive things. B is the coin landing tails.
Bayes’ theorem actually appears to break down here, if we use the simple interpretation of P(A) as “the probability she wakes up.” Because Sleeping Beauty wakes up in all the worlds, this interpretation says P(A|C) = 1, and P(A|BC) = 1, and so learning A can’t change anything.
This is very odd, and is an interesting problem with anthropics (see Eliezer’s post “The Anthropic Trilemma”). The practical but difficult-to-justify way to fix it is to use frequencies, not probabilities—because she can have an average frequency of waking up of 2 or 3⁄2, while probabilities can’t go above 1.
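To make the frequency fix concrete, here is a minimal simulation sketch (my own illustration, not from the original discussion): tails means Beauty is woken twice, heads once, so the average number of awakenings per run comes out near 3⁄2, and the fraction of awakenings that occur in tails-worlds comes out near 2⁄3—the “thirder” answer.

```python
import random

def simulate(n_trials=100_000, seed=0):
    """Run the Sleeping Beauty experiment many times and count awakenings.

    Tails -> Beauty is woken twice (Monday and Tuesday);
    heads -> she is woken once (Monday only).
    """
    rng = random.Random(seed)
    total_awakenings = 0
    tails_awakenings = 0
    for _ in range(n_trials):
        tails = rng.random() < 0.5
        wakings = 2 if tails else 1
        total_awakenings += wakings
        if tails:
            tails_awakenings += wakings
    avg_wakings = total_awakenings / n_trials              # ~3/2, exceeds 1
    tails_fraction = tails_awakenings / total_awakenings   # ~2/3
    return avg_wakings, tails_fraction
```

The average of 3⁄2 is exactly the quantity that can’t be read as a probability, which is the point being made above.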
But the major lesson is that you have to be careful about applying Bayes’ rule in this sort of situation—if you use P(A) in the calculation, you’ll get this problem.
Anyhow, only some of this is a response to anything you wrote, I just felt like finishing my line of thought :P Maybe I should solve this...
Thanks… whatever the correct resolution is, violating Bayes’s Theorem seems a bit drastic!
My suspicion is that A contains indexical evidence (summarized as something like “I have just woken up as Beauty, and remember going to sleep on Sunday and the story about the coin-toss”). The indexical term likely means that P[A] is not equal to 1, though exactly what it is equal to is an interesting question.
I don’t personally have a worked-out theory about indexical probabilities, though my latest WAG is a combination of SIA and SSA, with the caveat I mentioned on infinite cases not working properly under SIA. Basically I’ll try to map it to a relative frequency problem, where all the possibilities are realised a large but finite number of times, and count P[E] as the relative frequency of observations which contain evidence E (including any indexical evidence), taking the limit where the number of observations increases to infinity. I’m not totally satisfied with that approach, but it seems to work as a calculational tool.
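The relative-frequency recipe described above can be sketched with exact counting rather than sampling (again just my illustration of the idea): realise each coin outcome a large finite number of times, count the observations containing the indexical evidence “I am awake,” and take the relative frequency. The names here are hypothetical, and the limit is trivial in this case because the ratio is the same for every finite n.

```python
from fractions import Fraction

def relative_frequency_posterior(n=1_000_000):
    """Realise each possibility n/2 times and count awakening-observations.

    Heads runs each yield 1 observation containing the evidence
    "I have just woken up"; tails runs each yield 2.
    """
    heads_runs = n // 2
    tails_runs = n // 2
    obs_heads = heads_runs * 1
    obs_tails = tails_runs * 2
    # P[tails | awake] as a relative frequency of observations
    return Fraction(obs_tails, obs_heads + obs_tails)
```

This counting rule reproduces the SIA-style answer of 2⁄3 here; whether it behaves sensibly in infinite cases is exactly the caveat mentioned above.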