I don’t have a copy of Li and Vitanyi on hand, so I can’t give you a specific section, but it’s in there somewhere (probably Ch. 3). By “it” here I mean discussion of what happens to Solomonoff induction if we treat the environment as being drawn from a distribution (i.e. having “inherent” randomness).
Neat puzzle! Let’s do the math real quick:
Suppose you have one coin with bias 0.1, and another with bias 0.9. You choose one coin at random and flip it a few times.
Before flipping, flipping 3 H and 2 T seems just as likely as flipping 2 H and 3 T, no matter the order. P(HHHTT) = P(HHTTT) = (0.5×0.9³×0.1²)+(0.5×0.9²×0.1³) = 0.00405
After your first flip, you notice that it’s a H. You now update your probability that you grabbed the heads-biased coin: P(heads bias|H) = (0.5×0.9)/0.5 = 0.9.
Now P(HHTT|H) = (0.9×0.9²×0.1²)+(0.1×0.9²×0.1²) = 0.0081
And P(HTTT|H) = (0.1×0.9³×0.1)+(0.9×0.9×0.1³) = 0.0081.
Huh, that’s weird.
That’s, like, super unintuitive.
But if you look at the terms for P(HHTT|H) and P(HTTT|H), notice that they both simplify to (0.9³×0.1²)+(0.9²×0.1³). You think it’s more likely that you have the heads-biased coin, but because you know the coin must be biased, the balanced continuation “HHTT” is no more likely under that coin than under the other one, while the lopsided continuation “HTTT” is far more likely under the tails-biased coin. This difference in likelihood and your posterior odds between the coins are both set by the same number, the bias of the coin, so they cancel exactly!
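If you’d rather not trust my arithmetic, here’s a quick sketch that redoes the numbers with exact fractions. The function names (likelihood, marginal, posterior, conditional) are just ones I made up for this check, and it assumes the setup above: two coins with P(heads) of 0.9 and 0.1, picked with equal probability.

```python
from fractions import Fraction

# Assumed setup from the comment: two coins with P(heads) = 0.9 and 0.1,
# one of which is picked uniformly at random.
BIASES = [Fraction(9, 10), Fraction(1, 10)]
PRIOR = [Fraction(1, 2), Fraction(1, 2)]


def likelihood(seq, p_heads):
    """Probability of the flip sequence `seq` for a coin with P(heads) = p_heads."""
    prob = Fraction(1)
    for flip in seq:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob


def marginal(seq, weights=PRIOR):
    """Probability of `seq`, averaged over the two coins with the given weights."""
    return sum(w * likelihood(seq, b) for w, b in zip(weights, BIASES))


def posterior(seen):
    """Posterior over the two coins after observing the flips in `seen`."""
    joint = [w * likelihood(seen, b) for w, b in zip(PRIOR, BIASES)]
    total = sum(joint)
    return [j / total for j in joint]


def conditional(rest, seen):
    """P(next flips are `rest` | earlier flips were `seen`)."""
    return marginal(rest, posterior(seen))


print(float(marginal("HHHTT")), float(marginal("HHTTT")))  # 0.00405 0.00405
print([float(p) for p in posterior("H")])                  # [0.9, 0.1]
print(float(conditional("HHTT", "H")),
      float(conditional("HTTT", "H")))                     # 0.0081 0.0081
```

Using Fraction instead of floats means the two conditional probabilities compare as exactly equal rather than equal up to rounding.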