If we are trying to approximate Solomonoff induction, only the complexity of the overall description of the universe counts directly, and a universe in which thief 3 stole the diamond isn’t any more complex to describe than one in which the diamond stayed put. Instead, we account for the complexity of Bob’s specific hypothesis with ordinary probability, which reflects the fact that more universes are compatible with some theories than with others. E.g. in this particular case there will be some base rate for theft, for a locally prominent thief being involved, etc., and we can use that to penalize Bob’s hypotheses instead. As part of that calculation, the fact that there are 4 thieves applies a factor-of-four penalty (2 bits) to any particular thief.
Regarding Alice’s hypotheses, I think “the diamond spontaneously disappeared” is actually a much larger hypothesis (in terms of bits) than you are giving it credit for. If you don’t gerrymander your descriptions to make it smaller, then the same number of bits should describe any other comparable object disappearing. Your bits also need to specify the time of disappearance up to the observed precision, so the number of bits should be (ignoring additional details such as the precise manner of disappearance) around log2((number of comparable objects in the universe) * (age of the universe) / (observed time window of disappearance)), which I think should be fairly large.
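To make the arithmetic in the two paragraphs above concrete, here is a minimal Python sketch. The function names are hypothetical, and the input numbers are placeholders I made up for illustration (in particular the count of "comparable objects"); only the two formulas come from the comment itself.

```python
import math


def bits_for_choice(n_options: int) -> float:
    """Bits needed to single out one of n equally likely options."""
    return math.log2(n_options)


def disappearance_bits(n_objects: float, universe_age_s: float, window_s: float) -> float:
    """Rough bit cost of 'this object vanished in this time window'.

    Implements log2(objects * age / window), ignoring further details
    such as the precise manner of disappearance.
    """
    return math.log2(n_objects * universe_age_s / window_s)


# ASSUMPTION: illustrative placeholder, not an estimate from the source.
N_COMPARABLE_OBJECTS = 1e30
# Roughly 13.8 billion years expressed in seconds.
UNIVERSE_AGE_S = 4.35e17
# ASSUMPTION: the disappearance is pinned down to within one hour.
WINDOW_S = 3600.0

thief_bits = bits_for_choice(4)
vanish_bits = disappearance_bits(N_COMPARABLE_OBJECTS, UNIVERSE_AGE_S, WINDOW_S)

print(f"penalty for naming one of 4 thieves: {thief_bits:.1f} bits")
print(f"spontaneous disappearance, this object, this hour: {vanish_bits:.1f} bits")
```

With these placeholder inputs the disappearance hypothesis costs well over a hundred bits, against 2 bits for picking out a particular thief. The exact figure depends entirely on the assumed object count and time window, so treat it as an order-of-magnitude illustration only.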
Now, this may not be a particularly satisfying answer, since I am only addressing your particular example and not the general question of “how do low-level hypotheses constrain high-level ones?” AFAIK, assessing how compatible any given high-level hypothesis is with simple low-level physics might in general be a complex issue.
Then doesn’t the theft hypothesis also need to account for the specific timeframe in which the diamond was stolen?
Yes, it would.
(In writing the original comment, I actually wrote the second paragraph first and then reordered them, which may have affected the consistency. I do think, however, that it would be easy to forget to take this into account when calculating the bits for Alice’s hypothesis, while automatically taking it into account in Bob’s calculation via the base rate, which includes the number of thefts per unit time.)