Hmmmm. Unfortunately I’m not sure what to say to this one except that in logical induction, there’s not generally a pre-existing z we can update on like that.
So that’s my real crux, and any examples with telephone calls and earthquakes etc are merely illustrative for me. (Like I said, I don’t know how to actually motivate any of this stuff except with actual logical uncertainty, and I’m surprised that any philosophers would have become convinced just from other sorts of examples.)
I agree that the logical induction case is different, since it’s hard to conceive of likelihoods to begin with. Basically, logical induction doesn’t even include what I would call virtual evidence. But many of the examples you gave do have such a z. I think I agree with your crux, and my main critique here is just of the examples of an overly dogmatic Bayesian who refuses to acknowledge the difference between a and z. I won’t belabor the point further.
I’ve thought of another motivating example, BTW. In wartime, your enemy deliberately sends you some verifiably true information about their force dispositions. How should you update on that? You can’t use a Bayesian update, since you don’t actually have a likelihood model available. You can’t even attempt to learn a model from the information, since you can’t be sure it’s representative.
I don’t get this at all! What do you mean?
By model M, I mean an algorithm that generates likelihood functions, so M(H,Z) = P(Z|H).
So any time we talk about a likelihood P(Z|H), it should really read P(Z|H,M). We’ll posit that P(H,M) = P(H)P(M) (i.e. that the model says nothing about our priors), but this isn’t strictly necessary.
E(P(Z|H,M)) will be higher for a well-calibrated model than for a poorly calibrated one, which means that we expect P(H,M|Z) to also be higher. When we then marginalize over the models to get a final posterior on the hypothesis, P(H|Z), it will be dominated by the well-calibrated models: P(H|Z) = SUM_i P(H|M_i,Z) P(M_i|Z).
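To make the marginalization concrete, here is a minimal sketch of the averaging step, assuming a finite set of hypotheses and a finite set of models. The function name `model_averaged_posterior`, the hypothesis labels, and all the numbers are made up for illustration; the only things taken from the text are M(H,Z) = P(Z|H,M), the independence assumption P(H,M) = P(H)P(M), and the final sum over models.

```python
from typing import Callable, Dict, List, Tuple

# A "model" M maps (hypothesis, observation) to a likelihood P(Z | H, M).
Model = Callable[[str, str], float]


def model_averaged_posterior(
    prior_h: Dict[str, float],
    prior_m: List[float],
    models: List[Model],
    z: str,
) -> Tuple[Dict[str, float], List[float]]:
    """Return (P(H | Z), P(M_i | Z)), assuming P(H, M) = P(H) P(M)."""
    n = len(models)
    # Unnormalized joint: P(Z | H, M_i) * P(H) * P(M_i)
    joint = {
        (h, i): models[i](h, z) * p_h * prior_m[i]
        for h, p_h in prior_h.items()
        for i in range(n)
    }
    evidence = sum(joint.values())  # P(Z)
    post_m = [sum(joint[(h, i)] for h in prior_h) / evidence for i in range(n)]
    post_h = {h: sum(joint[(h, i)] for i in range(n)) / evidence for h in prior_h}
    return post_h, post_m


# Toy numbers (assumptions): likelihood each model assigns to the one observed Z.
table_a = {"quake": 0.9, "no_quake": 0.7}  # model A expected this Z
table_b = {"quake": 0.2, "no_quake": 0.1}  # model B did not
models = [lambda h, z: table_a[h], lambda h, z: table_b[h]]
prior_h = {"quake": 0.5, "no_quake": 0.5}
prior_m = [0.5, 0.5]

post_h, post_m = model_averaged_posterior(prior_h, prior_m, models, "call")
print(post_h, post_m)
```

In this toy case model A ends up with most of the posterior weight simply because it assigned more probability to what was actually observed, so P(H|Z) is pulled toward model A's answer. That is the sense of "dominated" above; it says nothing about calibration beyond this one observation.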
BTW, I had a chance to read part of the ILA paper. It barely broke my brain at all! I wonder if the trick of enumerating traders and incorporating them over time could be repurposed to a more Bayesian-ish context, by instead enumerating models M. Like the trading firm in ILA, a meta-Bayesian algorithm could keep introducing new models M_k over time, with some intuition that the calibration of the best model in the set would improve over time, perhaps giving it all those nice anti-Dutch-book properties. Basically this is a computable Solomonoff induction that slowly approaches completeness in the limit. (I’m pretty sure this is not an original idea. I wouldn’t be surprised if something like this contributed to the ILA itself.)
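A rough sketch of what "keep introducing new models M_k over time" might look like in the simplest possible setting. Everything here is an assumption for illustration: the models are just fixed coin biases, model k enters at step k with prior weight 2^-(k+1), and a newly added model is scored on the whole history so far, which makes this a lazily evaluated Bayesian mixture. It is not the algorithm from the paper, and it makes no claim about the anti-Dutch-book properties.

```python
import math
from typing import Dict, List


def log_lik(bias: float, obs: int) -> float:
    """Log-likelihood of one binary observation under a coin with P(1) = bias."""
    return math.log(bias if obs == 1 else 1.0 - bias)


def growing_mixture(biases: List[float], observations: List[int]) -> Dict[int, float]:
    """Introduce model k at step k, then return posterior weights over the models."""
    log_w: Dict[int, float] = {}
    history: List[int] = []
    for t, obs in enumerate(observations):
        if t < len(biases):
            # Hypothetical entry rule: prior weight 2^-(t+1), and the newcomer
            # is scored on all past observations so late entry is not an advantage.
            log_w[t] = -(t + 1) * math.log(2.0) + sum(
                log_lik(biases[t], o) for o in history
            )
        for k in log_w:
            log_w[k] += log_lik(biases[k], obs)
        history.append(obs)
    # Normalize in log space for numerical stability.
    top = max(log_w.values())
    unnorm = {k: math.exp(v - top) for k, v in log_w.items()}
    total = sum(unnorm.values())
    return {k: w / total for k, w in unnorm.items()}


# Toy data (assumption): 80% ones, and a model enumeration where the best
# candidate (bias 0.8) only shows up third.
observations = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1] * 3
biases = [0.5, 0.2, 0.8, 0.6]
weights = growing_mixture(biases, observations)
best = max(weights, key=weights.get)
print(best, weights)
```

The point of the sketch is only that the set of models can grow while the mixture stays well defined, and that a better model introduced late can still come to dominate once it has been scored on enough data.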
Of course, it’s pretty unclear how this would work in the logical induction case. This might all be better explained in its own post.