Information theory has useful concepts for this situation (as usual). It uses the term “surprisal” to quantify how surprising an outcome is: the surprisal of an event is the log of the inverse of the probability you had assigned to it before learning that it happened, log(1/p). [1]
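As a quick sketch of the definition (in bits, i.e., using log base 2):

```python
import math

def surprisal(p):
    """Surprisal, in bits, of an event you had assigned probability p."""
    return math.log2(1 / p)

# With a uniform prior over seven days, each day gets p = 1/7,
# so learning the execution date carries log2(7) ≈ 2.807 bits.
print(surprisal(1 / 7))
```

The choice of log base only sets the unit (base 2 gives bits, base e gives nats); the argument below is base-independent.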
What surprisal value should the judge’s statement be interpreted as meaning? A first approach would be to say that the judge means the prisoner will find the result more surprising than if he had simply assigned equal probability to each of the seven days. On this reading, the judge is saying that “the surprisal, or information gain, from learning your execution date will be greater than log(7).”
So, uh, how on earth are you supposed to move your probability distribution over execution days upon being given that kind of evidence? If you (wisely) start from a uniform probability distribution, you already have, in expectation, the maximum surprisal value. (Entropy is equal to the “expected” [i.e., probability-weighted] surprisal, and a uniform distribution is maximum entropy.)
No change in probability distribution can increase the expected surprisal—unless, of course, you deliberately skew your PD so that it decreases the weight on when you “really” expect to be executed. But then that brings up the messy issue of what you really believe vs. what you believe you believe.
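To see the argument numerically: expected surprisal is just entropy, and any non-uniform distribution over the seven days has strictly less of it. A minimal sketch (the skewed distribution here is an invented example, not anything from the paradox):

```python
import math

def entropy(dist):
    """Expected (probability-weighted) surprisal of a distribution, in bits."""
    return sum(p * math.log2(1 / p) for p in dist if p > 0)

uniform = [1 / 7] * 7
# A hypothetical skewed prior that front-loads probability on early days:
skewed = [0.4, 0.3, 0.1, 0.1, 0.05, 0.03, 0.02]

print(entropy(uniform))  # log2(7) ≈ 2.807 bits -- the maximum possible
print(entropy(skewed))   # strictly less than log2(7)
```

Whatever reweighting you try, the expected surprisal only goes down, which is exactly why the judge’s statement gives you nothing to update on.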
[1] Consequently, it is equal to how much information you get upon observing the event—observing improbable events tells you more than observing probable ones. Intuitively, do you learn more when a suspect says they’re guilty, or when they claim innocence?