I’m obviously new to this whole thing, but is this a largely undebated, widely accepted view on probabilities? That there are NO situations in which you can’t meaningfully state a probability?
Actually, yes, but you’re right to be surprised because it’s (to my mind at least) an incredible result. Cox’s theorem establishes this as a mathematical result from the assumption that you want to reason quantitatively and consistently. Jaynes gives a great explanation of this in chapters 1 and 2 of his book “Probability Theory: The Logic of Science”.
But how valid is this result when we knew nothing of the original distribution?
The short answer is that a probability always reflects your current state of knowledge. If I told you absolutely nothing about the coin or the distribution, then you would be entirely justified in assigning 50% probability to heads (on the basis of symmetry). If I told you the exact distribution over p then you would be justified in assigning a different probability to heads. But in both cases I carried out the same experiment—it’s just that you had different information in the two trials. You are justified in assigning different probabilities because Probability is in the mind. The knowledge you have about the distribution over p is just one more piece of information to roll into your probability.
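To make the “different information, different probability” point concrete, here is a minimal sketch (the Beta(8, 2) prior is purely an illustrative assumption, not anything from the experiment above): your probability of heads is just the expected value of p under whatever distribution over p your current knowledge gives you.

```python
import random

# With no information about p, symmetry gives P(heads) = 0.5.
# With a known distribution over p, P(heads) = E[p]: average over that distribution.

def p_heads(prior_samples):
    """Probability of heads given samples from your distribution over p."""
    return sum(prior_samples) / len(prior_samples)

random.seed(0)

# One state of knowledge: p uniform on [0, 1] -> E[p] = 0.5, matching the symmetry answer.
uniform_prior = [random.random() for _ in range(100_000)]

# A different state of knowledge: p ~ Beta(8, 2), a coin believed biased toward heads.
beta_prior = [random.betavariate(8, 2) for _ in range(100_000)]

print(round(p_heads(uniform_prior), 2))  # ~0.5
print(round(p_heads(beta_prior), 2))     # ~0.8, since E[Beta(8, 2)] = 8/10
```

Same coin-flipping experiment either way; only the information you roll into the calculation differs, and the probability differs accordingly.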
With this in mind, do we still believe that it’s not wrong (or less wrong? :D) to assume a normal distribution, make our calculations, and decide how much we’d bet that the mean of the next 100,000 samples falls in the range −100..100?
That depends on the probability that the coin flipper chooses a Cauchy distribution. If this were a real experiment then you’d have to take into account unwieldy facts about human psychology, physics of coin flips, and so on. Cox’s theorem tells us that in this case there is a unique answer in the form of a probability, but it doesn’t guarantee that we have time, resources, or inclination to actually calculate it. If you want to avoid all those kinds of complicated facts then you can start from some reasonable mathematical assumptions such as a normal distribution over p—but if your assumptions are wrong then don’t be surprised when your conclusions turn out wrong.
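To see why the Cauchy possibility is so dangerous for that bet, here’s a quick simulation sketch (standard Cauchy, drawn via the inverse-CDF trick tan(π(U − ½)); the sample sizes are arbitrary choices): the mean of n Cauchy samples is itself standard Cauchy, so it never concentrates around anything, no matter how large n gets.

```python
import math
import random

random.seed(1)

def cauchy_sample():
    # Standard Cauchy via inverse CDF: tan(pi * (U - 1/2)) for U ~ Uniform(0, 1)
    return math.tan(math.pi * (random.random() - 0.5))

def sample_mean(n):
    return sum(cauchy_sample() for _ in range(n)) / n

# The mean of n Cauchy samples is itself standard Cauchy -- averaging doesn't help.
means = [sample_mean(1000) for _ in range(200)]
print(min(means), max(means))  # occasional wild values, unlike a normal's sample mean
```

If your assumptions baked in a normal distribution, every one of those wild sample means would look impossible.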
“it doesn’t guarantee that we have time, resources, or inclination to actually calculate it”
Here’s how I understand this point, which finally made things clearer:
Yes, there exists a more accurate answer, and we might even be able to discover it by investing some time. But until we do, the fact that such an answer exists is completely irrelevant. It is orthogonal to the problem.
In other words, doing the calculations would give us more information to base our prediction on, but knowing that we can do the calculation doesn’t change it in the slightest.
Thus, we are justified in treating this as “don’t know at all”, even though it seems that we do know something.
“Probability is in the mind”
Great read, and I think things have finally fit into the right places in my head. Now I just need to learn to guesstimate what the maximum entropy distribution might look like for a given set of facts :)
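One way to start building that guesstimating intuition: for a fixed variance, the normal is the maximum entropy distribution. A quick back-of-the-envelope check (using the standard differential-entropy formulas; σ = 1 is an arbitrary choice) shows a uniform of the same variance coming in lower:

```python
import math

# Among all distributions with variance sigma^2, the normal has the largest
# differential entropy; a uniform with the same variance scores lower.
sigma = 1.0

# Normal entropy: (1/2) ln(2 * pi * e * sigma^2)
h_normal = 0.5 * math.log(2 * math.pi * math.e * sigma**2)

# Uniform on width w has variance w^2 / 12 and entropy ln(w).
w = sigma * math.sqrt(12)
h_uniform = math.log(w)

print(round(h_normal, 3), round(h_uniform, 3))  # 1.419 1.242 -- the normal wins
```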
Well, that and how to actually churn out confidence intervals and expected values for experiments like this one, so that I know how much to bet given a particular set of knowledge.
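For the betting part, here’s a hedged sketch of one standard approach (the 60 heads / 40 tails data and the even-money payoff are made-up illustrations): put a grid posterior over the coin’s bias p under a uniform prior, read off a 95% credible interval, and compute the expected value of a bet from the posterior mean.

```python
# Sketch: posterior over a coin's bias p on a grid, a 95% credible interval,
# and the expected value of a (hypothetical) bet paying +1 on heads, -1 on tails.
heads, tails = 60, 40                       # assumed observations
grid = [i / 1000 for i in range(1, 1000)]

# Uniform prior; likelihood p^heads * (1 - p)^tails (unnormalised).
post = [p**heads * (1 - p)**tails for p in grid]
total = sum(post)
post = [w / total for w in post]

# 95% central credible interval from the posterior CDF.
cdf, lo, hi = 0.0, None, None
for p, w in zip(grid, post):
    cdf += w
    if lo is None and cdf >= 0.025:
        lo = p
    if hi is None and cdf >= 0.975:
        hi = p

prob_heads = sum(p * w for p, w in zip(grid, post))   # posterior mean of p
ev = prob_heads * 1 + (1 - prob_heads) * (-1)         # expected value of the bet

print(lo, hi)        # roughly (0.50, 0.69)
print(round(ev, 2))  # positive -> the bet is favourable under this posterior
```

Different knowledge (a different prior, or different observed flips) changes the posterior and therefore the stake you should be willing to put up, which is exactly the “probability is in the mind” point above in betting form.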
Thanks for this, it really helped.
Cool, glad it was helpful :)
Here is one interesting post about how to encourage our brains to output specific probabilities: http://lesswrong.com/lw/3m6/techniques_for_probability_estimates/
Actually I didn’t explain that middle bit well at all. Just see http://lesswrong.com/lw/oi/mind_projection_fallacy/