This can easily be “flattened” into a single, more complex, probability distribution:
25% draw white bean from mixed bag.
25% draw black bean from mixed bag.
50% draw white bean from unmixed bag.
If we wish to consider multiple draws, we can again flatten the total event into a single distribution:
1⁄8 mixed bag, black and black
1⁄8 mixed bag, black and white
1⁄8 mixed bag, white and black
1⁄8 mixed bag, white and white
1⁄2 unmixed bag, white and white
Translating the “what is that number” question into this situation, we can ask: what do we mean when we say that we are 5⁄8 sure that we will draw two white beans? I would say that it is a confidence; the “event” that has 5⁄8 probability is a partial event, a lossy description of the total event.
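For anyone who wants to check the flattening mechanically, here is a minimal sketch. It assumes the setup implied by the lists above: each bag is chosen with probability 1⁄2, the mixed bag gives white or black with probability 1⁄2 each, the unmixed bag always gives white, and successive draws are independent. The names `BAGS`, `flatten`, and `prob_all_white` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed two-stage setup: bag -> (P(bag), {color: P(color | bag)}).
BAGS = {
    "mixed": (Fraction(1, 2), {"white": Fraction(1, 2), "black": Fraction(1, 2)}),
    "unmixed": (Fraction(1, 2), {"white": Fraction(1)}),
}

def flatten(n_draws):
    """Return {(bag, color_1, ..., color_n): probability} for n independent draws."""
    joint = {}
    for bag, (p_bag, colors) in BAGS.items():
        for seq in product(colors, repeat=n_draws):
            p = p_bag
            for color in seq:
                p *= colors[color]
            joint[(bag, *seq)] = p
    return joint

def prob_all_white(n_draws):
    """Sum the flattened outcomes in which every draw is white (a partial event)."""
    return sum(p for key, p in flatten(n_draws).items()
               if all(color == "white" for color in key[1:]))

for outcome, p in flatten(2).items():
    print(p, outcome)
print("P(two white beans) =", prob_all_white(2))
```

The partial event "two white beans" is just the sum of the two total outcomes that contain it: 1⁄8 from the mixed bag plus 1⁄2 from the unmixed bag.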
I’m not convinced that there’s a meaningful difference between prior distributions and prior probabilities.
There isn’t when you have only two competing hypotheses. Add a third hypothesis and you really do have to work with distributions. Chapter 4 of Jaynes explains this wonderfully. It is a long chapter, but fully worth the effort.
But the issue is also nicely captured by your own analysis. As you show, any possible linear combination of the two hypotheses can be characterized by a single parameter, which is itself the probability that the next ball will be white. But when you have three hypotheses, you have two degrees of freedom. A single probability number no longer captures all there is to be said about what you know.
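A small sketch of the degrees-of-freedom point, using three hypothetical hypotheses that are my own illustration rather than anything from the thread: a bag that is all black, one that is half white, and one that is all white, with independent draws. The two priors below agree on the probability that the next draw is white, yet disagree about two whites in a row, so the single number cannot be the whole state of knowledge.

```python
from fractions import Fraction

# Hypothetical hypotheses (assumed for illustration): P(white per draw | hypothesis).
P_WHITE = {
    "all_black": Fraction(0),
    "mixed": Fraction(1, 2),
    "all_white": Fraction(1),
}

def predict(prior, n_white):
    """Probability that the next n_white independent draws are all white."""
    return sum(weight * P_WHITE[h] ** n_white for h, weight in prior.items())

# Two different priors over the same three hypotheses.
prior_a = {"all_black": Fraction(0), "mixed": Fraction(1), "all_white": Fraction(0)}
prior_b = {"all_black": Fraction(1, 2), "mixed": Fraction(0), "all_white": Fraction(1, 2)}

print(predict(prior_a, 1), predict(prior_b, 1))  # same chance for the next draw
print(predict(prior_a, 2), predict(prior_b, 2))  # different chance for two whites
```

With only two hypotheses this cannot happen: fixing the probability of the next white draw fixes the one free weight, and with it every other prediction.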
In retrospect, it’s obvious that “probability” should refer to a real scalar on the interval [0,1].