The frequency with which a coin comes up heads isn’t a probability, no matter how much it looks like one.
I like this post; there’s still a lot of confusion around Bayesian methods.
Two things that would have helped me while I was learning Bayesianism were that:
and
I might write these into a post sometime.
*This is what’s going wrong in the heads of people who say things like “The probability is either 1 or 0, but I don’t know which.”
Pedantry alert: This is not technically true, although it’s still a very important point.
Every frequency is the probability of something; in this case, the frequency with which the coin comes up heads is the probability, given that you pick one of the times that the coin is flipped, that coin comes up heads that time.
But this is not the same thing as the probability that the coin comes up heads the next time that you flip it, which is what you are more likely to be interested in (and which people are liable to uselessly claim is “either 1 or 0, but I don’t know which”).
Even more pedantic: It still isn’t a probability.
There is a probability p(heads | you picked one of those coins) and it can be found by simply taking the frequency. But the frequency still doesn’t mean the probability. In much the same way, 5 / 100 balls in the jar being red isn’t a probability. It is a curious fact about the colors of balls in the jar. p(ball is red | I take a ball from that jar) is a probability.
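A minimal sketch of the jar point, in Python. The 5-in-100 jar is from the comment above; the `prob_red` helper and the "red balls are three times as likely to be picked" weighting are hypothetical, made up only to show that the same jar can go with different probabilities depending on what you assume about the draw.

```python
from fractions import Fraction

# A fact about the jar: 5 of its 100 balls are red.
jar = ["red"] * 5 + ["other"] * 95
frequency_red = Fraction(jar.count("red"), len(jar))

def prob_red(jar, draw_weights):
    """P(ball is red | one ball is drawn with the given relative weights)."""
    total = sum(draw_weights)
    return sum(
        Fraction(weight, total)
        for ball, weight in zip(jar, draw_weights)
        if ball == "red"
    )

# If every ball is equally likely to be drawn, the probability
# is numerically equal to the frequency.
uniform = [1] * len(jar)
print(prob_red(jar, uniform) == frequency_red)  # True

# Hypothetical: red balls sit on top and are 3x as likely to be picked.
# The jar (and so the frequency) is unchanged, but the probability moves.
biased = [3 if ball == "red" else 1 for ball in jar]
print(prob_red(jar, biased))  # 3/22
```

The frequency is computed from the jar alone, while the probability also needs a statement of what is known about how the ball gets picked.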
I think that our pedantries are clashing on the word “is”.
I’m thinking of both frequencies and probabilities as numbers, and using “is” between them if they are equal numbers. You are (I guess) thinking of frequencies and probabilities as things of different types, which are not numbers even though they may be measured by numbers.
Come to think of it, your interpretation is more pedantic than mine, so I concede.
Thinking about it further, there is no probability which is even numerically equal to the frequency. Probabilities are subjective: you know them or can work them out in your head. But you don’t know the frequency, so it can’t be equal to any of the probabilities in your head (except by coincidence).
I think that it’s a mistake to reserve the term ‘probability’ for beliefs held by actual people (or other beings with beliefs). In fact, since actual people are subject to such pervasive epistemic biases (such as we try to overcome here), I doubt that anybody (even readers of Less Wrong) holds actual beliefs that obey the mathematical laws of probability.
I prefer to think of probability as the belief of an ideal rational being with given information / evidence / observations. (This makes me what they call an ‘objective Bayesian’, although really it just pushes the subjectivity back to the level of information.) So even if nobody knows the frequency with which a given coin comes up heads (which is certainly true if the coin is still around and may be flipped in the future), I can imagine a rational being who knows that frequency.
But in a post that was supposed to be pedantic, I was remiss in not specifying exactly what information the probability depends on!
Thanks, this clears some things up for me.
You’re welcome!
I don’t understand how you can hold a position like that and still enjoy the post. How do you parse the phrase “my prior for the probability of heads” in the second example?
In the second example the person was speaking informally, but there is nothing wrong with specifying a probability distribution for an unknown parameter (and that parameter could be a probability for heads).
I hadn’t seen that, but you’re right that that sentence is wrong. “Probability” should have been replaced with “frequency” or something. A prior on a probability would be a set of probabilities of probabilities, and would soon lead to infinite regress.
Only if you keep specifying hyper-priors, which there is no reason to do.
Exactly. There’s no point in the first meta-prior either.
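For what it’s worth, the “distribution over an unknown parameter” reading can be sketched in a few lines of Python. This assumes a Beta prior over the coin’s long-run frequency of heads (a standard conjugate choice, not something anyone in this thread specified); the function names and the 7-heads, 3-tails data are made up for illustration.

```python
from fractions import Fraction

def update(alpha, beta, heads, tails):
    """Conjugate update of a Beta(alpha, beta) prior over the unknown
    long-run frequency of heads, after observing some flips."""
    return alpha + heads, beta + tails

def prob_next_heads(alpha, beta):
    """Probability that the next flip is heads: the mean of the
    Beta(alpha, beta) distribution over the frequency."""
    return Fraction(alpha, alpha + beta)

# Assumed starting point: Beta(1, 1), i.e. uniform over the frequency.
alpha, beta = 1, 1
print(prob_next_heads(alpha, beta))  # 1/2

# Hypothetical data: 7 heads and 3 tails.
alpha, beta = update(alpha, beta, heads=7, tails=3)
print(prob_next_heads(alpha, beta))  # 2/3
```

Here the distribution is over a frequency, and the probability of heads on the next flip is a single number derived from it, so no further layer of priors is needed in this sketch.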