First, “A and B are independent” is not a reasonable prior, because it assigns probability 0 to them being dependent in some way
This raises a question of the meaningfulness of second-order Bayesian reasoning. Suppose I had a prior for the probability of some event C of, say, 0.469. Could one object to that, on the grounds that I have assigned a probability of zero to the probability of C being some other value? A prior of independence of A and B seems to me of a like nature to an assignment of a probability to C.
On the second point, seeing A and B together twice, or twenty times, tells me nothing about their independence. Almost everyone has two eyes and two legs, and therefore almost everyone has both two eyes and two legs, but it does not follow from those observations alone that possession of two eyes either is, or is not, independent of having two legs. For example, it is well-known (in some possible world) that the rare grey-green greasy Limpopo bore worm invariably attacks either the eyes or the legs, but never both in the same patient, and thus observing someone walking on healthy legs conveys a tiny amount of evidence that they have no eyes; while (in another possible world) the venom of the giant rattlesnake of Sumatra rapidly causes both the eyes and the legs of anyone it bites to fall off, with the opposite effect on the relationship between the two misfortunes. I can predict that someone has both two eyes and two legs from the fact that they are a human being. The extra information about their legs that I gain from examining their eyes could go either way.
But that is just an intuitive ramble. What is needed here is a calculation, akin to the Laplace rule of succession, for observations in a 2x2 contingency table. Starting from an ignorance prior that the probabilities of A&B, A&~B, B&~A, and ~A&~B are each 1⁄4, and observing a, b, c, and d examples of each, what is the appropriate posterior? Then fill in the values 2, 0, 0, and 0.
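The calculation asked for above can be sketched directly. A minimal version, assuming a symmetric Dirichlet(1, 1, 1, 1) prior over the four cells (the natural multi-cell analogue of the uniform prior behind Laplace's rule of succession, and one whose prior cell means are each 1⁄4):

```python
from fractions import Fraction

def dirichlet_posterior_means(counts, prior=1):
    """Posterior mean of each cell probability in a contingency table,
    under a symmetric Dirichlet(prior, ..., prior) prior.
    prior=1 generalizes Laplace's rule of succession:
    each cell gets (count + prior) / (total + prior * #cells)."""
    total = sum(counts) + prior * len(counts)
    return [Fraction(c + prior, total) for c in counts]

# Cells: A&B, A&~B, B&~A, ~A&~B, with observed counts 2, 0, 0, 0.
post = dirichlet_posterior_means([2, 0, 0, 0])
print([str(f) for f in post])  # ['1/2', '1/6', '1/6', '1/6']

# Check independence at the posterior means:
p_ab, p_anb, p_bna, p_nanb = post
p_a = p_ab + p_anb  # 2/3
p_b = p_ab + p_bna  # 2/3
print(str(p_a * p_b), str(p_ab))  # 4/9 vs 1/2: dependent at the posterior mean
```

So under this (assumed) prior, two joint observations do move the posterior means away from independence, though a full answer would compare the posterior probability of the independence hypothesis itself, not just point estimates.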
ETA: On reading the comments, I realise that the above is almost all wrong.
This raises a question of the meaningfulness of second-order Bayesian reasoning. Suppose I had a prior for the probability of some event C of, say, 0.469. Could one object to that, on the grounds that I have assigned a probability of zero to the probability of C being some other value? A prior of independence of A and B seems to me of a like nature to an assignment of a probability to C.
In order to have a probability distribution rather than just a probability, you need to ask a question that isn’t boolean, ie one with more than two possible answers. If you ask “Will this coin come up heads on the next flip?”, you get a probability, because there are only two possible answers. If you ask “How many times will this coin come up heads out of the next hundred flips?”, then you get back a probability for each number from 0 to 100 - that is, a probability distribution. And if you ask “what kind of coin do I have in my pocket?”, then you get a function that takes any possible description (from “copper” to “slightly worn 1980 American quarter”) and returns a probability of matching that description.
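The contrast between the boolean and the non-boolean question can be made concrete. A small sketch, assuming a fair coin and independent flips, of the distribution you get for the hundred-flip question:

```python
from math import comb

def binomial_pmf(n, p):
    """Distribution over the number of heads in n flips of a coin
    with per-flip heads probability p: P(k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Boolean question ("heads on the next flip?"): a single probability.
p_heads_next = 0.5

# Non-boolean question ("how many heads in 100 flips?"):
# a probability for each of the 101 possible answers.
dist = binomial_pmf(100, 0.5)
print(len(dist))                  # 101
print(abs(sum(dist) - 1) < 1e-9)  # True: the probabilities sum to 1
```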
Suppose I had a prior for the probability of some event C of, say, 0.469. Could one object to that, on the grounds that I have assigned a probability of zero to the probability of C being some other value?
Depends on how you’re doing this; if you have a continuous prior for the probability of C, with an expected value of 0.469, then no, and future evidence will continue to modify your probability distribution. If your prior for the probability of C consists of a delta mass at 0.469, then yes, your model should perhaps be criticized, as one might criticize Rosenkrantz for continuing to assume his coin is fair after 30 consecutive heads.
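The difference between the two kinds of prior is easy to exhibit. A sketch, with an illustrative (not canonical) choice of Beta(4.69, 5.31) as a continuous prior whose mean is 0.469:

```python
# Two priors for P(C), both centered at 0.469, updated on the same evidence.
a, b = 4.69, 5.31  # illustrative Beta parameters: mean a/(a+b) = 0.469

# Observe 30 successes in a row (like Rosenkrantz's 30 heads).
# Beta is conjugate to Bernoulli, so the update is just addition:
successes, failures = 30, 0
post_mean = (a + successes) / (a + b + successes + failures)
print(round(post_mean, 3))  # 0.867: the continuous prior moves

# A delta mass at 0.469 assigns probability 0 to every other value,
# so Bayes' rule can never move it, no matter the evidence:
delta_post = 0.469
print(delta_post)  # 0.469, unchanged after 30 straight successes
```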
A Bayesian reasoner actually would have a hierarchy of uncertainty about every aspect of ver model, but the simplicity weighting would give them all low probabilities unless they started correctly predicting some strong pattern.
A prior of independence of A and B seems to me of a like nature to an assignment of a probability to C.
Independence has a specific meaning in probability theory, and it’s a very delicate state of affairs. Many statisticians (and others) get themselves in trouble by assuming independence (because it’s easier to calculate) for variables that are actually correlated.
And depending on your reference class (things with human DNA? animals? macroscopic objects?), having 2 eyes is extremely well correlated with having 2 legs.
On the second point, seeing A and B together twice, or twenty times, tells me nothing about their independence.
Even without any math, it already tells you that they are not mutually exclusive. See wnoise’s reply to the grandparent post for the Laplace rule equivalent.