What Evidence Filtered Evidence?

Yesterday I discussed the dilemma of the clever arguer, hired to sell you a box that may or may not contain a diamond. The clever arguer points out to you that the box has a blue stamp, and it is a valid known fact that diamond-containing boxes are more likely than empty boxes to bear a blue stamp. What happens at this point, from a Bayesian perspective? Must you helplessly update your probabilities, as the clever arguer wishes?

If you can look at the box yourself, you can add up all the signs yourself. What if you can’t look? What if the only evidence you have is the word of the clever arguer, who is legally constrained to make only true statements, but does not tell you everything he knows? Each statement that he makes is valid evidence—how could you not update your probabilities? Has it ceased to be true that, in such-and-such a proportion of Everett branches or Tegmark duplicates in which box B has a blue stamp, box B contains a diamond? According to Jaynes, a Bayesian must always condition on all known evidence, on pain of paradox. But then the clever arguer can make you believe anything he chooses, if there is a sufficient variety of signs to selectively report. That doesn’t sound right.

Consider a simpler case, a biased coin, which may be biased to 2/3 heads 1/3 tails, or 1/3 heads 2/3 tails, both cases being equally likely a priori. Each H observed is 1 bit of evidence for an H-biased coin; each T observed is 1 bit of evidence for a T-biased coin. I flip the coin ten times, and then I tell you, “The 4th flip, 6th flip, and 9th flip came up heads.” What is your posterior probability that the coin is H-biased?

And the answer is that it could be almost anything, depending on what chain of cause and effect lay behind my utterance of those words—my selection of which flips to report.

  • I might be following the algorithm of reporting the result of the 4th, 6th, and 9th flips, regardless of the result of that and all other flips. If you know that I used this algorithm, the posterior odds are 8:1 in favor of an H-biased coin.

  • I could be reporting on all flips, and only flips, that came up heads. In this case, you know that all 7 other flips came up tails, and the posterior odds are 1:16 against the coin being H-biased.

  • I could have decided in advance to say the result of the 4th, 6th, and 9th flips only if the probability of the coin being H-biased exceeds 98%. And so on. (The first two cases are worked out in the sketch after this list.)
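For concreteness, here is a minimal Python sketch, entirely my own illustration and not from the original discussion, that recovers the 8:1 and 1:16 answers by enumerating all 2^10 flip sequences and conditioning on the event that the reporting algorithm produced the report you heard, rather than on the bare reported flips:

```python
# A minimal sketch (my own illustration): enumerate all 2^10 flip
# sequences under each bias hypothesis, and condition on the
# *reporting algorithm* having produced the report we heard.
from itertools import product

def posterior_odds_H(report_was_made):
    """Posterior odds (H-biased : T-biased), starting from 1:1 priors,
    given that the reporting algorithm produced the observed report."""
    likelihood = {}
    for label, p_heads in (("H", 2 / 3), ("T", 1 / 3)):
        total = 0.0
        for flips in product((True, False), repeat=10):  # True = heads
            p_sequence = 1.0
            for flip in flips:
                p_sequence *= p_heads if flip else (1 - p_heads)
            if report_was_made(flips):
                total += p_sequence
        likelihood[label] = total
    return likelihood["H"] / likelihood["T"]

# Algorithm 1: always report the 4th, 6th, and 9th flips. The report we
# heard says all three were heads (zero-based indices 3, 5, 8).
fixed_flips = lambda flips: flips[3] and flips[5] and flips[8]

# Algorithm 2: report all and only the flips that came up heads. Hearing
# exactly "4th, 6th, 9th" means those were heads and the other 7 were tails.
only_heads = lambda flips: all(f == (i in (3, 5, 8)) for i, f in enumerate(flips))

print(posterior_odds_H(fixed_flips))  # ~8.0    -> 8:1 in favor of H-biased
print(posterior_odds_H(only_heads))   # ~0.0625 -> 1:16 against H-biased
```

The reported flips are identical in both cases; only the algorithm behind the report differs, and with it the posterior.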

Or consider the Monty Hall problem:

On a game show, you are given the choice of three doors leading to three rooms. You know that in one room is $100,000, and the other two are empty. The host asks you to pick a door, and you pick door #1. Then the host opens door #2, revealing an empty room. Do you want to switch to door #3, or stick with door #1?

The answer depends on the host’s algorithm. If the host always opens a door and always picks a door leading to an empty room, then you should switch to door #3. If the host always opens door #2 regardless of what is behind it, #1 and #3 both have 50% probabilities of containing the money. If the host only opens a door, at all, if you initially pick the door with the money, then you should definitely stick with #1.
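A rough Monte Carlo sketch, with host-algorithm names of my own invention matching the three cases above, makes the dependence concrete:

```python
# A rough simulation (my own illustration) of the three host algorithms.
# We track P(money behind your door #1) among the games where the observed
# evidence actually occurs: the host opens door #2 and it is empty.
import random

def trial(host_algorithm):
    """One game. Returns (evidence_occurred, money_behind_door_1)."""
    money = random.randint(1, 3)   # door hiding the $100,000
    pick = 1                       # you always pick door #1
    if host_algorithm == "avoid_money":
        # Host always opens an empty door you didn't pick.
        opened = random.choice([d for d in (2, 3) if d != money])
    elif host_algorithm == "always_2":
        opened = 2                 # host opens #2 no matter what
    elif host_algorithm == "only_if_picked_money":
        if money != pick:
            return (False, False)  # host opens nothing; evidence never occurs
        opened = random.choice([2, 3])
    evidence = (opened == 2 and money != 2)  # #2 opened, revealed empty
    return (evidence, money == pick)

for alg in ("avoid_money", "always_2", "only_if_picked_money"):
    results = [trial(alg) for _ in range(200_000)]
    stick_wins = [stick for ev, stick in results if ev]
    print(alg, sum(stick_wins) / len(stick_wins))  # ~1/3, ~1/2, ~1.0
```

All three hosts show you the same empty room behind door #2; only the algorithm that chose to open it differs, and the probability that the money is behind your door swings from 1/3 to 1/2 to 1.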

You shouldn’t condition just on #2 being empty, but on this fact plus the fact of the host choosing to open door #2. Many people are confused by the standard Monty Hall problem because they update only on #2 being empty, in which case #1 and #3 have equal probabilities of containing the money. This is why Bayesians are commanded to condition on all of their knowledge, on pain of paradox.

When someone says, “The 4th coinflip came up heads”, we are not conditioning on the 4th coinflip having come up heads—we are not taking the subset of all possible worlds where the 4th coinflip came up heads—rather we are conditioning on the subset of all possible worlds where a speaker following some particular algorithm said “The 4th coinflip came up heads.” The spoken sentence is not the fact itself; don’t be led astray by the mere meanings of words.

Most legal processes work on the theory that every case has exactly two opposed sides and that it is easier to find two biased humans than one unbiased one. Between the prosecution and the defense, someone has a motive to present any given piece of evidence, so the court will see all the evidence; that is the theory. If there are two clever arguers in the box dilemma, it is not quite as good as one curious inquirer, but it is almost as good. But that is with two boxes. Reality often has many-sided problems, and deep problems, and nonobvious answers, which are not readily found by Blues and Greens screaming at each other.

Beware lest you abuse the notion of evidence-filtering as a Fully General Counterargument to exclude all evidence you don’t like: “That argument was filtered, therefore I can ignore it.” If you’re ticked off by a contrary argument, then you are familiar with the case, and care enough to take sides. You probably already know your own side’s strongest arguments. You have no reason to infer, from a contrary argument, the existence of new favorable signs and portents which you have not yet seen. So you are left with the uncomfortable facts themselves; a blue stamp on box B is still evidence.

But if you are hearing an argument for the first time, and you are only hearing one side of the argument, then indeed you should beware! In a way, no one can really trust the theory of natural selection until after they have listened to creationists for five minutes; and then they know it’s solid.