Monty Hall in the Wild

Cross-posted from Putanumonit

I visited a friend yesterday, and after we finished our first bottle of Merlot I brought up the Monty Hall problem, as one does. My friend confessed that she had never heard of the problem. Here’s how it goes:

You are faced with a choice of three doors. Behind one of them is a shiny pile of utilons, while the other two are hiding avocados that are just starting to get rotten and maybe you can convince yourself that they’re still OK to eat but you’ll immediately regret it if you try. The original formulation talks about a car and two goats, which always confused me because goats are better for racing than cars.

Anyway, you point to Door A, at which point the host of the show (yeah, it’s a show, you’re on TV) opens one of the other doors to reveal an almost-rotten avocado. The host knows what hides behind each door, always offers the switch, and never opens the door with the prize because that would make for boring TV.

You now have the option of sticking with Door A or switching to the third door that wasn’t opened. Should you switch?

After my friend figured out the answer she commented that:

  1. The correct answer is obvious.

  2. The problem is pretty boring.

  3. Monty Hall isn’t relevant to anything you would ever come across in real life.

Wrong, wrong, and wrong.

According to Wikipedia, only 13% of people figure out that switching doors improves your chance of winning from 1/3 to 2/3. But even more interesting is the fact that an incredible number of people remain unconvinced even after seeing several proofs, simulations, and demonstrations. The wrong answer, that the two remaining doors have an equal 1/2 chance of hiding the avocado, is so intuitively appealing that even educated people cannot overcome that intuition with mathematical reasoning.
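If proofs don’t convince you, simulation might. Here’s a minimal sketch of the game as described above (the door labels and the fixed initial pick of Door A are illustrative choices, not part of the original problem statement):

```python
import random

def play(switch: bool) -> bool:
    """One round of Monty Hall; returns True if the player wins the prize."""
    doors = ["A", "B", "C"]
    prize = random.choice(doors)
    pick = "A"  # the contestant's initial choice
    # The host opens a door that is neither the pick nor the prize.
    opened = random.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

random.seed(1)
n = 100_000
stay = sum(play(switch=False) for _ in range(n)) / n
swap = sum(play(switch=True) for _ in range(n)) / n
print(f"stay: {stay:.3f}, switch: {swap:.3f}")  # roughly 0.333 vs 0.667
```

Over many rounds the switcher wins about twice as often as the stayer, matching the 1/3 vs 2/3 answer.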

I’ve writ­ten a lot re­cently about de­cou­pling. At the heart of de­cou­pling is the abil­ity to over­ride an in­tu­itive Sys­tem 1 an­swer with a Sys­tem 2 an­swer that’s ar­rived at by ap­ply­ing logic and rules. This abil­ity to over­ride is what’s mea­sured by the Cog­ni­tive Reflec­tion Test, the stan­dard test of ra­tio­nal­ity. Re­mark­ably, many peo­ple fail to im­prove on the CRT even af­ter tak­ing it mul­ti­ple times.

When rationalists talk about “me” and “my brain”, the former refers to their System 2 and the latter to their System 1 and their unconscious. “I wanted to get some work done, but my brain decided it needs to scroll through Twitter for an hour.” But in almost any other group, “my brain” is System 2, and what people identify with is their intuition.

Rationalists often underestimate the sheer inability of many people to take this first step towards rationality, of realizing that the first answer that pops into their heads could be wrong on reflection. I’ve talked to people who were utterly confused by the suggestion that their gut feelings may not correspond perfectly to facts. For someone who puts a lot of weight on System 2, it’s remarkable to see people for whom it may as well not exist.

You know who does really well on the Monty Hall problem? Pigeons. At least, pigeons quickly learn to switch doors when the game is repeated multiple times over 30 days and they can observe that switching doors is twice as likely to yield the prize.

This isn’t some bias in fa­vor of switch­ing ei­ther, be­cause when the con­di­tion is re­versed so the prize is made to be twice as likely to ap­pear be­hind the door that was origi­nally cho­sen, the pi­geons up­date against switch­ing just as quickly:

Re­mark­ably, hu­mans don’t up­date at all on the iter­ated game. When switch­ing is bet­ter, a third of peo­ple re­fuse to switch no mat­ter how long the game is re­peated:

And when switch­ing is worse, a third of hu­mans keep switch­ing:

I first saw this chart when my wife was giving a talk on pigeon cognition (I really married well). I immediately became curious about the exact sort of human who can yield a chart like that. The study these numbers are taken from is titled Are Birds Smarter than Mathematicians?, but the respondents were not mathematicians at all.

Failing to improve one iota in the repeated game requires a particular sort of dysrationalia, where you’re so certain of your mathematical intuition that mounting evidence to the contrary only causes you to double down on your wrongness. Other studies have shown that people who don’t know math at all, like little kids, quickly update towards switching. The reluctance to switch can only come from an extreme Dunning-Kruger effect. These are people whose inability to do math is matched only by their certainty in their own mathematical skill.

The study itself tells you everything you need to know about academic psychology:

  1. The study had a sample size of 13, but I bet that the grant proposal still claimed that the power was 80%.

  2. The title misleadingly talked about mathematicians, even though not a single mathematician was harmed when performing the experiment.

  3. A big chunk of the psychology undergrads could neither figure out the optimal strategy nor learn it from dozens of repeated games.

Monty Hall is a cornerstone puzzle for rationality not just because it has an intuitive answer that’s wrong, but also because it demonstrates a key element of Bayesian thinking: the importance of counterfactuals. Bayes’ law says that you must update not only on what actually happened, but also on what could have happened.

There are no counterfactuals relevant to Door A, because it never could have been opened. In the world where Door A contains the prize, nothing happens to it that doesn’t happen in the world where it smells of avocado. That’s why observing that the host opens Door B doesn’t change the probability of Door A being a winner: it stays at 1/3.

But for Door C, you knew that there was a chance it would be opened, and it wasn’t. And the chance of Door C being opened depends on what it contains: a 0% chance of being opened if it hides the prize, 75% if it hides an avocado (the host must open C whenever the prize is behind B, and opens it half the time when the prize is behind A). Even before calculating the numbers exactly, this tells you that observing which door is opened should make you update the odds on Door C.
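The counterfactual reasoning above is just Bayes’ rule applied to the observation “the host opened Door B”. A sketch, assuming you picked Door A and the host chooses at random when both unpicked doors hide avocados:

```python
from fractions import Fraction

# Prior: the prize is equally likely behind each door.
prior = {door: Fraction(1, 3) for door in "ABC"}

# Likelihood of observing "host opens Door B", given each prize location.
likelihood = {
    "A": Fraction(1, 2),  # host could have opened B or C
    "B": Fraction(0),     # host never reveals the prize
    "C": Fraction(1),     # host had to open B
}

evidence = sum(prior[d] * likelihood[d] for d in "ABC")
posterior = {d: prior[d] * likelihood[d] / evidence for d in "ABC"}
print(posterior)  # A stays at 1/3, B drops to 0, C rises to 2/3
```

Door A’s likelihood is the same in every world where it holds the prize or not relative to its prior, so it stays at 1/3, while all the probability Door B loses flows to Door C.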

Contra my friend, Monty Hall-esque logic shows up in a lot of places if you know how to notice it.

Here’s an in­tu­itive ex­am­ple: you en­ter a yearly perfor­mance re­view with your cur­mud­geonly boss who loves to point out your faults. She com­plains that you always sub­mit your ex­pense re­ports late, which an­noys the HR team. Is this good or bad news?

It’s ex­cel­lent news! Your boss is a lot more likely to com­plain about some minor de­tail if you’re do­ing great on ev­ery­thing else, like ac­tu­ally get­ting the work done with your team. The fact that she showed you a stink­ing av­o­cado be­hind the “ex­pense re­port timeli­ness” door means that the other doors are likely praise­wor­thy.

Another example – there are nine French restaurants in town: Chez Arnaud, Chez Bernard, Chez Claude, etc. You ask your friend who has tried them all which one is the best, and he informs you that it’s Chez Jacob. What does this mean for the other restaurants?

It means that the other eight restaurants are slightly worse than you initially thought. The fact that your friend could’ve picked any of them as the best but didn’t is evidence against them being great.

To put some numbers on it, your initial model could be that the nine restaurants are equally likely to occupy any percentile of quality among French restaurants in the country. If percentiles are confusing, imagine that each restaurant is equally likely to get any quality score between 0 and 100. You’d expect each of the nine to be the 50th percentile restaurant, i.e. to score 50/100, on average. Learning that Chez Jacob is the best means that you should expect it to score 90, since the maximum of N independent variables distributed uniformly over [0,1] has an expected value of N/(N+1). You should update that the eight remaining restaurants score 45 on average.
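A quick Monte Carlo check of those numbers (the trial count is an arbitrary choice; scores are drawn uniformly from 0 to 100 as in the model above):

```python
import random

random.seed(0)
trials = 200_000
best_sum = 0.0
rest_sum = 0.0
for _ in range(trials):
    scores = [100 * random.random() for _ in range(9)]  # nine restaurants
    best = max(scores)
    best_sum += best
    rest_sum += (sum(scores) - best) / 8  # average of the other eight
print(round(best_sum / trials, 1), round(rest_sum / trials, 1))  # ~90.0 ~45.0
```

The best of the nine averages about 90 and the remaining eight about 45, as the closed-form N/(N+1) argument predicts.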

The shortcut to figuring that out is considering some probability or quantity that is conserved after you make your observation. In Monty Hall, the conserved quantity is the 2/3 chance that the prize is behind one of Doors B or C (because the chance of Door A is 1/3 and is independent of the observation that another door is opened). After Door B is opened, the chance of it containing the prize drops by 1/3, from 1/3 to 0. This means that the chance of Door C should increase by 1/3, from 1/3 to 2/3.

Similarly, being told that Chez Jacob is the best upgraded it from 50 to 90, a jump of 40 percentiles or points. But your expectation that the average of all nine restaurants in town is 50 shouldn’t change, assuming that your friend was going to pick out one best restaurant regardless of the overall quality. Since Chez Jacob gained 40 points, the other eight restaurants have to lose 5 points each to keep the average the same. Thus, your posterior expectation for them went down from 50 to 45.
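The conservation argument is just a few lines of arithmetic:

```python
prior_avg = 50                      # every restaurant's prior expected score
best = 90                           # posterior for Chez Jacob: 100 * 9/10
rest = (9 * prior_avg - best) / 8   # what the other eight must average
print(rest)  # 45.0
```

The nine-restaurant total (9 × 50 = 450) is the conserved quantity; whatever the best gains above 50, the rest must give up between them.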

Here’s one more ex­am­ple of coun­ter­fac­tual-based think­ing. It’s not par­allel to Monty Hall at first glance, but the same Bayesian logic un­der­pins both of them.

I’ve spo­ken to two women who write in­ter­est­ing things but are hes­i­tant to post them on­line be­cause peo­ple tell them they suck and call them $&#@s. They asked me how I deal with be­ing called a $&#@ on­line. My an­swer is that I re­al­ized that most of the time when some­one calls me a $&#@ on­line I should up­date not that I may be a $&#@, but only that I’m get­ting more fa­mous.

There are two kinds of people who can possibly respond to something I’ve written by telling me I’m a $&#@:

  1. People who read all my stuff and are usually positive, but in this case thought I was a $&#@.

  2. People who go around calling others $&#@s online, and in this case just happened to click on Putanumonit.

In the first case, the counterfactual to being called a $&#@ is getting a compliment, which should make me think that I $&#@ed up with that particular essay. When I get negative comments from regular readers, I update.

In the second case, the counterfactual to me being called a $&#@ is simply someone else being called a $&#@. My update is that my essay has gotten widely shared, which is great.

For example, here’s a comment that starts off with “Fuck you Aspie scum”. The comment has nothing to do with my actual post, I’m pretty sure that its author just copy-pastes it on rationalist blogs that he happens to come across. Given this, I found the comment to be a positive sign of the expanding reach of Putanumonit. It did nothing to make me worry that I’m an Aspie scum who should be fucked.

I find this sort of Bayesian what’s-the-counterfactual thinking immensely valuable and widely applicable. I think that it’s a teachable skill that anyone can improve on with practice, even if it comes easier to some than to others.

Only pigeons are born master Bayesians.