Monty Hall in the Wild

Cross-posted from

I visited a friend yesterday, and after we finished our first bottle of Merlot I brought up the Monty Hall problem, as one does. My friend confessed that she had never heard of the problem. Here’s how it goes:

You are faced with a choice of three doors. Behind one of them is a shiny pile of utilons, while the other two are hiding avocados that are just starting to get rotten and maybe you can convince yourself that they’re still OK to eat but you’ll immediately regret it if you try. The original formulation talks about a car and two goats, which always confused me because goats are better for racing than cars.

Anyway, you point to Door A, at which point the host of the show (yeah, it’s a show, you’re on TV) opens one of the other doors to reveal an almost-rotten avocado. The host knows what hides behind each door, always offers the switch, and never opens the one with the prize because that would make for boring TV.

You now have the option of sticking with Door A or switching to the third door that wasn’t opened. Should you switch?

After my friend figured out the answer she commented that:

  1. The correct answer is obvious.

  2. The problem is pretty boring.

  3. Monty Hall isn’t relevant to anything you would ever come across in real life.

Wrong, wrong, and wrong.

According to Wikipedia, only 13% of people figure out that switching doors improves your chance of winning from 1/3 to 2/3. But even more interesting is the fact that an incredible number of people remain unconvinced even after seeing several proofs, simulations, and demonstrations. The wrong answer, that the doors have an equal 1/2 chance of avocado, is so intuitively appealing that even educated people cannot overcome that intuition with mathematical reason.
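If you are one of the unconvinced, here is the kind of simulation I mean. It's a minimal Python sketch that follows the rules as stated above: the host always opens a non-prize door other than your pick and always offers the switch. The function names `play` and `win_rate` are my own, and the seed and number of rounds are arbitrary choices.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant ends up with the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the prize is and opens a door that is neither
    # the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither the original pick nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch: bool, rounds: int = 100_000, seed: int = 0) -> float:
    """Fraction of rounds won under a fixed strategy (always stick or always switch)."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


if __name__ == "__main__":
    print(f"stick:  {win_rate(switch=False):.3f}")
    print(f"switch: {win_rate(switch=True):.3f}")
```

Sticking should come out close to 1/3 and switching close to 2/3, up to sampling noise.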

I’ve written a lot recently about decoupling. At the heart of decoupling is the ability to override an intuitive System 1 answer with a System 2 answer that’s arrived at by applying logic and rules. This ability to override is what’s measured by the Cognitive Reflection Test, the standard test of rationality. Remarkably, many people fail to improve on the CRT even after taking it multiple times.

When rationalists talk about “me” and “my brain”, the former refers to their System 2 and the latter to their System 1 and their unconscious. “I wanted to get some work done, but my brain decided it needs to scroll through Twitter for an hour.” But in almost any other group, “my brain” is System 2, and what people identify with is their intuition.

Rationalists often underestimate the sheer inability of many people to take this first step towards rationality, of realizing that the first answer that pops into their heads could be wrong on reflection. I’ve talked to people who were utterly confused by the suggestion that their gut feelings may not correspond perfectly to facts. For someone who puts a lot of weight on System 2, it’s remarkable to see people for whom it may as well not exist.

You know who does really well on the Monty Hall problem? Pigeons. At least, pigeons quickly learn to switch doors when the game is repeated multiple times over 30 days and they can observe that switching doors is twice as likely to yield the prize.

This isn’t some bias in favor of switching either, because when the condition is reversed so the prize is made to be twice as likely to appear behind the door that was originally chosen, the pigeons update against switching just as quickly:

Remarkably, humans don’t update at all on the iterated game. When switching is better, a third of people refuse to switch no matter how long the game is repeated:

And when switching is worse, a third of humans keep switching:

I first saw this chart when my wife was giving a talk on pigeon cognition (I really married well). I immediately became curious about the exact sort of human who can yield a chart like that. The study these are taken from is titled Are Birds Smarter than Mathematicians?, but the respondents were not mathematicians at all.

Failing to improve one iota in the repeated game requires a particular sort of dysrationalia, where you’re so certain of your mathematical intuition that mounting evidence to the contrary only causes you to double down on your wrongness. Other studies have shown that people who don’t know math at all, like little kids, quickly update towards switching. The reluctance to switch can only come from an extreme Dunning-Kruger effect. These are people whose inability to do math is matched only by their certainty in their own mathematical skill.

This little screenshot tells you everything you need to know about academic psychology:

  1. The study had a sample size of 13, but I bet that the grant proposal still claimed that the power was 80%.

  2. The title misleadingly talked about mathematicians, even though not a single mathematician was harmed when performing the experiment.

  3. A big chunk of the psychology undergrads could neither figure out the optimal strategy nor learn it from dozens of repeated games.

Monty Hall is a cornerstone puzzle for rationality not just because it has an intuitive answer that’s wrong, but also because it demonstrates a key principle of Bayesian thinking: the importance of counterfactuals. Bayes’ law says that you must update not only on what actually happened, but also on what could have happened.

There are no counterfactuals relevant to Door A, because it never could have been opened. In the world where Door A contains the prize, nothing happens to it that doesn’t happen in the world where it smells of avocado. That’s why observing that the host opens Door B doesn’t change the probability of Door A being a winner: it stays at 1/3.

But for Door C, you knew that there was a chance it would be opened, but it wasn’t. And the chance of Door C being opened depends on what it contains: 0% if it hides the prize, 75% if it hides an avocado. Even before calculating the numbers exactly, this tells you that observing which door is opened should make you update the odds of Door C.
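For those who do want the numbers calculated exactly, here is a small sketch using exact fractions. It assumes you picked Door A, and that when the host has a free choice between B and C (the prize is behind A) he opens each with probability 1/2. That 50/50 split is my assumption, but it is the one that produces the 75% figure. The names `PRIOR`, `HOST_OPENS`, `posterior`, and `p_c_opened_given_c_avocado` are made up for illustration.

```python
from fractions import Fraction

# The contestant has picked Door A. Each door starts with a 1/3 chance.
PRIOR = {door: Fraction(1, 3) for door in "ABC"}

# HOST_OPENS[prize][door] = chance the host opens `door` given the prize location.
HOST_OPENS = {
    "A": {"B": Fraction(1, 2), "C": Fraction(1, 2)},  # free choice, assumed 50/50
    "B": {"B": Fraction(0), "C": Fraction(1)},        # host must avoid the prize
    "C": {"B": Fraction(1), "C": Fraction(0)},        # host must avoid the prize
}


def posterior(opened: str) -> dict:
    """P(prize behind each door | the host opened `opened`), by Bayes' rule."""
    joint = {door: PRIOR[door] * HOST_OPENS[door][opened] for door in PRIOR}
    total = sum(joint.values())
    return {door: p / total for door, p in joint.items()}


def p_c_opened_given_c_avocado() -> Fraction:
    """Chance that Door C gets opened, given that it hides an avocado."""
    worlds = {door: PRIOR[door] for door in "AB"}  # the prize is behind A or B
    total = sum(worlds.values())
    return sum(p * HOST_OPENS[door]["C"] for door, p in worlds.items()) / total


if __name__ == "__main__":
    print("P(C opened | C is avocado) =", p_c_opened_given_c_avocado())
    for door, p in posterior("B").items():
        print(f"P(prize behind {door} | host opened B) = {p}")
```

The first line reproduces the 3/4 above. After the host opens Door B, Door A stays at 1/3 and Door C moves to 2/3.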

Contra my friend, Monty Hall-esque logic shows up in a lot of places if you know how to notice it.

Here’s an intuitive example: you enter a yearly performance review with your curmudgeonly boss who loves to point out your faults. She complains that you always submit your expense reports late, which annoys the HR team. Is this good or bad news?

It’s excellent news! Your boss is a lot more likely to complain about some minor detail if you’re doing great on everything else, like actually getting the work done with your team. The fact that she showed you a stinking avocado behind the “expense report timeliness” door means that the other doors are likely praiseworthy.

Another example – there are 9 French restaurants in town: Chez Arnaud, Chez Bernard, Chez Claude etc. You ask your friend who has tried them all which one is the best, and he informs you that it’s Chez Jacob. What does this mean for the other restaurants?

It means that the other eight restaurants are slightly worse than you initially thought. The fact that your friend could’ve picked any of them as the best but didn’t is evidence against them being great.

To put some numbers on it, your initial model could be that the nine restaurants are equally likely to occupy any percentile of quality among French restaurants in the country. If percentiles are confusing, imagine that any restaurant is equally likely to get any score of quality between 0 and 100. You’d expect any of the nine to be the 50th percentile restaurant, or to score 50/100, on average. Learning that Chez Jacob is the best means that you should expect it to score 90, since the maximum of N independent variables distributed uniformly over [0,1] has an expected value of N/(N+1), which is 9/10 for N = 9. You should update that the average of the eight remaining restaurants scores 45.

The shortcut to figuring that out is considering some probability or quantity that is conserved after you made your observation. In Monty Hall, the quantity that is conserved is the 2/3 chance that the prize is behind one of the doors B or C (because the chance of Door A is 1/3 and is independent of the observation that another door is opened). After Door B is opened, the chance of it containing the prize goes down by 1/3, from 1/3 to 0. This means that the chance of Door C should increase by 1/3, from 1/3 to 2/3.

Similarly, being told that Chez Jacob is the best upgraded it from 50 to 90, a jump of 40 percentiles or points. But your expectation that the average of all nine restaurants in town is 50 shouldn’t change, assuming that your friend was going to pick out one best restaurant regardless of the overall quality. Since Chez Jacob gained 40 points, the other eight restaurants have to lose 5 points each to keep the average the same. Thus, your posterior expectation for them went down from 50 to 45.
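If you don’t trust the conservation shortcut, you can check the restaurant numbers by brute force. This is a rough Monte Carlo sketch under the same model as above: nine independent scores, each uniform between 0 and 100, with the friend always naming the top scorer. The function name `simulate` and its parameters are illustrative, not anything from a library.

```python
import random


def simulate(n_restaurants: int = 9, trials: int = 100_000, seed: int = 0):
    """Return (mean score of the best, mean score of the rest, overall mean)."""
    rng = random.Random(seed)
    best_sum = 0.0
    rest_sum = 0.0
    for _ in range(trials):
        # Each restaurant gets an independent score, uniform on [0, 100].
        scores = [rng.uniform(0, 100) for _ in range(n_restaurants)]
        best = max(scores)
        best_sum += best
        # Average score of the restaurants the friend did not name.
        rest_sum += (sum(scores) - best) / (n_restaurants - 1)
    best_mean = best_sum / trials
    rest_mean = rest_sum / trials
    # The overall average is the conserved quantity: it should stay near 50.
    overall = (best_mean + (n_restaurants - 1) * rest_mean) / n_restaurants
    return best_mean, rest_mean, overall


if __name__ == "__main__":
    best, rest, overall = simulate()
    print(f"best:    {best:.1f}")
    print(f"rest:    {rest:.1f}")
    print(f"overall: {overall:.1f}")
```

Up to sampling noise, the best restaurant should average about 90, the other eight about 45, and all nine together about 50.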

Here’s one more example of counterfactual-based thinking. It’s not parallel to Monty Hall at first glance, but the same Bayesian logic underpins both of them.

I’ve spoken to two women who write interesting things but are hesitant to post them online because people tell them they suck and call them $&#@s. They asked me how I deal with being called a $&#@ online. My answer is that I realized that most of the time when someone calls me a $&#@ online I should update not that I may be a $&#@, but only that I’m getting more famous.

There are two kinds of people who can possibly respond to something I’ve written by telling me I’m a $&#@:

  1. People who read all my stuff and are usually positive, but in this case thought I was a $&#@.

  2. People who go around calling others $&#@s online, and in this case just happened to click on Putanumonit.

In the first case, the counterfactual to being called a $&#@ is getting a compliment, which should make me think that I $&#@ed up with that particular essay. When I get negative comments from regular readers, I update.

In the second case, the counterfactual to me being called a $&#@ is simply someone else being called a $&#@. My update is simply that my essay has gotten widely shared, which is great.

For example, here’s a comment that starts off with “Fuck you Aspie scum”. The comment has nothing to do with my actual post; I’m pretty sure that its author just copy-pastes it on rationalist blogs that he happens to come across. Given this, I found the comment to be a positive sign of the expanding reach of Putanumonit. It did nothing to make me worry that I’m an Aspie scum who should be fucked.

I find this sort of Bayesian what’s-the-counterfactual thinking immensely valuable and widely applicable. I think that it’s a teachable skill that anyone can improve on with practice, even if it comes easier to some than to others.

Only pigeons are born master Bayesians.