Shut Up And Guess

Related to: Extreme Rationality: It’s Not That Great

A while back, I said provocatively that the rarefied sorts of rationality we study at Less Wrong hadn’t helped me in my everyday life and probably hadn’t helped you either. I got a lot of controversy but not a whole lot of good clear examples of getting some use out of rationality.

Today I can share one such example.

Consider a set of final examinations based around tests with the following characteristics:

* Each test has one hundred fifty true-or-false questions.
* The test is taken on a Scantron sheet which allows answers of “true”, “false”, and “don’t know”.
* Students get one point for each correct answer, zero points for each “don’t know”, and minus one half point for each incorrect answer.
* A score of >50% is “pass”, >60% is “honors”, >70% is “high honors”.
* The questions are correspondingly difficult, so that even a very intelligent student is not expected to get much above 70. All students are expected to encounter at least a few dozen questions which they can answer only with very low confidence, or which they can’t answer at all.

At what confidence level do you guess? At what confidence level do you answer “don’t know”?

I took several of these tests last month, and the first thing I did was some quick mental calculations. If I have zero knowledge of a question, answering gives me a 50% probability of earning one point and a 50% probability of losing one half point. My expected gain from answering is therefore 0.5(1) - 0.5(0.5) = +0.25 points, compared with an expected gain of zero from not answering the question at all. So I ought to guess on every question, even if I have zero knowledge. If I have some inkling, well, that’s even better.
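For anyone who wants to check the arithmetic, here is a minimal sketch of the same calculation for any confidence level, not just fifty-fifty. The function names are my own illustrative choices, and it assumes the scoring rule described above (one point for a correct answer, minus one half for an incorrect one, zero for “don’t know”).

```python
def expected_gain(p_correct, reward=1.0, penalty=0.5):
    """Expected points from answering when you are right with probability p_correct.

    Assumes the scoring rule from the post: +reward if right, -penalty if wrong.
    Answering "don't know" is always worth exactly zero.
    """
    return p_correct * reward - (1.0 - p_correct) * penalty


def break_even_confidence(reward=1.0, penalty=0.5):
    """Confidence at which answering and "don't know" have equal expected value."""
    return penalty / (reward + penalty)


if __name__ == "__main__":
    for p in (1 / 3, 0.5, 0.6, 0.8):
        print(f"confidence {p:.2f}: expected gain {expected_gain(p):+.3f}")
    print(f"break-even confidence: {break_even_confidence():.3f}")
```

Under this rule the break-even confidence is one third, so on a true-or-false question you would have to believe you were doing noticeably worse than a coin flip before “don’t know” became the better answer.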

You look disappointed. This isn’t a very exciting application of arcane Less Wrong knowledge. Anyone with basic math skills should be able to calculate that out, right?

I attend a pretty good university, and I’m in a postgraduate class where most of us have at least a bachelor’s degree in a hard science, and a few have master’s degrees. And yet, talking to my classmates in the cafeteria after the first test was finished, I started to realize I was the only person in the class who hadn’t answered “don’t know” to any questions.

I have several friends in the class who had helped me with difficult problems earlier in the year, so I figured the least I could do for them was to point out that they could get several free points on the exam by guessing instead of putting “don’t know”. I got a chance to talk to a few people between tests, and I explained the argument to them using exactly the calculation I gave above. My memory’s not perfect, but I think I tried it with about five friends.

Not one of them was convinced. I see that while I’ve been off studying and such, you’ve been talking about macros of absolute denial and such, and while I’m not sure I like the term, this almost felt like coming up against a macro of absolute denial.

I had people tell me there must be some flaw in my math. I had people tell me that math doesn’t always map to the real world. I had people tell me that no, I didn’t understand, they really didn’t have any idea of the answer to that one question. I had people tell me they were so baffled by the test that they expected to consistently get significantly more than fifty percent of the (true or false!) questions they guessed on wrong. I had people tell me that although yes, on average they would do better, there was always the possibility that by chance alone they would get all thirty of the questions they guessed on wrong and end up at a huge disadvantage [1].

I didn’t change a single person’s mind. The next test, my friends answered just as many “don’t know”s as the last one.

This floored me, because it’s not one of those problems about politics or religion where people have little incentive to act rationally. These tests were the main component of the yearly grade in a very high-pressure course. My friend who put down thirty “don’t know”s could easily have increased his grade in the class by about 5% by listening to me (thirty blind guesses are worth an expected 7.5 points, or 5% of a 150-question test), maybe even moved up a whole letter grade. Nope. Didn’t happen. So here’s my theory.

The basic mistake seems to be loss aversion [2], the tendency to regret losses more than one values gains. This could be compounded by students’ tendency to discuss answers after the test: I remember each time I heard that one of my guesses had been wrong and I’d lost points, it was a deep psychic blow. No doubt my classmates tended to remember the guesses they’d gotten wrong more than the ones they’d gotten right, leading to the otherwise inexplicable statement that they expected to get more than half of their guesses wrong. But this mistake should disappear once the correct math is explained. Why doesn’t it?

In The Terrible...Truth About Morality, Roko gives a good example of the way our emotional and rational minds interact. A person starts with an emotion (in that case, a feeling of disgust about incest) and only later comes up with some reason why that emotion is the objectively correct emotion to have and why their action of condemning the relationship is rationally justified.

My final exam, thanks to loss aversion, created an emotional inclination against guessing, which most of the students taking it followed. When confronted with an argument against it, my friends tried to come up with reasons why the course they took was logical—reasons which I found very unconvincing.

It’s really this last part which was so perfect I couldn’t resist posting about it. One of my close friends (let’s call him Larry) finally admitted, after much pestering on my part, that guessing would increase his score. But, he said, he still wasn’t going to guess, because he had a moral objection to doing so. Tests were supposed to measure how much we knew, not how lucky we were, and if he really didn’t know the answer, he wanted that ignorance to be reflected in his final score.

A few years ago, I would have respected that strong commitment to principle. Today, jaded as I am, I waited until the last day of exams, when our test had a slightly different format. Instead of being true-false, it was multiple-choice: choose one of eight. And there was no penalty for guessing; indeed, there wasn’t even a “don’t know” on the answer sheet, although you could still leave it blank if you really wanted.

“So,” I asked Larry afterwards, “did you guess on any of the questions?”

“Yeah, there were quite a few I didn’t know,” he answered.

When I reminded him about his moral commitment, he said something about how this was different because there were more answers available so it wasn’t really the same as guessing on a fifty-fifty question. At the risk of impugning my friend’s subconscious motives, I think he no longer had to use moral ideals to rationalize away his fear of losing points, so he did the smart thing and guessed.

Footnotes

1: If I understand the math right, then if you guess on thirty questions using my test’s scoring rule, the probability of ending up with a net penalty from guessing is just over two percent (I originally wrote “less than one percent”; thank you, ArthurB, for the correction). If, after finishing all the questions of which they were “certain”, a person felt confident that they were right over the cusp of a passing grade, assigned very high importance to passing, and assigned almost no importance to any increase in grade past the passing point, then it might be rational not to guess, to avoid that roughly two percent chance of failure. In reality, no one could calculate their grade out this precisely.
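The figure in footnote 1 can be checked with a short sketch like the one below. It assumes each of the thirty guesses is an independent coin flip (zero knowledge), and the function names are my own, chosen for illustration. With thirty guesses the net score is negative only when nine or fewer are correct, so the code sums the binomial probabilities of exactly those outcomes.

```python
from math import comb


def net_score(correct, guesses, reward=1.0, penalty=0.5):
    """Net points from a batch of guesses under the post's scoring rule."""
    return correct * reward - (guesses - correct) * penalty


def prob_net_penalty(guesses=30, p_correct=0.5, reward=1.0, penalty=0.5):
    """Probability that guessing leaves you strictly worse off than abstaining.

    Assumes every guess is independent and correct with probability p_correct.
    """
    total = 0.0
    for k in range(guesses + 1):
        if net_score(k, guesses, reward, penalty) < 0:
            # Binomial probability of exactly k correct guesses.
            total += comb(guesses, k) * p_correct**k * (1 - p_correct) ** (guesses - k)
    return total


if __name__ == "__main__":
    print(f"P(net penalty from 30 blind guesses) = {prob_net_penalty():.4f}")
```

Under these assumptions the result is about 0.021, which agrees with the corrected “just over two percent” figure. Any real inkling about the answers (a `p_correct` above 0.5) pushes the probability lower still.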

2: Looking to see if anyone else had been thinking along the same lines [3], I found a very interesting paper describing some work of Kahneman and Tversky on this issue, and proposing a scoring rule that takes loss aversion into account. Although I didn’t go through all of the math, the most interesting number in there seems to be that on a true/false test that penalizes wrong answers at the same rate it rewards correct answers (unlike my test, which rewarded guessing), a person with the empirically determined level of human loss aversion will (if I understand the stats right) need to be ~79% sure before choosing to answer (as opposed to the utility-maximizing level of >50%). This also linked me to prospect theory, which is interesting.

3: I’m surprised that test-preparation companies haven’t picked up on this. Training people to understand calibration and loss aversion could be very helpful on standardized tests like the SATs. I’ve never taken a Kaplan or Princeton Review course, but those who have tell me this topic isn’t covered. I’d be surprised if the people involved didn’t know the science, so maybe they just don’t know of a reliable way to teach such things?