Overconfident Pessimism

You can build a machine to draw [deductive] conclusions for you, but I think you can never build a machine that will draw [probabilistic] inferences.

George Polya, 34 years before Pearl (1988) launched the probabilistic revolution in AI

The energy produced by the breaking down of the atom is a very poor kind of thing. Anyone who expects a source of power from the transformation of these atoms is talking moonshine.

Ernest Rutherford in 1933, 18 years before the first nuclear reactor went online

I confess that in 1901 I said to my brother Orville that man would not fly for fifty years. Two years later we ourselves made flights. This demonstration of my impotence as a prophet gave me such a shock that ever since I have distrusted myself...

Wilbur Wright, in a 1908 speech

Startling insights are hard to predict.1 Polya and Rutherford couldn’t have predicted when computational probabilistic reasoning and nuclear power would arrive. Their training in scientific skepticism probably prevented them from making confident predictions about what would be developed in the next few decades.

What’s odd, then, is that their scientific skepticism didn’t prevent them from making confident predictions about what wouldn’t be developed in the next few decades.

I am blessed to occasionally chat with some of the smartest scientists in the world, especially in computer science. They generally don’t make confident predictions that certain specific, difficult, insight-based technologies will be developed soon. And yet, immediately after agreeing with me that “the future is very hard to predict,” they will confidently state that a specific, difficult technology is more than 50 years away!

Error. Does not compute.

What’s going on here?

I don’t think it’s always a case of motivated skepticism. I don’t think Wilbur Wright was motivated to think flight was a long way off. I think he was “zoomed in” on the difficulty of the problem, didn’t see a way to solve it, and misinterpreted his lack of knowledge about the difficulty of flight as positive information that flight was extremely difficult and far away.

As Eliezer wrote:

When heavier-than-air flight or atomic energy was a hundred years off, it looked fifty years off or impossible; when it was five years off, it still looked fifty years off or impossible. Poor information.

(Of course, we can predict some technological advances better than others: “Five years before the first moon landing, it looked a few years off but certainly not a hundred years off.”)

There may also be a psychological double standard for “positive” and “negative” predictions. Skepticism about confident positive predictions — say, that AI will be invented soon — feels like the virtuous doubt of standard scientific training. But oddly enough, making confident negative predictions — say, that AI will not be invented soon — also feels like virtuous doubt, merely because the first prediction was phrased positively and the second was phrased negatively.

There’s probably some Near-Far stuff going on, too. Nuclear fusion and AI feel abstract and unknown, and thus they also feel distant. But when you’re ignorant about a phenomenon, the correct response is to broaden your confidence intervals in both directions, not push them in one direction like the Near-Far effect wants you to.

The scientists I speak to are right to say that it’s very hard to predict the development of specific technologies. But one cannot “simultaneously claim to know little about the future and to be able to set strong lower bounds on technology development times,” on pain of contradiction.

Depending on the other predictions these scientists have made, they might be3 manifesting a form of overconfidence I’ll call “overconfident pessimism.” It’s well-known that humans are overconfident, but since overconfident pessimism seems to be less discussed than overconfident optimism, I think it’s worth giving it its own name.

What can we do to combat overconfident pessimism in ourselves?

The most broadly useful debiasing technique is to “consider the opposite” (Larrick 2004):

The strategy consists of nothing more than asking oneself, “What are some reasons that my initial judgment might be wrong?” The strategy is effective because it directly counteracts the basic problem of association-based processes — an overly narrow sample of evidence – by expanding the sample and making it more representative...

Or, consider this variant of “consider the opposite”:

Typically, subjective range estimates exhibit high overconfidence. Ranges for which people are 80 percent confident capture the truth 30 percent to 40 percent of the time. Soll and Klayman (2004) showed that having judges generate 10th and 90th percentile estimates in separate stages – which forces them to consider distinct reasons for low and high values – increased hit rates to nearly 60 percent by both widening and centering ranges.2
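
To see what such a hit rate looks like in practice, here is a toy scorer for 80% ranges (10th to 90th percentile estimates), written as a Python sketch; the ranges and “true values” below are invented for illustration, and a range counts as a hit when the truth falls inside it.

```python
# Invented (10th percentile, 90th percentile, true value) triples for illustration.
ranges = [
    (10, 50, 72),
    (100, 400, 250),
    (5, 9, 6),
    (1000, 2000, 2500),
    (0.2, 0.8, 0.5),
]

# A range "hits" when the true value falls between the 10th and 90th percentile estimates.
hits = sum(1 for low, high, truth in ranges if low <= truth <= high)
print(f"{hits}/{len(ranges)} ranges captured the truth ({hits / len(ranges):.0%}); "
      "well-calibrated 80% ranges should capture it about 80% of the time.")
```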

Another standard method for reducing overconfidence and improving one’s accuracy in general is calibration training (Lichtenstein et al. 1982; Hubbard 2007).

The calibration training process is pretty straightforward: Write down your predictions, then check whether they came true. Be sure to also state your confidence in each prediction. If you’re perfectly calibrated, then predictions you made with 60% confidence should be correct 60% of the time, while predictions you made with 90% confidence should be correct 90% of the time.
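
To make the bookkeeping concrete, here is a minimal Python sketch of checking calibration from such a record; the prediction log below is invented, with each entry pairing a stated confidence with whether the prediction came true.

```python
from collections import defaultdict

# Hypothetical prediction log: (stated confidence, did the prediction come true?).
# These entries are invented for illustration.
predictions = [
    (0.6, True), (0.6, False), (0.6, True), (0.6, True), (0.6, False),
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
]

def calibration_report(predictions):
    """Group predictions by stated confidence and compare it to the observed hit rate."""
    buckets = defaultdict(list)
    for confidence, came_true in predictions:
        buckets[confidence].append(came_true)
    for confidence in sorted(buckets):
        outcomes = buckets[confidence]
        hit_rate = sum(outcomes) / len(outcomes)
        print(f"Stated {confidence:.0%} confidence: "
              f"{hit_rate:.0%} correct across {len(outcomes)} predictions")

calibration_report(predictions)
```

In this invented log, the 60% predictions are well calibrated (3 of 5 came true), while the 90% predictions are mildly overconfident (only 4 of 5 came true).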

You will not be perfectly calibrated. But you can become better calibrated over time with many rounds of feedback. That’s why weather forecasters are so much better calibrated than most other kinds of experts (Murphy & Winkler 1984): every week, they learn whether their predictions were correct. It’s harder to improve your calibration when you have to wait 5 or 30 years to see whether your predictions (say, about technological development) were correct, but calibration training in any domain seems to reduce overconfidence in general, since you get to viscerally experience how often you are wrong — even on phenomena that should be easier to predict than long-term technological development.

Perhaps the best online tool for calibration training is PredictionBook.com. For a story of one person becoming better calibrated using PredictionBook.com, see 1001 PredictionBook Nights. Another tool is the Calibration Game, available for Mac, Windows, iOS, and Android.

To counteract overconfident pessimism in particular, be sure to record lots of negative predictions, not just positive predictions.

Finally, it may help to read lists of failed negative predictions. Here you go: one, two, three, four, five, six.

Notes

1 Armstrong & Sotala (2012) helpfully distinguish “insight” and “grind”:

Project managers and various leaders are often quite good at estimating the length of projects… Publication dates for video games, for instance, though often over-optimistic, are generally not ridiculously erroneous – even though video games involve a lot of creative design, play-testing, art, programming the game “AI”, etc. … Moore’s law could be taken as an ultimate example of grind: we expect the global efforts of many engineers across many fields to average out to a rather predictable exponential growth.

Predicting insight, on the other hand, seems a much more daunting task. Take the Riemann hypothesis, a well-established mathematical hypothesis from 1859. How would one go about estimating how long it would take to solve? How about the P = NP hypothesis in computing? Mathematicians seldom try and predict when major problems will be solved, because they recognise that insight is very hard to predict.

2 See also Speirs-Bridge et al. (2009).

3 The original version of this post incorrectly accused the scientists I’ve spoken with of overconfidence, but I can’t rightly draw that conclusion without knowing the outcomes of their other predictions.