The Track Record of Futurists Seems … Fine

HoldenKarnofsky 30 Jun 2022 19:40 UTC

105 points

Forecasting & Prediction World Modeling Futurism

I’ve argued that the development of advanced AI could make this the most important century for humanity. A common reaction to this idea is one laid out by Tyler Cowen here: “how good were past thinkers at predicting the future? Don’t just select on those who are famous because they got some big things right.”

This is a common reason people give for being skeptical about the most important century—and, often, for skepticism about pretty much any attempt at futurism (trying to predict key events in the world a long time from now) or steering (trying to help the world navigate such key future events).

The idea is something like: “Even if we can’t identify a particular weakness in arguments about key future events, perhaps we should be skeptical of our own ability to say anything meaningful at all about the long-run future. Hence, perhaps we should forget about theories of the future and focus on reducing suffering today, generally increasing humanity’s capabilities, etc.”

But are people generally bad at predicting future events? Including thoughtful people who are trying reasonably hard to be right? If we look back at prominent futurists’ predictions, what’s the actual track record? How bad is the situation?

I’ve looked pretty far and wide for systematic answers to this question, and Open Philanthropy’s¹ Luke Muehlhauser has put a fair amount of effort into researching it; I discuss what we’ve found in an appendix. So far, we haven’t turned up a whole lot—the main observation is that it’s hard to judge the track record of futurists. (Luke discusses the difficulties here.)

Recently, I worked with Gavin Leech and Misha Yagudin at Arb Research to take another crack at this. I tried to keep things simpler than with past attempts—to look at a few past futurists who (a) had predicted things “kind of like” advances in AI (rather than e.g. predicting trends in world population); (b) probably were reasonably thoughtful about it; but (c) are very clearly not “just selected on those who are famous because they got things right.” So, I asked Arb to look at predictions made by the “Big Three” science fiction writers of the mid-20th century: Isaac Asimov, Arthur C. Clarke, and Robert Heinlein.

These are people who thought a lot about science and the future, and made lots of predictions about future technologies—but they’re famous for how entertaining their fiction was at the time, not how good their nonfiction predictions look in hindsight. I selected them by vaguely remembering that “the Big Three of science fiction” is a thing people say sometimes, googling it, and going with who came up—no hunting around for lots of sci-fi authors and picking the best or worst.²

So I think their track record should give us a decent sense for “what to expect from people who are not professional, specialized or notably lucky forecasters but are just giving it a reasonably thoughtful try.” As I’ll discuss below, I think this is in many ways “unfair” as a comparison to today’s forecasts about AI: I think these predictions are much less serious, less carefully considered and involve less work (especially work weighing different people and arguments against each other).

But my takeaway is that their track record looks … fine! They made lots of pretty detailed, nonobvious-seeming predictions about the long-run future (30+, often 50+ years out); results ranged from “very impressive” (Asimov got about half of his right, with very nonobvious-seeming predictions) to “bad” (Heinlein was closer to 35%, and his hits don’t seem very good) to “somewhere in between” (Clarke had a similar hit rate to Asimov, but his correct predictions don’t seem as impressive). There are a number of seemingly impressive predictions and seemingly embarrassing ones.

(How do we determine what level of accuracy would be “fine” vs. “bad?” Unfortunately there’s no clear quantitative benchmark—I think we just have to look at the predictions ourselves, how hard they seemed / how similar to today’s predictions about AI, and make a judgment call. I could easily imagine others having a different interpretation than mine, which is why I give examples and link to the full prediction sets. I talk about this a bit more below.)

They weren’t infallible oracles, but they weren’t blindly casting about either. (Well, maybe Heinlein was.) Collectively, I think you could call them “mediocre,” but you can’t call them “hopeless” or “clueless” or “a warning sign to all who dare predict the long-run future.” Overall, I think they did about as well as you might naively³ guess a reasonably thoughtful person would do at some random thing they tried to do?

Below, I’ll:

Summarize the track records of Asimov, Clarke and Heinlein, while linking to Arb’s full report.
Comment on why I think key predictions about transformative AI are probably better bets than the Asimov/Clarke/Heinlein predictions—although ultimately, if they’re merely “equally good bets,” I think that’s enough to support my case that we should be paying a lot more attention to the “most important century” hypothesis.
Summarize other existing research on the track record of futurists, which I think is broadly consistent with this take (though mostly ambiguous).

For this investigation, Arb very quickly (in about 8 weeks) dug through many old sources, used pattern-matching and manual effort to find predictions, and worked with contractors to score the hundreds of predictions they found. Big thanks to them! Their full report is here. Note this bit: “If you spot something off, we’ll pay $5 per cell we update as a result. We’ll add all criticisms – where we agree and update or reject it – to this document for transparency.”

The track records of the “Big Three”

Quick summary of how Arb created the data set

Arb collected “digital copies of as much of their [Asimov’s, Clarke’s, Heinlein’s] nonfiction as possible (books, essays, interviews). The resulting intake is 475 files covering ~33% of their nonfiction corpuses.”

Arb then used pattern-matching and manual inspection to pull out all of the predictions it could find, and scored these predictions by:

How many years away the prediction appeared to be. (Most did not have clear dates attached; in these cases Arb generally filled in the average time horizon for predictions from the same author that did have clear dates attached.)
Whether the prediction now appears correct, incorrect, or ambiguous. (I didn’t always agree with these scorings, but I generally have felt that “correct” predictions at least look “impressive and not silly” while “incorrect” predictions at least look “dicey.”)
Whether the prediction was a pure prediction about what technology could do (most relevant), a prediction about the interaction of technology and the economy (medium), or a prediction about the interaction of technology and culture (least relevant). Predictions with no bearing on technology were dropped.
How “difficult” the prediction was (that is, how much the scorers guessed it diverged from conventional wisdom or “the obvious” at the time—details in footnote⁴).
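
To make the scoring scheme above concrete, here is a minimal sketch of how a scored prediction and the time-horizon fill-in could be represented. This is not Arb's actual pipeline: the `Prediction` record, its field names, and `fill_missing_horizons` are all hypothetical names chosen for illustration, and the only rule taken from the description above is "use the same author's average dated horizon when no clear date is attached."

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    # All field names are hypothetical; they mirror the four scoring criteria.
    author: str
    text: str
    horizon_years: Optional[float]  # None when no clear date was attached
    outcome: str                    # "correct", "incorrect", or "ambiguous"
    category: str                   # "tech", "economy", or "culture"
    difficulty: int                 # 1 (already known) to 5 (prescient)
    horizon_imputed: bool = False   # True if the horizon was filled in


def fill_missing_horizons(predictions):
    """Fill undated predictions with the same author's mean dated horizon."""
    dated = {}
    for p in predictions:
        if p.horizon_years is not None:
            dated.setdefault(p.author, []).append(p.horizon_years)
    averages = {author: mean(horizons) for author, horizons in dated.items()}

    filled = []
    for p in predictions:
        # Authors with no dated predictions at all are left unfilled.
        if p.horizon_years is None and p.author in averages:
            p = replace(p, horizon_years=averages[p.author], horizon_imputed=True)
        filled.append(p)
    return filled
```

Keeping a `horizon_imputed` flag is a design choice of this sketch: it lets later cuts separate predictions with a definite date from those whose horizon was filled in.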

Importantly, fiction was never used as a source of predictions, so this exercise is explicitly scoring people on what they were not famous for. This is more like an assessment of “whether people who like thinking about the future make good predictions” than an assessment of “whether professional or specialized forecasters make good predictions.”

For reasons I touch on in an appendix below, I didn’t ask Arb to try to identify how confident the Big Three were about their predictions. I’m more interested in whether their predictions were nonobvious and sometimes correct than in whether they were self-aware about their own uncertainty; I see these as different issues, and I suspect that past norms discouraged the latter more than today’s norms do (at least within communities interested in Bayesian mindset and the science of forecasting).

More detail in Arb’s report.

The numbers

The tables below summarize the numbers I think give the best high-level picture. See the full report and detailed files for the raw predictions and a number of other cuts; there are a lot of ways you can slice the data, but I don’t think it changes the picture from what I give below.

Below, I present each predictor’s track record on:

“All predictions”: all resolved predictions 30 years out or more,⁵ including predictions where Arb had to fill in a time horizon.
“Tech predictions”: like the above, but restricted to predictions specifically about technological capabilities (as opposed to technology/economy interactions or technology/culture interactions).
“Difficult predictions”: predictions with “difficulty” of ⁴⁄₅ or ⁵⁄₅.
“Difficult + tech + definite date”: the small set of predictions that met the strictest criteria (tech only, “hardness” ⁴⁄₅ or ⁵⁄₅, definite date attached).
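
The four cuts above amount to simple filters over the scored predictions. The following is a rough sketch under stated assumptions, not Arb's code: the `Scored` record and its field names are hypothetical, "resolved" is assumed to mean "scored correct or incorrect" (ambiguous entries are excluded), and the "difficult" cut is assumed to apply across all categories.

```python
from typing import NamedTuple


class Scored(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    horizon_years: float
    outcome: str         # "correct", "incorrect", or "ambiguous"
    category: str        # "tech", "economy", or "culture"
    difficulty: int      # 1 to 5
    definite_date: bool  # False if the horizon had to be filled in


def is_resolved_long_range(p, min_years=30):
    """Assumed meaning of 'resolved predictions 30 years out or more'."""
    return p.outcome in ("correct", "incorrect") and p.horizon_years >= min_years


# One filter per cut described in the list above.
CUTS = {
    "all": lambda p: True,
    "tech": lambda p: p.category == "tech",
    "difficult": lambda p: p.difficulty >= 4,
    "difficult+tech+definite": lambda p: (
        p.category == "tech" and p.difficulty >= 4 and p.definite_date
    ),
}


def hit_rates(predictions):
    """Return the share of correct predictions per cut (None if a cut is empty)."""
    base = [p for p in predictions if is_resolved_long_range(p)]
    rates = {}
    for name, keep in CUTS.items():
        subset = [p for p in base if keep(p)]
        hits = sum(p.outcome == "correct" for p in subset)
        rates[name] = hits / len(subset) if subset else None
    return rates
```

Calling `hit_rates` on one author's predictions would yield one number per cut; the strictest cut will usually rest on very few predictions, so its rate is the noisiest.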

Asimov

You can see the full set of predictions here, but to give a flavor, here are two “correct” and two “incorrect” predictions from the strictest category.⁶ All of these are predictions Asimov made in 1964, about the year 2014 (unless otherwise indicated).

Correct: “only unmanned ships will have landed on Mars, though a manned expedition will be in the works.” Bingo, and impressive IMO.
Correct: “the screen [of a phone] can be used not only to see the people you call but also for studying documents and photographs and reading passages from books.” I feel like this would’ve been an impressive prediction in 2004.
Incorrect: “there will be increasing emphasis on transportation that makes the least possible contact with the surface. There will be aircraft, of course, but even ground travel will increasingly take to the air a foot or two off the ground.” So false that we now refer to things that don’t hover as “hoverboards.”
Incorrect: “transparent cubes will be making their appearance in which three-dimensional viewing will be possible. In fact, one popular exhibit at the 2014 World’s Fair will be such a 3-D TV, built life-size, in which ballet performances will be seen. The cube will slowly revolve for viewing from all angles.” Doesn’t seem ridiculous, but doesn’t seem right. Of course, a side point here is that he refers to the 2014 World’s Fair, which didn’t happen.

A general challenge with assessing prediction track records is that we don’t know what to compare someone’s track record to. Is getting about half your predictions right “good,” or is it no more impressive than writing down a bunch of things that might happen and flipping a coin on each?

I think this comes down to how difficult the predictions are, which is hard to assess systematically. A nice thing about this study is that there are enough predictions to get a decent sample size, but the whole thing is contained enough that you can get a good qualitative feel for the predictions themselves. (This is why I give examples; you can also view all predictions for a given person by clicking on their name above the table.) In this case, I think Asimov tends to make nonobvious, detailed predictions, such that I consider it impressive to have gotten ~half of them to be right.
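
One way to make the "compared to what?" question concrete is to ask how surprising a given hit count would be if each prediction had some fixed chance of coming true by default. The sketch below computes a binomial tail probability. The counts and base rates in the usage note are made up for illustration and are not taken from Arb's data, and treating predictions as independent with a shared base rate is a simplifying assumption.

```python
from math import comb


def prob_at_least(hits, n, base_rate):
    """P(X >= hits) for X ~ Binomial(n, base_rate).

    base_rate is a hypothetical per-prediction chance of coming true
    for someone with no special insight.
    """
    return sum(
        comb(n, k) * base_rate**k * (1 - base_rate) ** (n - k)
        for k in range(hits, n + 1)
    )
```

For example, with 20 hypothetical predictions, getting at least 10 right is unremarkable if each was a coin flip (`prob_at_least(10, 20, 0.5)` is roughly 0.59), but would be very unlikely by luck alone if each had only a 20% chance of coming true (`prob_at_least(10, 20, 0.2)` is well under 1%). The same hit rate can therefore be mediocre or impressive depending on how nonobvious the predictions were, which is the part that needs a judgment call.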

Clarke

Examples (as above):⁷

Correct 1964 prediction about 2000: “[Communications satellites] will make possible a world in which we can make instant contact with each other wherever we may be. Where we can contact our friends anywhere on Earth, even if we don’t know their actual physical location. It will be possible in that age, perhaps only fifty years from now, for a [person] to conduct [their] business from Tahiti or Bali just as well as [they] could from London.” (I assume that “conduct [their] business” refers to a business call rather than some sort of holistic claim that no productivity would be lost from remote work.)
Correct 1950 prediction about 2000: “Indeed, it may be assumed as fairly certain that the first reconnaissances of the planets will be by orbiting rockets which do not attempt a landing-perhaps expendable, unmanned machines with elaborate telemetering and television equipment.” This doesn’t seem like a super-bold prediction; a lot of his correct predictions have a general flavor of saying progress won’t be too exciting, and I find these less impressive than most of Asimov’s correct predictions.
Incorrect 1960 prediction about 2010: “One can imagine, perhaps before the end of this century, huge general-purpose factories using cheap power from thermonuclear reactors to extract pure water, salt, magnesium, bromine, strontium, rubidium, copper and many other metals from the sea. A notable exception from the list would be iron, which is far rarer in the oceans than under the continents.”
Incorrect 1949 prediction about 1983: “Before this story is twice its present age, we will have robot explorers dotted all over Mars.”

I generally found this data set less satisfying/educational than Asimov’s: a lot of the predictions were pretty deep in the weeds of how rocketry might work or something, and a lot of them seemed pretty hard to interpret/score. I thought the bad predictions were pretty bad, and the good predictions were sometimes good but generally less impressive than Asimov’s.

Heinlein

This seems really bad, especially adjusted for difficulty: many of the “correct” ones seem either hard-to-interpret or just very obvious (e.g., no time travel). I was impressed by his prediction that “we probably will still be after a cure for the common cold” until I saw a prediction in a separate source saying “Cancer, the common cold, and tooth decay will all be conquered.” Overall it seems like he did a lot of predicting outlandish stuff about space travel, and then anti-predicting things that are probably just impossible (e.g., no time travel).

He did have some decent ones, though, such as: “By 2000 A.D. we will know a great deal about how the brain functions … whereas in 1900 what little we knew was wrong. I do not predict that the basic mystery of psychology—how mass arranged in certain complex patterns becomes aware of itself—will be solved by 2000 A.D. I hope so but do not expect it.” He also predicted no human extinction and no end to war—I’d guess a lot of people disagreed with these at the time.

Overall picture

Looks like, of the “Big Three,” we have:

One (Asimov) who looks quite impressive—plenty of misses, but a 50% hit rate on such nonobvious predictions seems pretty great.
One (Heinlein) who looks pretty unserious and inaccurate.
One (Clarke) who’s a bit hard to judge but seems pretty solid overall (around half of his predictions look to be right, and they tend to be pretty nonobvious).

Today’s futurism vs. these predictions

The above is a collection of casual predictions (no probabilities given, little-to-no reasoning given, no apparent attempt to collect evidence and weigh arguments) by professional fiction writers.

Contrast this situation with my summary of the different lines of reasoning forecasting transformative AI. The latter includes:

Systematic surveys aggregating opinions from hundreds of AI researchers.
Reports that Open Philanthropy employees spent thousands of hours on, systematically presenting evidence and considering arguments and counterarguments.
A serious attempt to take advantage of the nascent literature on how to make good predictions; e.g., the authors (and I) have generally done calibration training,⁸ and have tried to use the language of probability to be specific about our uncertainty.

There’s plenty of room for debate on how much these measures should be expected to improve our foresight, compared to what the “Big Three” were doing. My guess is that we should take forecasts about transformative AI a lot more seriously, partly because I think there’s a big difference between putting in “extremely little effort” (basically guessing off the cuff without spending serious time examining arguments and counter-arguments, which is my impression of what the Big Three were mostly doing) and putting in “moderate effort” (considering expert opinion, surveying arguments and counter-arguments, explicitly thinking about one’s degree of uncertainty).

But the “extremely little effort” version doesn’t really look that bad.

If you look at forecasts about transformative AI and think “Maybe these are Asimov-ish predictions that have about a 50% hit rate on hard questions; maybe these are Heinlein-ish predictions that are basically crap,” that still seems good enough to take the “most important century” hypothesis seriously.

Appendix: other studies of the track record of futurism

A 2013 project assessed Ray Kurzweil’s 1999 predictions about 2009, and a 2020 followup assessed his 1999 predictions about 2019. Kurzweil is known for being interesting at the time rather than being right with hindsight, and a large number of predictions were found and scored, so I consider this study to have similar advantages to the above study.

The first set of predictions (about 2009, 10-year horizon) had about as many “true or weakly true” predictions as “false or weakly false” predictions.
The second (about 2019, 20-year horizon) was much worse, with 52% of predictions flatly “false,” and “false or weakly false” predictions outnumbering “true or weakly true” predictions by almost 3-to-1.

Kurzweil is notorious for his very bold and contrarian predictions, and I’m overall inclined to call his track record something between “mediocre” and “fine”—too aggressive overall, but with some notable hits. (I think if the most important century hypothesis ends up true, he’ll broadly look pretty prescient, just on the early side; if it doesn’t, he’ll broadly look quite off base. But that’s TBD.)

A 2002 paper, summarized by Luke Muehlhauser here, assessed the track record of The Year 2000 by Herman Kahn and Anthony Wiener, “one of the most famous and respected products of professional futurism.”

About 45% of the forecasts were judged as accurate.
Luke concludes that Kahn and Wiener were grossly overconfident, because he interprets them as making predictions with 90-95% confidence.
My takeaway is a bit different. I see a recurring theme that people often get 40-50% hit rates on interesting predictions about the future, but sometimes present these predictions with great confidence (which makes them look foolish).
I think we can separate “Past forecasters were overconfident” (which I suspect is partly due to clear expression and quantification of uncertainty being uncommon and/or discouraged in relevant contexts) from “Past forecasters weren’t able to make interesting predictions that were reasonably likely to be right.” The former seems true to me, but the latter doesn’t.
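
The distinction between "overconfident" and "unable to make interesting predictions that are often right" can be illustrated with a Brier score, a standard measure of probabilistic forecast error (lower is better). This is a minimal sketch that assumes every prediction is stated with the same confidence. The 45% hit rate echoes the figure above, while the 95% confidence is only a stand-in for the 90-95% range Luke attributes to Kahn and Wiener.

```python
def expected_brier(hit_rate, stated_confidence):
    """Average Brier score when every prediction is given the same confidence.

    A correct prediction costs (1 - confidence)^2 and an incorrect one
    costs confidence^2; the two are weighted by how often each occurs.
    """
    return (
        hit_rate * (1 - stated_confidence) ** 2
        + (1 - hit_rate) * stated_confidence**2
    )
```

With the same 45% hit rate, `expected_brier(0.45, 0.95)` comes to about 0.4975, while `expected_brier(0.45, 0.45)` comes to about 0.2475. The predictions and the hits are identical in both cases; only the stated confidence differs, which is why I treat overconfidence as a separate issue from whether the predictions themselves were any good.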

Luke’s 2019 survey on the track record of futurism identifies two other relevant papers (here and here); I haven’t read these beyond the abstracts, but their overall accuracy rates were 76% and 37%, respectively. It’s difficult to interpret those numbers without having a feel for how challenging the predictions were.

A 2021 EA Forum post looks at the aggregate track record of forecasters on PredictionBook and Metaculus, including specific analysis of forecasts 5+ years out, though I don’t find it easy to draw conclusions about whether the performance was “good” or “bad” (or how similar the questions were to the ones I care about).


Footnotes

¹ Disclosure: I’m co-CEO of Open Philanthropy.
² I also briefly Googled for their predictions to get a preliminary sense of whether they were the kinds of predictions that seemed relevant. I found a couple of articles listing a few examples of good and bad predictions, but nothing systematic. I claim I haven’t done a similar exercise with anyone else and thrown it out.
³ That is, if we didn’t have a lot of memes in the background about how hard it is to predict the future.
⁴ Difficulty scale:
1 - was already generally known
2 - was expert consensus
3 - speculative but on trend
4 - above trend, or oddly detailed
5 - prescient, no trend to go off
⁵ Very few predictions in the data set are for less than 30 years, and I just ignored them.
⁶ Asimov actually only had one incorrect prediction in this category, so for the 2nd incorrect prediction I used one with difficulty “3” instead of “4.”
⁷ The first prediction in this list qualified for the strictest criteria when I first drafted this post, but it’s now been rescored to difficulty=3/5, which I disagree with (I think it is an impressive prediction, more so than any of the remaining ones that qualify as difficulty=4/5).
⁸ Also see this report on calibration for Open Philanthropy grant investigators (though this is a different set of people from the people who researched transformative AI timelines).



simon 1 Jul 2022 6:53 UTC
24 points
2
There’s a lot of room for debate on the correctness of the resolutions of these predictions:
e.g. Heinlein in 1949:
“Space travel we will have, not fifty years from now, but much sooner. It’s breathing down our necks.”
This is marked as incorrect, due to the marker assuming that this meant mass space travel, but I wouldn’t interpret this as mass space travel unless there’s some relevant context I’m missing here—keep in mind that this was from 1949, 8 years before Sputnik.^[1]
On the other hand:
“All aircraft will be controlled by a giant radar net run on a continent-wide basis by a multiple electronic ‘brain.’”
This is marked as correct, apparently due to autopilot and the “USAF Airborne Command Post”? But I would interpret it as active control of the planes by a centralized computer and mark it as incorrect.^[2]
Edited to add: there were a bunch I could have mentioned, but I want to remark on this one, where my interpretation was especially different from the marker’s:
“Interplanetary travel is waiting at your front door — C.O.D. It’s yours when you pay for it.”

This is also from 1949. The marker interprets this as a prediction of “Commercial interplanetary travel”. I see it rather as a conditional prediction of interplanetary travel (not necessarily commercial), given the willingness to fund it, i.e. a prediction that the necessary technology would be available but not necessarily that it would be funded. If this is the right interpretation, it seems correct to me. Again, I could be completely wrong depending on the context. ^[3]
1. ^
  Edited to add: I realized I actually have a copy of Heinlein’s “Expanded Universe” which includes “Where To?” and followup 1965 and 1980 comments. In context, this statement comes right in the middle of a discussion of hospitals for old people on the moon, which considerably shifts the interpretation towards it being intended to refer to mass space travel, though if Heinlein were still here he could argue it literally meant any space travel.
2. ^
  In context, it’s not 100% clear that he meant a single computer, though I still think so. But he definitely meant full automation outside of emergency or unusual situations; from his 1980 followup: “But that totally automated traffic control system ought to be built. … all routine (99.9%+) takeoffs and landings should be made by computer.”
3. ^
  And now seeing the context, I stand by this interpretation: It’s a standalone comment from the original, but Heinlein’s 1965 followup includes “and now we are paying for it and the cost is high”, confirming that government space travel counted in his view...but, given that he did assert we were paying for it, and interplanetary space travel has not occurred (I interpret the prediction as meaning human space travel), this actually might cut against counting this as a correct prediction.
- technicalities 1 Jul 2022 18:24 UTC
  12 points
  6
  Parent
  Data collector here. Strongly agree with your general point: most of these entries are extremely far from modern “clairvoyant” (cleanly resolving) forecasting questions.
  
  Space travel. Disagree. In context he means mass space travel. The relevant lead-up is this:
  “According to her, the Moon is a great place and she wants us to come visit her.”
  “Not likely!” his wife answers. “Imagine being shut up in an air-conditioned cave.”
  “When you are Aunt Jane’s age, my honey lamb, and as frail as she is, with a bad heart thrown in, you’ll go to the Moon and like it.”
  Re: footnote 1. He was a dishonest bugger in his old age so I don’t doubt he would argue that.
  
  Central piloting. Yep, you’re right. We caught this before, but changed it in the wrong branch of the data. Going to make it ‘ambiguous’; let me know if that seems wrong.
  
  Commercial interplanetary travel. Disagree—“C.O.D.” is an old-timey word meaning something so normal and cheap that you don’t even need to pay for your ticket upfront—which implies that “you” is a consumer, not a government. (But again I see what you’re saying.)
  
  DM me for your bounty ($10)! I’ve linked to your comment in the changelog. Thanks!
  - simon 1 Jul 2022 19:59 UTC
    3 points
    0
    Parent
    “Central piloting. Yep, you’re right. We caught this before, but changed it in the wrong branch of the data. Going to make it ‘ambiguous’; let me know if that seems wrong.”
    I would call it a full miss myself.
    I still strongly disagree on the commercial interplanetary travel meaning.
    If “Cash on Delivery” has that old-timey meaning, it could push a bit to your interpretation, but not enough IMO.
    My reasoning:
    “Interplanetary travel is waiting at your front door —”
    Actual interplanetary travel, or say a trip on a spaceship, cannot literally be waiting at your front door. So clearly, a metaphorical meaning is intended.
    “C.O.D. It’s yours when you pay for it.”
    Here he extends the metaphor.
    But, in your view, that means it’s cheap. I disagree, if it was cheap he wouldn’t need to say “It’s yours when you pay for it”. Everything has to be paid for. If he meant it was cheap, he would just stop at C.O.D. and not say “It’s yours when you pay for it.”
    IMO, the “It’s yours when you pay for it” clearly means that he expected it to cost enough that it would be a significant barrier to progress (and the prediction is that it is in effect the only barrier to interplanetary travel). I do suspect though that he did intend the reader to pick up your connotation first, for the shock value, and the “It’s yours when you pay for it” is intended to shift the reader to the correct interpretation of what he means by C.O.D, i.e., it’s meant to be taken literally within the metaphorical context (and by Gricean implicature a large cost is meant) and not as an additional layer of metaphor.
    I suppose the 1965 comments could have been written to retroactively support an interpretation that would make the prediction correct, but I would bet most 1950 readers would have interpreted it as I did.
    Also, I note that John C. Wright agrees with my interpretation (in your link to support Heinlein being a “dishonest bugger”) (I didn’t notice anything in that link about him being a dishonest bugger, though—could you elaborate?). Wright also agrees with me on the central piloting prediction; looking briefly through Wright’s comments I didn’t see any interpretation of Wright’s that I disagreed with (I might quibble with some of Wright’s scoring, though probably mostly agree with that too). Unfortunately Wright doesn’t comment on whether he thinks Heinlein meant mass space travel as that was a side comment in the lunar retirement discussion and not presented specifically as a separated prediction in Heinlein’s original text.
Ben Pace 30 Jun 2022 21:46 UTC
21 points
24
Correct: “the screen [of a phone] can be used not only to see the people you call but also for studying documents and photographs and reading passages from books.” I feel like this would’ve been an impressive prediction in 2004.
This is an excellent prediction for 1964, and I respect Asimov a great deal for this.
johnswentworth 30 Jun 2022 23:13 UTC
20 points
0
Contrast this situation with my summary of the different lines of reasoning forecasting transformative AI. The latter includes:
- Systematic surveys aggregating opinions from hundreds of AI researchers.
- Reports that Open Philanthropy employees spent thousands of hours on, systematically presenting evidence and considering arguments and counterarguments.
- A serious attempt to take advantage of the nascent literature on how to make good predictions; e.g., the authors (and I) have generally done calibration training,⁸ and have tried to use the language of probability to be specific about our uncertainty.
There’s plenty of room for debate on how much these measures should be expected to improve our foresight, compared to what the “Big Three” were doing.
My guess would be these measures result in predictions somewhat worse than the Big Three. If you want a reference class for “more serious” forecasting, I’d say go look for forecasts by fancy consulting agencies or thinktanks. My guess would be that they do somewhat worse, mainly because their authors are optimizing to Look Respectable rather than just optimizing purely for accuracy. And the AI researcher surveys and OpenPhil reports also sure do look like they’re optimizing a significant amount for Looking Respectable.
- technicalities 1 Jul 2022 17:10 UTC
  6 points
  0
  Parent
  Is the point that 1) AGI specifically is too weird for normal forecasting to work, or 2) that you don’t trust judgmental forecasting in general, or 3) that respectability bias swamps the gains from aggregating a heavily selected crowd, spending more time, and debiasing in other ways?
  The OpenPhil longtermists’ respectability bias seems fairly small to me; their weirder stuff is comparable to Asimov (but not Clarke, who wrote a whole book about cryptids).
  And against this, you have to factor in the Big Three’s huge bias towards being entertaining instead of accurate (as well as e.g. Heinlein’s inability to admit error).
  Can you point at examples? (Bio anchors?)
  - johnswentworth 1 Jul 2022 18:34 UTC
    7 points
    −1
    Parent
    “Is the point that 1) AGI specifically is too weird for normal forecasting to work, or 2) that you don’t trust judgmental forecasting in general, or 3) that respectability bias swamps the gains from aggregating a heavily selected crowd, spending more time, and debiasing in other ways?”
    The third: respectability bias easily swamps the gains. (I’m not going to try to argue that case here, just give a couple examples of what such tradeoffs look like.)
    This is much more about the style of analysis/reasoning than about the topics; OpenPhil is certainly willing to explore weird topics.
    As an example, let’s look at the nanotech risk project you linked to. The very first thing in that write-up is:
    According to the definition set by the U.S. National Nanotechnology Initiative:
    Nanotechnology is...
    So right at the very beginning, we’re giving an explicit definition. That’s almost always an epistemically bad move. It makes the reasoning about “nanotech” seem more legible, but in actual fact the reasoning in the write-up was based on an intuitive notion of “nanotech”, not on this supposed definition. If the author actually wanted to rely on this definition, and not drag in intuitions about nanotech which don’t follow from the supposed definition, then the obvious thing to do would be to make up a new word—like “flgurgle”—and give “flgurgle” the definition. And then the whole report could talk about risks from flgurgle, and not have to worry about accidentally dragging in unjustified intuitions about “nanotech”.
    … of course that would be dumb, and not actually result in a good report, because using explicit definitions is usually a bad idea. Explicit definitions just don’t match the way the human brain actually uses words.
    But a definition does sound very Official and Respectable and Defendable. It’s even from an Official Government Source. Starting with a definition is a fine example of making a report more Respectable in a way which makes its epistemics worse.
    (The actual thing one should usually do instead of give an explicit definition is say “we’re trying to point to a vague cluster of stuff like <list of examples>”. And, in fairness, the definition used for nanotech in the report does do that to some extent; it does actually do a decent job avoiding the standard pitfalls of “definitions”. But the US National Nanotechnology Initiative’s definition is still, presumably, optimized more for academic politics than for accurately conveying the intuitive notion of “nanotech”.)
    The explanation of “Atomically Precise Manufacturing” two sections later is better, though it’s mostly just summarizing Drexler.
    Fast forward to the section on “Will it eventually be possible to develop APM?”. Most of the space in this section is spent summarizing two reports:
    The feasibility of atomically precise manufacturing has been reviewed in a report published by the US National Academy of Sciences (NAS). The NAS report was initiated in response to a Congressional request, and the result was included in the first triennial review of the U.S. National Nanotechnology Initiative. [...]
    and
    A Royal Society report was dismissive of the feasibility of ‘molecular manufacturing,’ [...]
    Ok, so here we have two reports which absolutely scream “academic politics” and are very obviously optimized for Respectability (Congressional request! Triennial review! Institutional acronyms (IA)!) rather than accuracy. Given that the author of the OpenPhil piece went looking for stuff like that, we can make some inferences about the relative prioritization of Respectability and accuracy for the person writing this report.
    So that’s two examples of Respectability/accuracy tradeoff (definitions and looking for Official Institutional Reports).
Chris-Lons 1 Jul 2022 12:04 UTC
11 points
10
Thanks for another thought-provoking post. This is quite timely for me, as I’ve been thinking a lot about the difference between the work of futurists and that of forecasters.
These are people who thought a lot about science and the future, and made lots of predictions about future technologies—but they’re famous for how entertaining their fiction was at the time, not how good their nonfiction predictions look in hindsight. I selected them by vaguely remembering that “the Big Three of science fiction” is a thing people say sometimes, googling it, and going with who came up—no hunting around for lots of sci-fi authors and picking the best or worst.
I think this is a clever way to try to avoid hindsight bias in selecting your futurists, but I think it’s at least plausible that only reasonably good futurists could rise to the status of “the Big Three of science fiction”. I’m assuming that the status is granted only several decades after the main corpus has been written and that reasonably good predictions (within the fiction) would help enormously in attaining it. On the other hand, imagine writers whose fiction became increasingly ridiculous as the future progressed because they did not make good predictions.^[1] Surely it would be very difficult for such authors to become part of the science fiction elite.
I’m not at all certain of this argument and would like to understand more about how cultural works move from “popular at release” to “classic” status.
At any rate, I think we should be at least moderately concerned that there could still be significant selection bias in the group being analyzed.
1. ^
  For example, I would put C.S. Lewis’ space trilogy in this category. They were good books and a forceful argument against the worst sorts of consequentialism, but imo, they were not great science fiction, primarily because the way he imagines space and life on other planets seems completely ridiculous now.
Liam Donovan 30 Jun 2022 21:42 UTC
10 points
2
Minor curiosity: What was the context behind Asimov predicting in 1990 that permanent space cities would be built within 10 years? It seems like a much wilder leap than any of his other predictions.
- technicalities 1 Jul 2022 19:06 UTC
  5 points
  1
  Parent
  Good catch! The book is generally written as the history of the world leading up to 2000, and most of its predictions are about that year. But this is clearly an exception and the section offers nothing more precise than “By the year 3000, then, it may well be that Earth will be only a small part of the human realm.” I’ve moved it to the “nonresolved” tab.
  DM me for your bounty ($10)! I added your comment to the changelog. Thanks!
Bezzi 1 Jul 2022 9:59 UTC
8 points
1
Asimov may not have been a professional forecaster, but he was still someone who had thought a lot about the future in the most realistic way possible (and he was invited quite often on TV to talk about it, if I remember correctly), especially considering that he also wrote a crazy amount of scientific nonfiction. Maybe he’s more famous as a science fiction author, but he was also a very well-known futurologist, not just some random smart guy who happened to make some predictions. I would be quite surprised to hear about anyone else from the 60s with a better futurology record than him.
That said, I am still quite convinced that the average smart person would still make terrible predictions about the long-term future. The best example I can offer is this, a rare set of illustrations printed in 1899 France to imagine what France would look like in the year 2000. Of course, the vast majority of these predictions were comically bad.
It is worth noting that we mainly know about these postcards because Asimov himself published a book about them in the 80s (this is not a coincidence because nothing is ever a coincidence).
- Lalartu2 19 Sep 2022 11:20 UTC
  0 points
  −4
  Parent
  I disagree that “France in the Year 2000” predictions were wrong. If judged by function rather than aesthetics they are more than half accurate.
Jiro 2 Jul 2022 3:42 UTC
4 points
1
Asimov’s laser beams for communication deserve to be a 1, assuming that 1 means ambiguous/near miss. Fiber optics are a thing, even if they don’t actually use lasers. 3D TV was a thing around 2014 as well, and probably deserves a 1, even if it’s not in cubes.
- technicalities 2 Jul 2022 12:54 UTC
  3 points
  3
  Parent
  From context I think he meant not fibre laser but “free-space optics”, a then-hyped application of lasers to replace radio. I get this from him mentioning it in the same sentence as satellites and then comparing lasers to radio: “A continuing advance of communications satellites, and the use of laser beams for communication in place of electric currents and radio waves. A laser beam of visible light is made up of waves that are millions of times shorter than those of radio waves”. So I don’t think this rises above the background radiation (ha) of Asimov’s vagueness.
  As for 3D TV, if I expand the context you see it’s an explicit replacement for screens: “wall screens will have replaced the ordinary set; but transparent cubes will be making their appearance in which three-dimensional viewing will be possible. In fact, one popular exhibit at the 2014 World’s Fair will be such a 3-D TV, built life-size, in which ballet performances will be seen. The cube will slowly revolve for viewing from all angles.” Also my understanding is that our 3D TVs don’t allow any varying POV, let alone all angles.
  Thanks! Added these to the changelog.
  - simon 2 Jul 2022 17:53 UTC
    1 point
    0
    Parent
    “free-space optics”
    While it’s not our main communications method, infrared communication is a thing, and it’s a lot closer to visible than radio.
    Also, Elon Musk claims that SpaceX is going to enable laser links for inter-satellite communications between Starlink satellites soon (admittedly, not within the 2020 target year, but this is still pretty close!)
    
    As for 3D TV, if I expand the context you see it’s an explicit replacement for screens
    My reading of the context is that screens are supposed to be the predominant form, and cube 3d is a prototype. This seems to be a correct prediction: see “crystal cube” here.
Unnamed 1 Jul 2022 18:50 UTC
4 points
1
I recall hearing a claim that a lot of Kurzweil’s predictions for 2009 had come true by 2019, including many that hadn’t happened yet in 2009. If true, that supports the picture of Kurzweil as an insightful but overly aggressive futurist. But I don’t know how well that claim is backed up by the data, or if there has even been a careful look at the data to try to evaluate that claim.
- yagudin 4 Jul 2022 16:16 UTC
  1 point
  0
  Parent
  See:
HarrisonDurland 5 Jul 2022 0:13 UTC
3 points
0
I might have missed mention of this somewhere, but I think that some kind of analysis that provides some context on “what did the skeptics at the time say—especially for forecasts that resolved incorrectly vs. correctly” would be quite nice: I think it’s potentially helpful to get a model of “(how often/when) were skeptics on the right side of the forecast, and were they accurate for reasons that ended up proving true?” Additionally, some case studies of examples to determine “were they justified for thinking the way they did” while excluding hindsight bias might be difficult, but similarly helpful.
Suppose hypothetically that the findings were something like “When futurists were on the right side of 50% but many of their contemporaries were skeptical at the time, it often was the case that the skepticism was not very engaged/persuasive/grounded (e.g., it was largely based on initial objections to which the futurists provided responses that went unaddressed by the skeptics; making assumptions that were verifiably wrong given available information at the time).” It seems quite improbable that you would get such a neat finding, but if the findings did vaguely resemble this—or if there were at least some not-misrepresentative anecdotes to this effect—then that could be a useful thing to highlight when discussing skepticism towards AGI predictions.
HarrisonDurland 4 Jul 2022 23:54 UTC
2 points
0
Another long-term forecast evaluation study which I don’t think was mentioned (but I might have simply missed it): “Long-term forecasts of military technologies for a 20–30 year horizon: An empirical assessment of accuracy” ( https://www.sciencedirect.com/science/article/abs/pii/S0040162518304438?via%3Dihub ).
Forecast evaluation is often a messy endeavor, as I learned trying to do research on forecasting for S&T last summer (which is what led me to that article).
- HoldenKarnofsky 17 Mar 2023 3:16 UTC
  3 points
  1
  Parent
  This is included! It’s linked from the second-to-last paragraph.
Gerald Monroe 17 Mar 2023 4:24 UTC
1 point
0
- Incorrect: “transparent cubes will be making their appearance in which three-dimensional viewing will be possible. In fact, one popular exhibit at the 2014 World’s Fair will be such a 3-D TV, built life-size, in which ballet performances will be seen. The cube will slowly revolve for viewing from all angles.” Doesn’t seem ridiculous, but doesn’t seem right. Of course, a side point here is that he refers to the 2014 World’s Fair, which didn’t happen.
Yes, but... 2014 was the second year of the second VR craze. The Oculus Rift dev kit 2 was shipping, and it was easily capable of showing a life-size 3D ballerina.
As for that kind of cubic 3d display, those have existed for decades. https://en.wikipedia.org/wiki/Spinning_mirror_system / Swept-volume display
So the prediction was wrong, but we found a better way to give people 3D displays.
RationalActor 9 Aug 2022 20:24 UTC
1 point
0
Thank you for the interesting post, Holden!

The two key predictions that you make in your “Wild Century” blog series, as I understand it, are:
- an AI that will be able to do the process of scientific inquiry (much) better and faster than humans, leading to a productivity explosion, and super-fundamental societal change
- digital people/mind-uploading
…this century (plausibly).

These feel much bigger/wilder/more dramatic/more fundamental than the examples given here. This makes me a bit sceptical of how useful the evidence (entertainingly and fascinatingly) assembled in this post is.
Flaglandbase 2 Jul 2022 23:45 UTC
1 point
0
Aerospace predictions were too optimistic:
Clarke predicted intercontinental hypersonic airliners in the 1970s (“Death and the Senator” 1961). Heinlein predicted a base on Pluto established in the year 2000. Asimov only predicted suborbital space flights at very low acceleration that casual day tourists would line up to take from New York in the 1990s, but also sentient non-mobile talking robots and non-talking sentient mobile robots by that decade. Robert Forward predicted in the novel Rocheworld (1984) that the first unmanned space probe would return pictures from Barnard’s Star in 2022 (though the images wouldn’t arrive back on Earth till 2028).
On the flip side:
Clarke predicted in “Childhood’s End” that it would take extensive searching through a specialized library (where you had to make an appointment through your university and show up in person) just to identify an astronomical catalog number in the 21st century. It would also take VERY expensive computer time with a worldwide waiting list to analyze the trajectory of a comet-like object in the novel “Rendezvous with Rama”. That’s because comets follow complex hyperbolic trajectories that require calculus far too difficult for humans to solve with pen and paper.
concernedcitizen64 3 Jul 2022 21:36 UTC
−4 points
2
isaac asimov was a snacc tbh
- Ben Pace 5 Jul 2022 5:06 UTC
  6 points
  0
  Parent
  I’m amused that we have action in the agree-disagree voting here.