Claim: Scenario planning is preferable to quantitative forecasting for understanding and coping with AI progress

As part of my work for MIRI on forecasting, I’m considering the implications of what I’ve read for thinking about AI. My purpose isn’t to come to concrete conclusions about AI progress, but rather to provide insight into which approaches are more promising and which are less promising for thinking about AI progress.

I’ve written a post on general-purpose forecasting and another post on scenario analysis. In a recent post, I considered scenario analyses for technological progress. I’ve also looked at many domains of forecasting and at forecasting rare events. With the knowledge I’ve accumulated, I’ve shifted in the direction of viewing scenario analysis as a more promising tool than timeline-driven quantitative forecasting for understanding AI and its implications.

I’ll first summarize what I mean by scenario analysis and quantitative forecasting in the AI context. People who have some prior knowledge of the terms can probably skim through the summary quickly. Those who find the summary insufficiently informative, or want to delve deeper, are urged to read my more detailed posts linked above and the references therein.

Quantitative forecasting and scenario analysis in the AI context

The two approaches I am comparing are:

  • Quantitative forecasting: Here, specific predictions or forecasts are made, recorded, and later tested against what actually transpired. The forecasts are made in a form that makes it easy to score whether they came true. Probabilistic forecasts are also included; these are scored using one of the standard methods for scoring probabilistic forecasts (such as logarithmic scoring or quadratic scoring; see the sketch after this list).

  • Scenario analysis: A number of scenarios of how the future might unfold are generated in considerable detail. Predetermined elements, common to all the scenarios, are combined with critical uncertainties that vary between the scenarios. Early indicators that help determine which scenario will transpire are identified. In many cases, the goal is to choose strategies that are robust to all scenarios. For more, read my post on scenario analysis.
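To make the scoring idea concrete, here is a minimal sketch in Python of the logarithmic and quadratic (Brier) scoring rules mentioned in the first bullet. The single-event setup and the function names are my own illustration, not part of any standard forecasting library.

```python
# A minimal sketch (illustrative only) of the two scoring rules named above,
# for a single forecast p of an event that either happened (outcome = 1)
# or did not (outcome = 0).
import math

def log_score(p, outcome):
    # Logarithmic score: log of the probability assigned to what actually happened.
    # Closer to 0 is better; confident wrong forecasts are punished very heavily.
    prob_of_actual = p if outcome == 1 else 1 - p
    return math.log(prob_of_actual)

def quadratic_score(p, outcome):
    # Quadratic (Brier) score: squared error between the forecast and the outcome.
    # Lower is better; 0 is a perfect forecast.
    return (p - outcome) ** 2

# A forecaster assigned probability 0.8 to an event that did happen:
print(log_score(0.8, 1))        # about -0.223
print(quadratic_score(0.8, 1))  # 0.04
```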

Quantitative forecasts are easier to score for accuracy, and in particular offer greater scope for falsification. This has perhaps attracted rationalists more to quantitative forecasting, as a way of distinguishing themselves from what appears to be the more wishy-washy realm of unfalsifiable scenario analysis. In this post, I argue that, given the considerable uncertainty surrounding progress in artificial intelligence, scenario analysis is a more apt tool.

There are probably some people on LessWrong who have high confidence in quantitative forecasts. I’m happy to make bets (financial or purely honorary) on such subjects. However, if you’re claiming high certainty while I am claiming uncertainty, I do want to have odds in my favor (depending on how much confidence you express in your opinion), for reasons similar to those that Bryan Caplan described here.

Below, I describe my reasons for preferring scenario analysis to forecasting.

#1: Considerable uncertainty

Proponents of the view that AI will arrive within a few decades typically cite computing advances such as Moore’s law. However, there’s considerable uncertainty even surrounding short-term computing advances, as I described in my scenario analyses for technological progress. When it comes to progress in AI, we have to combine uncertainties in hardware progress with uncertainties in software progress.

Quantitative forecasting methods, such as trend extrapolation, tend to do reasonably well, and might be better than nothing. But they are not foolproof. In particular, the impending death of Moore’s law, despite the trend having stayed quite robust for about 50 years, should make us cautious about overly naive extrapolation of trends. Arguably, simple trend extrapolation is still the best choice relative to other forecasting methods, at least as a general rule. But acknowledging uncertainty and considering multiple scenarios could prepare us a lot better for reality.
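To illustrate what naive trend extrapolation amounts to, here is a rough sketch that fits an exponential (log-linear) trend to made-up transistor-count figures and projects it forward. The numbers are purely hypothetical; the point is that the method mechanically continues the past trend and cannot, by itself, anticipate a break such as the end of Moore’s law.

```python
# Naive exponential trend extrapolation on hypothetical data (not real figures).
import numpy as np

years = np.array([1990, 1995, 2000, 2005, 2010])
counts = np.array([1e6, 5e6, 4e7, 3e8, 2e9])  # made-up "transistor counts"

# Fit log(count) = a * year + b, i.e. assume a constant exponential growth rate.
a, b = np.polyfit(years, np.log(counts), 1)

def extrapolate(year):
    # Project the fitted exponential trend to a future year.
    return np.exp(a * year + b)

print(extrapolate(2020))  # the trend continued blindly; any slowdown is invisible to it
```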

In a post in May 2013 titled When Will AI Be Created?, MIRI director Luke Muehlhauser (who later assigned me the forecasting project) looked at the wide range of beliefs about the time horizon for the arrival of human-level AI. Here’s how Luke described the situation:

To explore these difficulties, let’s start with a 2009 bloggingheads.tv conversation between MIRI researcher Eliezer Yudkowsky and MIT computer scientist Scott Aaronson, author of the excellent Quantum Computing Since Democritus. Early in that dialogue, Yudkowsky asked:

It seems pretty obvious to me that at some point in [one to ten decades] we’re going to build an AI smart enough to improve itself, and [it will] “foom” upward in intelligence, and by the time it exhausts available avenues for improvement it will be a “superintelligence” [relative] to us. Do you feel this is obvious?

Aaronson replied:

The idea that we could build computers that are smarter than us… and that those computers could build still smarter computers… until we reach the physical limits of what kind of intelligence is possible… that we could build things that are to us as we are to ants — all of this is compatible with the laws of physics… and I can’t find a reason of principle that it couldn’t eventually come to pass…

The main thing we disagree about is the time scale… a few thousand years [before AI] seems more reasonable to me.

Those two estimates — several decades vs. “a few thousand years” — have wildly different policy implications.

After more discussion of AI forecasts as well as some general findings on forecasting, Luke continues:

Given these considerations, I think the most appropriate stance on the question “When will AI be created?” is something like this:

We can’t be confident AI will come in the next 30 years, and we can’t be confident it’ll take more than 100 years, and anyone who is confident of either claim is pretending to know too much.

How confident is “confident”? Let’s say 70%. That is, I think it is unreasonable to be 70% confident that AI is fewer than 30 years away, and I also think it’s unreasonable to be 70% confident that AI is more than 100 years away.

This statement admits my inability to predict AI, but it also constrains my probability distribution over “years of AI creation” quite a lot.

I think the considerations above justify these constraints on my probability distribution, but I haven’t spelled out my reasoning in great detail. That would require more analysis than I can present here. But I hope I’ve at least summarized the basic considerations on this topic, and those with different probability distributions than mine can now build on my work here to try to justify them.

I believe that in the face of this considerable uncertainty, considering multiple scenarios, and the implications of each scenario, can be quite helpful.

#2: Isn’t scenario analysis unfalsifiable, and therefore unscientific? Why not aim for rigorous quantitative forecasting instead, that can be judged against reality?

First off, just because a forecast is quantitative doesn’t mean it is actually rigorous. I think it’s worthwhile to elicit and record quantitative forecasts. These can have high value for near-term horizons, and can provide a rough idea of the range of opinion for longer timescales.

However, simply phoning up experts to ask them for their timelines, or sending them an Internet survey, is not very useful. Tetlock’s work, described in Muehlhauser’s post and in my post on historical evaluations of forecasting, shows that unaided expert judgment has little value. Asking people who haven’t thought through the issue to come up with numbers can give a false sense of precision with little accuracy (and little genuine precision, either, if we consider the diverse range of responses from different experts). On the other hand, eliciting detailed scenarios from experts can force them to think more clearly about the issues and the relationships between them. Note that there are dangers to eliciting detailed scenarios: people may fall into their own make-believe world. But I think the trade-off with the uncertainty in quantitative forecasting still points in favor of scenario analysis.

Explicit quantitative forecasts can be helpful when people have an opportunity to learn from wrong forecasts and adjust their methodology accordingly. Therefore, I argue that if we want to go down the quantitative forecasting route, it’s important to record forecasts about the near and medium future instead of or in addition to forecasts about the far future. Also, providing experts some historical information and feedback at the time they make their forecasts can help reduce the chances of them simply saying things without reflecting. Depending on the costs of recording forecasts, it may be worthwhile to do so anyway, even if we don’t have high hopes that the forecasts will yield value. Broadly, I agree with Luke’s suggestions:

  • Explicit quantification: “The best way to become a better-calibrated appraiser of long-term futures is to get in the habit of making quantitative probability estimates that can be objectively scored for accuracy over long stretches of time. Explicit quantification enables explicit accuracy feedback, which enables learning.”

  • Signposting the future: Thinking through specific scenarios can be useful if those scenarios “come with clear diagnostic signposts that policymakers can use to gauge whether they are moving toward or away from one scenario or another… Falsifiable hypotheses bring high-flying scenario abstractions back to Earth.”

  • Leveraging aggregation: “the average forecast is often more accurate than the vast majority of the individual forecasts that went into computing the average…. [Forecasters] should also get into the habit that some of the better forecasters in [an IARPA forecasting tournament called ACE] have gotten into: comparing their predictions to group averages, weighted-averaging algorithms, prediction markets, and financial markets.” See Ungar et al. (2012) for some aggregation-leveraging results from the ACE tournament.
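As a small illustration of the aggregation point, here is a sketch of simple and weighted averaging of several forecasters’ probabilities. The forecast values and weights are made up, and real systems (including those used in the ACE tournament) use considerably more sophisticated aggregation algorithms.

```python
# Aggregating hypothetical probability forecasts by simple and weighted averaging.
import numpy as np

forecasts = np.array([0.9, 0.6, 0.75, 0.3])  # each forecaster's P(event)
weights   = np.array([1.0, 2.0, 1.5, 0.5])   # e.g. weights based on past track records

simple_average   = forecasts.mean()
weighted_average = np.average(forecasts, weights=weights)

print(simple_average)    # 0.6375
print(weighted_average)  # 0.675: better forecasters (by assumption) count for more
```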

But I argue that the bulk of the effort should go into scenario generation and scenario analysis. Even here, the problem of absence of feedback is acute: we can design scenarios all we want for what will happen over the next century, but we can’t afford to wait a century to find out whether our scenarios transpired. Therefore, it makes sense to break scenario analysis exercises into chunks of 10-15 years. For instance, one scenario analysis could consider scenarios for the next 10-15 years. For each of those scenarios, we can have a separate scenario analysis exercise that considers scenarios for the 10-15 years after that. And so on. Note that the number of scenarios increases exponentially with the time horizon, but this is simply a reflection of the underlying complexity and uncertainty. In some cases, scenarios could “merge” at later times, as scenarios with slow early progress and fast later progress yield the same end result as scenarios with fast early progress and slow later progress.
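To make the chunking idea concrete, here is a toy sketch of how the number of scenario paths grows with the number of 10-15 year chunks, and how merging paths that reach the same cumulative level of progress keeps the number of end states manageable. The three-way branching and the merging rule are assumptions chosen purely for illustration.

```python
# Toy model: each 10-15 year chunk has a hypothetical pace of progress, and paths
# that reach the same cumulative progress are merged into one end state.
from itertools import product

PACES = ("slow", "moderate", "fast")

def scenario_paths(num_chunks):
    # Every sequence of per-chunk outcomes: 3 ** num_chunks distinct paths.
    return list(product(PACES, repeat=num_chunks))

def merged_end_states(num_chunks):
    # Collapse paths with the same total progress, e.g. slow-then-fast
    # merges with fast-then-slow.
    score = {"slow": 0, "moderate": 1, "fast": 2}
    return {sum(score[pace] for pace in path) for path in scenario_paths(num_chunks)}

print(len(scenario_paths(3)))     # 27 paths over roughly 30-45 years
print(len(merged_end_states(3)))  # only 7 distinct cumulative end states
```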

#3: Evidence from other disciplines

Explicit quantitative forecasting is common in many disciplines, but the more we look at longer time horizons, and the more uncertainty we are dealing with, the more common scenario analysis becomes. I considered many examples of scenario analysis in my scenario analysis post. As you’ll see from the list there, scenario analysis, and variants of it, have become influential in areas ranging from climate change (as seen in IPCC reports) to energy to macroeconomic and fiscal analysis to land use and transportation analysis. And big consulting companies such as McKinsey & Company use scenario analysis frequently in their reports.

It’s of course possible to argue that the use of scenario analyses is a reflection of human failing: people don’t want to make single forecasts because they are afraid of being proven wrong, or of contradicting other people’s beliefs about the future. Or maybe people shy away from thinking quantitatively. I think there is some truth to such a critique. But until we have human-level AI, we have to rely on failure-prone humans for input on the question of AI progress. Perhaps scenario analysis is superior to quantitative forecasting only because humans are insufficiently rational, but to the extent it’s superior, it’s superior.

Addendum: What are the already existing scenario analyses for artificial intelligence?

I had a brief discussion with Luke Muehlhauser and some of the names below were suggested by him, but I didn’t run the final list by him. All responsibility for errors is mine.

To my knowledge (and to the knowledge of people I’ve talked to), there are no formal scenario analyses of artificial general intelligence structured in a manner similar to the standard examples of scenario analyses. However, if scenario analysis is construed sufficiently loosely, as a discussion of various predetermined elements and critical uncertainties along with a brief mention of different possible scenarios, then we can list a few scenario analyses:

  • Nick Bostrom’s book Superintelligence (released in the UK and on Kindle, but not released as a print book in the US at the time of this writing) discusses several scenarios for paths to AGI.

  • Eliezer Yudkowsky’s report on Intelligence Explosion Microeconomics (93 pages, direct PDF link) can be construed as an analysis of AI scenarios.

  • Robin Hanson’s forthcoming book on em economics discusses one future scenario that is somewhat related to AI progress.

  • The Hanson-Yudkowsky AI Foom debate includes a discussion of many scenarios.

The above are scenario analyses for the eventual properties and behavior of an artificial general intelligence, rather than scenario analyses for the immediate future. The work of Ray Kurzweil can be thought of as a scenario analysis that lays out an explicit timeline from now to the arrival of AGI.