Three reasons to expect long AI timelines


A lot of people I trust put relatively high confidence on near-term AI timelines. For example, most people in the LessWrong AI timelines thread from last year had shorter timelines than me, though that may have been because I interpreted the question a bit differently than almost everyone else.

In this post, I’ll cover three big reasons to expect long AI timelines, which I take to be the thesis that transformative AI-related phenomena won’t happen for at least another 50 years (currently >2071). Roughly speaking, the three reasons are:

  1. Technological deployment lag: Most technologies take decades between when they’re first developed and when they become widely impactful.

  2. Overestimating the generality of AI technology: Many AI scientists in the 1950s and 1960s incorrectly expected that cracking computer chess would automatically crack other tasks as well. I suspect a similar phenomenon is happening in people’s minds today when they extrapolate current AI.

  3. Regulation will slow things down: Lots of big technologies have been slowed down by regulation. Nuclear energy is an obvious example of a technology whose adoption has been hampered by regulation, but other cases exist too.

Technological deployment lag

From at least two perspectives, it makes sense to care more about when a technology is impactful, rather than when it first gets developed in a lab.

The first perspective is the perspective of an ordinary person. Ordinary people hardly get affected by isolated technological achievements. It might help their stock portfolios, if the resulting development triggers investors to become much more optimistic about future profits in those industries. But other than that, the ordinary person will care far more about when they can actually see the results of a technology compared to when it is first developed.

The second perspective is the perspective of a serious technological forecaster. These people care a lot about timing because the order of technological developments matters a lot for policy. To give a simple example, they care a lot about whether cheap and reliable solar energy will be developed before fusion power, because it tells them what type of technology society should invest in to stop climate change.

I care most about the second perspective, though it’s worth noting cases where the first perspective might still matter. Consider a scenario in which AI researchers are going around declaring that “AGI is in 10 years.” Ten years pass and an AGI is indeed developed in a lab somewhere, but with no noticeable impact on everyday life. People may grow distrustful of such proclamations, even if they’re ultimately proven technically right.

While it might seem obvious to some that we should make a distinction between when a technology is developed and when it actually starts having a large impact, I’m pointing it out because I mostly don’t get the impression that the AI forecasting literature treats this distinction as important.

Nearly all AI timeline surveys and forecasts I’ve been acquainted with simply take it as a starting assumption that what we care about is when advanced AI is developed, somewhere, rather than some side effect of that development. While I admit that it might be more reasonable to care about the specific moment in time when advanced AI is developed in a lab (particularly if we accept some local “foom” AI takeoff scenarios), it’s not at all obvious to me that it is. If you disagree, I would prefer you to at least carefully outline your reasoning before spitting out a date.

One main reason why we might care most about the date of development is that we think that after sufficiently advanced AI is developed, the effects will happen almost instantaneously. The most extreme version of this thesis is the one where AI self-improves upon getting past some critical threshold, and takes over the whole world within a few weeks.

Most technologies, however, don’t generally have immediate widespread effects. Our World in Data produced a great chart showing the typical timescales for technological adoption. It’s worth checking out the whole article.

We can see from the chart that most technologies take decades between the time when just a few people have access to them and the time when they’re ubiquitous. The chart likely underestimates the true lag between development and impact, however, because we also need to take into account the lag between when technologies are developed and when a non-negligible fraction of the population has access to them. Furthermore, this chart only shows adoption in the United States, a rich nation. Adoption trends worldwide are even slower.
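To make the two lags concrete, here is a minimal sketch, assuming adoption follows a logistic S-curve. The curve shape, the parameter values, and the function names are all illustrative assumptions of mine, not figures taken from the Our World in Data chart.

```python
import math


def adoption_share(years_since_launch, midpoint, steepness):
    """Share of households that have adopted, under an assumed logistic curve.

    `midpoint` is the (hypothetical) number of years after launch at which
    half of households have adopted; `steepness` controls how fast the
    curve rises around that point.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (years_since_launch - midpoint)))


def years_to_reach(share, midpoint, steepness):
    """Invert the logistic curve: years after launch at which `share` is reached."""
    return midpoint + math.log(share / (1.0 - share)) / steepness


def lag_from_development_to_impact(lab_to_market_years, midpoint, steepness,
                                   impact_share=0.5):
    """Total lag = time from lab result to first availability
    + time from first availability to the chosen adoption share."""
    return lab_to_market_years + years_to_reach(impact_share, midpoint, steepness)


# Hypothetical parameters, chosen only for illustration.
MIDPOINT = 25.0    # years after launch until 50% adoption
STEEPNESS = 0.2    # per-year growth rate of the logistic curve

few_to_ubiquitous = (years_to_reach(0.9, MIDPOINT, STEEPNESS)
                     - years_to_reach(0.1, MIDPOINT, STEEPNESS))
total_lag = lag_from_development_to_impact(10.0, MIDPOINT, STEEPNESS)
```

Under these made-up numbers, going from 10% to 90% adoption takes about 22 years, and adding a 10-year gap between the lab result and first availability puts 50% adoption 35 years after development. The point is only that the two lags add; the specific values are not estimates.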

(Consider that, at the time of writing, predictors on Metaculus expect transformative economic growth to come long after they expect the first AGI to be developed).

One objection is that we should care about AI’s impacts long before it becomes ubiquitous in households—it might be adopted by businesses and governments first.

There are two forms this objection might take. The first form imagines that businesses or governments would be much faster to adopt technologies than households. I am uncertain about the strength of this objection, and I’m not sure what information might be relevant for answering it.

The second form of this objection is that AI might have huge transformative impacts even if only a few businesses or governments adopt it. The classic justification for this thesis is that one or a few AI projects become overwhelmingly powerful in a localized intelligence explosion, rather than having large effects by diffusion. In that case, all the standard arguments against AI foom are applicable here.

Another objection is that some technologies, especially smartphones, were adopted very rapidly compared to the other ones. AI is conceivably similar in this respect. The rapid adoption of smartphones seems to derive from at least one of two reasons: it could be that smartphones were unusually affordable for households, or that they experienced unusually high demand upon their introduction.

It’s not clear to me whether AI technology will be unusually affordable relative to other technologies, and I lean towards doubting it. But it appears probable to me that AI will experience unusually high demand upon its introduction. Overall I’m not sure how to weigh this consideration, but it definitely pushes me in the direction of thinking that AI technologies will probably not have a very long adoption timeline (say, more than 30 years after their introduction before they start having large effects).

Another reason for doubting that AI will have immediate widespread impacts is that previous general purpose technologies failed to have such impacts too. Economist Robert Solow famously quipped in 1987 that “You can see the computer age everywhere but in the productivity statistics.” His observation was later dubbed the Productivity Paradox.

By the late 1990s, labor productivity in the United States had finally accelerated, culminating in an economic boom. Economists have provided a few explanations for this lag. For instance, Wikipedia points out that we may have simply mismeasured growth by overestimating inflation. Philippe Aghion and Peter Howitt, however, outline an alternative and common-sense explanation in chapter 9 of The Economics of Growth,

As David (1990) and Lipsey and Bekar (1995) have argued, GPTs [general purpose technologies] like the steam engine, the electric dynamo, the laser, and the computer require costly restructuring and adjustment to take place, and there is no reason to expect this process to proceed smoothly over time. Thus, contrary to the predictions of real-business-cycle theory, the initial effect of a “positive technology shock” may not be to raise output, productivity, and employment but to reduce them [...]

An alternative explanation for slowdowns has been developed by Helpman and Trajtenberg (1998a) using the Schumpeterian apparatus where R&D resources can alternatively be used in production. The basic idea of this model is that GPTs do not come ready to use off the shelf. Instead, each GPT requires an entirely new set of intermediate goods before it can be implemented. The discovery and development of these intermediate goods is a costly activity, and the economy must wait until some critical mass of intermediate components has been accumulated before it is profitable for firms to switch from the previous GPT. During the period between the discovery of a new GPT and its ultimate implementation, national income will fall as resources are taken out of production and put into R&D activities aimed at the discovery of new intermediate input components.

In line with these expectations, Daniel Kokotajlo pointed to this paper, which complements this analysis by applying it to the current machine learning era.

Overestimating the generality of AI technology

Many very smart AI scientists in the 1950s and 1960s once believed that human-level AI was imminent. As many later pointed out, these failed predictions by themselves provide evidence that AI will take longer to develop than we think. Yet that’s not the only reason why I’m bringing them up.

Instead, I want to focus on why AI scientists once believed that developing human-level AI would be relatively easy. The main reason, I suspect, is that researchers were too optimistic about the generality of their techniques. The case of computer chess is illustrative here. In his 1950 paper in which he provided an algorithm for perfect chess play, Claude Shannon wrote,

This paper is concerned with the problem of constructing a computing routine or “program” for a modern general purpose computer which will enable it to play chess. Although perhaps of no practical importance, the question is of theoretical interest, and it is hoped that a satisfactory solution of this problem will act as a wedge in attacking other problems of a similar nature and of greater significance.

Among the problems that Shannon had hoped would be attacked indirectly by solving chess, he listed,

Machines capable of translating from one language to another.

Machines capable of orchestrating a melody.

We now know that these problems are, at best, only incidentally related to the problem of playing chess. At worst, they are downright irrelevant. Few AI researchers would make the same mistake today.

Yet I see elements of Shannon’s mistake in the reasoning of many people today. I’ll walk through my reasons.

First, consider why Shannon might have expected progress in chess to aid progress in language translation. We could imagine, in some abstract sense, that chess and language translation are the same type of problem. Mathematically speaking, a chess engine is simply a mapping between chess board states and moves. Similarly, language translation is simply a mapping between sentences in one language and sentences in another language.

Beyond the simple mathematical formalism, however, there are substantial real differences between the two tasks. While computer chess can feasibly be solved by brute force, language translation requires an extremely nuanced understanding of the rules native speakers use to compose their speech.

One reason why Shannon might not have given this argument much thought is because he wasn’t thinking about how to do language translation in the moment; he was more interested in solving chess, and the other problems were afterthoughts.

We can view his stance through an analogy to construal level theory, or as LessWrong likes to put it, near vs. far thinking. All of the concrete ways that chess could be tackled were readily apparent in Claude Shannon’s mind, but the same could not be said about natural language translation. Rather than identifying a specific similarity between the two tasks, he could have made the forgivable mistake of assuming that a vague similarity between them was sufficient for his prediction.

It’s a bit like the planning fallacy. When planning our time, we can see all the ways things could go right and according to schedule, since those things are concrete. The ways that things could go wrong are more abstract, and thus occupy less space in our thinking. We mistake this perception for the likelihood of things going right.

Now let’s compare this case to an argument I hear quite a lot these days. Consider the quite reasonable suggestion that GPT-3 is a rudimentary form of general intelligence. Given that it can write on a wide variety of topics, it certainly appears generally capable. Now consider one further assumption: the scaling hypothesis. We conclude that some descendant of GPT-3, given thousands or millions of times more computation, will naturally yield general AI.

I see no strong reason to doubt the narrow version of this thesis. I believe it’s likely that, as training scales, we’ll progressively see more general and more capable machine learning models that can do a ton of impressive things, both on the stuff we expect them to do well on, and some stuff we didn’t expect.

But no matter how hard I try, I don’t see any current way of making some descendant of GPT-3, for instance, manage a corporation.

One may reason that, as machine learning models scale and become more general, at some point this will just naturally yield the management skills required to run a company.

It’s important to note that even if this were true, it wouldn’t tell us much about how to extract those skills from the model. Indeed, GPT-3 may currently be skilled at many things that we nonetheless do not know how to make it actually perform.

Most importantly, notice the similarities between this reasoning and my interpretation of Claude Shannon’s. Shannon expected algorithmic progress in chess to transfer usefully to other domains. In my interpretation, he did this because the problems of chess were near to him, and the problems of language translation were far from him.

Similarly, the problem “write a well-written essay” is close to us. We can see concretely how to get a model to perform better at it, and we are much impressed by what we obtain by making progress. “Manage a corporation” is far. We’re not really sure how to approach it, even if we could point out vague similarities between the two problems if we tried.

I don’t mean to imply that we haven’t made progress on the task of getting an AI to manage a corporation. I only mean that you can’t just wish it away as a hard problem simply by imagining that we’ll just get it for free as a result of making steady progress on something simpler and more concrete.

What other tasks do I think people might be incorrectly assuming we could get as a byproduct of progress on simpler things? Here’s a partial list:

  • As already stated, managing organizations and people.

  • Complex general purpose robotics, of the type needed to win the RoboCup grand challenge.

  • Long-term planning and execution, especially involving fine motor control and no guarantees about how the environment will be structured.

  • Making original and profound scientific discoveries.

I won’t claim that an AI can’t be dangerous to people if it lacks these abilities. However, I do think that in order to pose an existential risk to humanity, or obtain a decisive strategic advantage over humans, AI systems would likely need to be capable enough to do at least one of these things.

Regulation will slow things down

Recently, Jason Crawford wrote on Roots of Progress,

In the 1950s, nuclear was the energy of the future. Two generations later, it provides only about 10% of world electricity, and reactor design hasn’t fundamentally changed in decades.

As Crawford explains, the reason for this slow adoption is neither that nuclear plants are unsafe nor that they can’t be built cheaply. Rather, burdensome regulation has raised production costs to a level where people would rather pay for other energy sources,

Excessive concern about low levels of radiation led to a regulatory standard known as ALARA: As Low As Reasonably Achievable. What defines “reasonable”? It is an ever-tightening standard. As long as the costs of nuclear plant construction and operation are in the ballpark of other modes of power, then they are reasonable.

This might seem like a sensible approach, until you realize that it eliminates, by definition, any chance for nuclear power to be cheaper than its competition. Nuclear can’t even innovate its way out of this predicament: under ALARA, any technology, any operational improvement, anything that reduces costs, simply gives the regulator more room and more excuse to push for more stringent safety requirements, until the cost once again rises to make nuclear just a bit more expensive than everything else. Actually, it’s worse than that: it essentially says that if nuclear becomes cheap, then the regulators have not done their job.

Crawford lays blame on the incentives of regulators. As he put it,

[The regulators] get no credit for approving new plants. But they do own any problems. For the regulator, there’s no upside, only downside. No wonder they delay.

In fact, these perverse incentives facing regulators have long been known to economists who favor deregulation. Writing in 1980, Milton and Rose Friedman gave the following argument in the context of FDA regulation,

It is no accident that the FDA, despite the best of intentions, operates to discourage the development and prevent the marketing of new and potentially useful drugs. Put yourself in the position of an FDA official charged with approving or disapproving a new drug. You can make two very different mistakes:

1. Approve a drug that turns out to have unanticipated side effects resulting in the death or serious impairment of a sizable number of persons.

2. Refuse approval of a drug that is capable of saving many lives or relieving great distress and that has no untoward side effects.

If you make the first mistake—approve a thalidomide—your name will be spread over the front page of every newspaper. You will be in deep disgrace. If you make the second mistake, who will know it?

Given the moral case here, it might come as a surprise that the effect of regulation on technological innovation has not generally been well studied. Philippe Aghion et al. recently published a paper saying as much in their introduction. Still, although we lack a large literature showing the role regulation plays in delaying technological development, it almost certainly plays one.

Regulation is arguably the main thing standing in the way of lots of futuristic technologies: human cloning, human genetic engineering, and climate engineering come to mind, just to name a few.

One might think that the AI industry is immune to such regulation, or nearly so. After all, the tech industry has historically experienced a lot of growth without much government interference. What reason is there for this to stop?

I offer two replies. The first reason is that governments of the world are already on the cusp of a concerted effort to regulate technology companies. A New York Times article from April 20th explains,

China fined the internet giant Alibaba a record $2.8 billion this month for anticompetitive practices, ordered an overhaul of its sister financial company and warned other technology firms to obey Beijing’s rules.

Now the European Commission plans to unveil far-reaching regulations to limit technologies powered by artificial intelligence.

And in the United States, President Biden has stacked his administration with trustbusters who have taken aim at Amazon, Facebook and Google.

Around the world, governments are moving simultaneously to limit the power of tech companies with an urgency and breadth that no single industry had experienced before. Their motivation varies. In the United States and Europe, it is concern that tech companies are stifling competition, spreading misinformation and eroding privacy; in Russia and elsewhere, it is to silence protest movements and tighten political control; in China, it is some of both.

The second reason is that, as AI becomes more capable, we’ll likely increasingly see calls for it to be regulated. I should point out that I’m not restricting my analysis to government regulation; the very fact that the AI safety community exists, and that OpenAI and DeepMind hired people to work on safety, provides evidence that such calls for more caution will occur.

The slightest sign of danger was enough to stall nuclear energy development. I don’t see much reason to expect any different for AI.

Furthermore, many others and I have previously pointed out that in a continuous AI takeoff scenario, low-magnitude AI failures will happen before large-magnitude failures. It seems plausible to me that at some point, a significant AI failure will happen that triggers a national or even international panic, despite not posing any sort of imminent existential risk. In other words, I pretty much expect a Chernobyl disaster of AI, or at least a series of such disasters that will have more or less the same effect.


Combining all three of these effects, it’s a bit difficult to see how we will get transformative AI developments in the next 50 years. Even accepting some of the more optimistic assumptions in e.g. Ajeya Cotra’s Draft report on AI timelines, it still seems to me that these effects will add a few decades to our timelines before things get really interesting. So at present, my optimistic timelines look more like 25 or 30 years, rather than 10 or 15. But of course, smart people disagree with me here and there’s a ton of uncertainty, so I’m happy to learn where I made mistakes.