Elites and AI: Stated Opinions

Previously, I asked “Will the world’s elites navigate the creation of AI just fine?” My current answer is “probably not,” but I think it’s a question worth additional investigation.

As a preliminary step, and with the help of MIRI interns Jeremy Miller and Oriane Gaillard, I’ve collected a few stated opinions on the issue. This survey of stated opinions is not representative of any particular group, and is not meant to provide strong evidence about what is true on the matter. It’s merely a collection of quotes we happened to find on the subject. Hopefully others can point us to other stated opinions — or state their own opinions.

MIRI researcher Eliezer Yudkowsky is famously pessimistic on this issue. For example, in a 2009 comment, he replied to the question “What kind of competitive or political system would make fragmented squabbling AIs safer than an attempt to get the monolithic approach right?” by saying “the answer is, ‘None.’ It’s like asking how you should move your legs to walk faster than a jet plane,” implying extreme skepticism that political elites will manage AI properly.¹

Cryptographer Wei Dai is also quite pessimistic:

...even in a relatively optimistic scenario, one with steady progress in AI capability along with apparent progress in AI control/safety (and nobody deliberately builds a UFAI for the sake of “maximizing complexity of the universe” or what have you), it’s probably only a matter of time until some AI crosses a threshold of intelligence and manages to “throw off its shackles”. This may be accompanied by a last-minute scramble by mainstream elites to slow down AI progress and research methods of scalable AI control, which (if it does happen) will likely be too late to make a difference.

Stanford philosopher Ken Taylor has also expressed pessimism, in an episode of Philosophy Talk called “Turbo-charging the mind”:

Think about nuclear technology. It evolved in a time of war… The probability that nuclear technology was going to arise at a time when we use it well rather than [for] destruction was low… Same thing with… superhuman artificial intelligence. It’s going to emerge… in a context in which we make a mess out of everything. So the probability that we make a mess out of this is really high.

Here, Taylor seems to express the view that humans are not yet morally and rationally advanced enough to be trusted with powerful technologies. This general view has been expressed before by many others, including Albert Einstein, who wrote that “Our entire much-praised technological progress… could be compared to an axe in the hand of a pathological criminal.”

In response to Taylor’s comment, MIRI researcher Anna Salamon (now Executive Director of CFAR) expressed a more optimistic view:

I… disagree. A lot of my colleagues would [agree with you] that 40% chance of human survival is absurdly optimistic… But, probably we’re not close to AI. Probably by the time AI hits we will have had more thinking going into it… [Also,] if the Germans had successfully gotten the bomb and taken over the world, there would have been somebody who profited. If AI runs away and kills everyone, there’s nobody who profits. There’s a lot of incentive to try and solve the problem together...

Economist James Miller is another voice of pessimism. In Singularity Rising, chapter 5, he worries about game-theoretic mechanisms incentivizing speed of development over safety of development:

Successfully creating [superhuman AI] would give a country control of everything, making [superhuman AI] far more militarily useful than mere atomic weapons. The first nation to create an obedient [superhuman AI] would also instantly acquire the capacity to terminate its rivals’ AI development projects. Knowing the stakes, rival nations might go full throttle to win [a race to superhuman AI], even if they understood that haste could cause them to create a world-destroying [superhuman AI]. These rivals might realize the danger and desperately wish to come to an agreement to reduce the peril, but they might find that the logic of the widely used game theory paradox of the Prisoners’ Dilemma thwarts all cooperation efforts… Imagine that both the US and Chinese militaries want to create [superhuman AI]. To keep things simple, let’s assume that each military has the binary choice to proceed either slowly or quickly. Going slowly increases the time it will take to build [superhuman AI] but reduces the likelihood that it will become unfriendly and destroy humanity. The United States and China might come to an agreement and decide that they will both go slowly… [But] if the United States knows that China will go slowly, it might wish to proceed quickly and accept the additional risk of destroying the world in return for having a much higher chance of being the first country to create [superhuman AI]. (During the Cold War, the United States and the Soviet Union risked destroying the world for less.) The United States might also think that if the Chinese proceed quickly, then they should go quickly, too, rather than let the Chinese be the likely winners of the… race.
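To make the structure of Miller’s argument concrete, here is a minimal Python sketch of the two-country game. The payoff numbers are hypothetical: I chose them only so the game has the Prisoners’ Dilemma shape Miller describes, and they do not come from his book. The function names are likewise just illustrative.

```python
from itertools import product

CHOICES = ("slow", "fast")

# Hypothetical payoffs (higher is better), picked only to illustrate the
# structure of the dilemma; they are NOT taken from Miller's book.
# Key: (us_choice, china_choice) -> (us_payoff, china_payoff)
PAYOFFS = {
    ("slow", "slow"): (3, 3),  # both careful: low risk, even race
    ("slow", "fast"): (0, 5),  # the fast side probably wins the race
    ("fast", "slow"): (5, 0),
    ("fast", "fast"): (1, 1),  # even race again, but much higher risk
}

def best_response(player, other_choice):
    """Return the choice that maximizes `player`'s payoff (0 = US, 1 = China)
    given the rival's choice."""
    def payoff(choice):
        if player == 0:
            profile = (choice, other_choice)
        else:
            profile = (other_choice, choice)
        return PAYOFFS[profile][player]
    return max(CHOICES, key=payoff)

def nash_equilibria():
    """List every profile in which each side is already best-responding."""
    return [
        (us, china)
        for us, china in product(CHOICES, CHOICES)
        if best_response(0, china) == us and best_response(1, us) == china
    ]

if __name__ == "__main__":
    for rival in CHOICES:
        print(f"If the rival goes {rival}, the US prefers:", best_response(0, rival))
    print("Equilibria:", nash_equilibria())
```

Under these assumed numbers, going fast is the better reply to either choice by the rival, so the only stable outcome is that both sides go fast, even though both would prefer the outcome in which both go slowly. Different payoff assumptions can remove the dilemma, which is roughly the kind of objection Hanson raises below.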

In chapter 6, Miller expresses similar worries about corporate incentives and AI:

Paradoxically and tragically, the fact that [superhuman AI] would destroy mankind increases the chance of the private sector developing it. To see why, pretend that you’re at the racetrack deciding whether to bet on the horse Recursive Darkness. The horse offers a good payoff in the event of victory, but her odds of winning seem too small to justify a bet—until, that is, you read the fine print on the racing form: “If Recursive Darkness loses, the world ends.” Now you bet everything you have on her because you realize that the bet will either pay off or become irrelevant.
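The racetrack reasoning can be written out as a small expected-value calculation. This is only a sketch under my own simplifying assumptions: the bettor cannot stop the race, cares only about outcomes in which the world survives, and the probability, payout, and stake below are hypothetical numbers rather than anything from Miller.

```python
def bet_value(p_win, net_payout, stake, loss_ends_world):
    """Value of the bet as the bettor sees it.

    Assumption (mine, for illustration): if a loss ends the world, the
    bettor ignores that branch entirely, because there is no surviving
    world in which the lost stake matters.
    """
    if loss_ends_world:
        # Conditional on the world surviving, the horse must have won.
        return net_payout
    # Ordinary bet: weigh the win against the lost stake.
    return p_win * net_payout - (1 - p_win) * stake

if __name__ == "__main__":
    # Hypothetical numbers: a 10% chance to win 5 units on a 1-unit stake.
    print("Ordinary bet:      ", bet_value(0.1, 5, 1, loss_ends_world=False))
    print("World-ending loss: ", bet_value(0.1, 5, 1, loss_ends_world=True))
```

With these numbers the ordinary bet has negative expected value, while the same bet looks attractive once the losing branch is treated as irrelevant. The sketch describes the private incentive Miller is pointing at; it does not claim that the bet is good for the world.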

Miller expanded on some of these points in his chapter in Singularity Hypotheses.

In a short reply to Miller, GMU economist Robin Hanson wrote:

[Miller’s analysis is] only as useful as the assumptions on which it is based. Miller’s chosen assumptions seem to me quite extreme, and quite unlikely.

Unfortunately, Hanson does not explain his reasons for rejecting Miller’s analysis.

Sun Microsystems co-founder Bill Joy is famous for the techno-pessimism of his Wired essay “Why the Future Doesn’t Need Us,” but that article’s predictions about elites’ likely handling of AI are actually somewhat mixed:

we all wish our course could be determined by our collective values, ethics, and morals. If we had gained more collective wisdom over the past few thousand years, then a dialogue to this end would be more practical, and the incredible powers we are about to unleash would not be nearly so troubling.

One would think we might be driven to such a dialogue by our instinct for self-preservation. Individuals clearly have this desire, yet as a species our behavior seems to be not in our favor. In dealing with the nuclear threat, we often spoke dishonestly to ourselves and to each other, thereby greatly increasing the risks. Whether this was politically motivated, or because we chose not to think ahead, or because when faced with such grave threats we acted irrationally out of fear, I do not know, but it does not bode well.

The new Pandora’s boxes of genetics, nanotechnology, and robotics are almost open, yet we seem hardly to have noticed… Churchill remarked, in a famous left-handed compliment, that the American people and their leaders ‘invariably do the right thing, after they have examined every other alternative.’ In this case, however, we must act more presciently, as to do the right thing only at last may be to lose the chance to do it at all...

...And yet I believe we do have a strong and solid basis for hope. Our attempts to deal with weapons of mass destruction in the last century provide a shining example of relinquishment for us to consider: the unilateral US abandonment, without preconditions, of the development of biological weapons. This relinquishment stemmed from the realization that while it would take an enormous effort to create these terrible weapons, they could from then on easily be duplicated and fall into the hands of rogue nations or terrorist groups.

Former GiveWell researcher Jonah Sinick has expressed optimism on the issue:

I personally am optimistic about the world’s elites navigating AI risk as well as possible subject to inherent human limitations that I would expect everybody to have, and the inherent risk. Some points:

I’ve been surprised by people’s ability to avert bad outcomes. Only two nuclear weapons have been used since nuclear weapons were developed, despite the fact that there are 10,000+ nuclear weapons around the world. Political leaders are assassinated very infrequently relative to how often one might expect a priori.

AI risk is a Global Catastrophic Risk in addition to being an x-risk. Therefore, even people who don’t care about the far future will be motivated to prevent it.

The people with the most power tend to be the most rational people, and the effect size can be expected to increase over time… The most rational people are the people who are most likely to be aware of and to work to avert AI risk...

Availability of information is increasing over time. At the time of the Dartmouth conference, information about the potential dangers of AI was not very salient, now it’s more salient, and in the future it will be still more salient...

In the Manhattan project, the “will bombs ignite the atmosphere?” question was analyzed and dismissed without much (to our knowledge) double-checking. The amount of risk checking per hour of human capital available can be expected to increase over time. In general, people enjoy tackling important problems, and risk checking is more important than most of the things that people would otherwise be doing.

Paul Christiano is another voice of optimism about elites’ handling of AI. Here are some snippets from his “mainline” scenario for AI development:

It becomes fairly clear some time in advance, perhaps years, that broadly human-competitive AGI will be available soon. As this becomes obvious, competent researchers shift into more directly relevant work, and governments and researchers become more concerned with social impacts and safety issues...

Call the point where the share of human workers is negligible point Y. After Y humans are very unlikely to maintain control over global economic dynamics—the effective population is overwhelmingly dominated by machine intelligences… This picture becomes clear to serious onlookers well in advance of the development of human-level AGI… [hence] there is much intellectual activity aimed at understanding these dynamics and strategies for handling them, carried out both in public and within governments.

Why should we expect the control problem to be solved? …at each point when we face a control problem more difficult than any we have faced so far and with higher consequences for failure, we expect to have faced slightly easier problems with only slightly lower consequences for failure in the past.

As long as solutions to the control problem are not quite satisfactory, the incentives to resolve control problems are comparable to the incentives to increase the capabilities of systems. If solutions are particularly unsatisfactory, then incentives to resolve control problems are very strong. So natural economic incentives build a control system (in the traditional sense from robotics) which keeps solutions to the control problem from being too unsatisfactory.

Christiano is no Pollyanna, however. In the same document, he outlines “what could go wrong,” and what we might do about it.

Notes

¹ I originally included another quote from Eliezer, but then I noticed that other readers on Less Wrong had elsewhere interpreted that same quote differently than I had, so I removed it from this post.