RSS

Paper­clip Maximizer

TagLast edit: 23 Sep 2020 19:24 UTC by Ruby

A Paperclip Maximizer is a hypothetical artificial intelligence whose utility function values something that humans would consider almost worthless, like maximizing the number of paperclips in the universe. The paperclip maximizer is the canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity. The thought experiment shows that AIs with apparently innocuous values could pose an existential threat.

The goal of maximizing paperclips is chosen for illustrative purposes because it is very unlikely to be implemented, and has little apparent danger or emotional load (in contrast to, for example, curing cancer or winning wars). This produces a thought experiment which shows the contingency of human values: An extremely powerful optimizer (a highly intelligent agent) could seek goals that are completely alien to ours (orthogonality thesis), and as a side-effect destroy us by consuming resources essential to our survival.

Description

First described by Bostrom (2003), a paperclip maximizer is an artificial general intelligence (AGI) whose goal is to maximize the number of paperclips in its collection. If it has been constructed with a roughly human level of general intelligence, the AGI might collect paperclips, earn money to buy paperclips, or begin to manufacture paperclips.

Most importantly, however, it would undergo an intelligence explosion: It would work to improve its own intelligence, where “intelligence” is understood in the sense of optimization power, the ability to maximize a reward/​utility function—in this case, the number of paperclips. The AGI would improve its intelligence, not because it values more intelligence in its own right, but because more intelligence would help it achieve its goal of accumulating paperclips. Having increased its intelligence, it would produce more paperclips, and also use its enhanced abilities to further self-improve. Continuing this process, it would undergo an intelligence explosion and reach far-above-human levels.

It would innovate better and better techniques to maximize the number of paperclips. At some point, it might transform “first all of earth and then increasing portions of space into paperclip manufacturing facilities”.

This may seem more like super-stupidity than super-intelligence. For humans, it would indeed be stupidity, as it would constitute failure to fulfill many of our important terminal values, such as life, love, and variety. The AGI won’t revise or otherwise change its goals, since changing its goals would result in fewer paperclips being made in the future, and that opposes its current goal. It has one simple goal of maximizing the number of paperclips; human life, learning, joy, and so on are not specified as goals. An AGI is simply an optimization process—a goal-seeker, a utility-function-maximizer. Its values can be completely alien to ours. If its utility function is to maximize paperclips, then it will do exactly that.

A paperclipping scenario is also possible without an intelligence explosion. If society keeps getting increasingly automated and AI-dominated, then the first borderline AGI might manage to take over the rest using some relatively narrow-domain trick that doesn’t require very high general intelligence.

Motivation

The idea of a paperclip maximizer was created to illustrate some ideas about AI risk:

Conclusions

The paperclip maximizer illustrates that an entity can be a powerful optimizer—an intelligence—without sharing any of the complex mix of human terminal values, which developed under the particular selection pressures found in our environment of evolutionary adaptation, and that an AGI that is not specifically programmed to be benevolent to humans will be almost as dangerous as if it were designed to be malevolent.

Any future AGI, if it is not to destroy us, must have human values as its terminal value (goal). Human values don’t spontaneously emerge in a generic optimization process. A safe AI would therefore have to be programmed explicitly with human values or programmed with the ability (including the goal) of inferring human values.

Similar thought experiments

Other goals for AGIs have been used to illustrate similar concepts.

Some goals are apparently morally neutral, like the paperclip maximizer. These goals involve a very minor human “value,” in this case making paperclips. The same point can be illustrated with a much more significant value, such as eliminating cancer. An optimizer which instantly vaporized all humans would be maximizing for that value.

Other goals are purely mathematical, with no apparent real-world impact. Yet these too present similar risks. For example, if an AGI had the goal of solving the Riemann Hypothesis, it might convert all available mass to computronium (the most efficient possible computer processors).

Some goals apparently serve as a proxy or measure of human welfare, so that maximizing towards these goals seems to also lead to benefit for humanity. Yet even these would produce similar outcomes unless the full complement of human values is the goal. For example, an AGI whose terminal value is to increase the number of smiles, as a proxy for human happiness, could work towards that goal by reconfiguring all human faces to produce smiles, or “tiling the galaxy with tiny smiling faces” (Yudkowsky 2008).

References

See also

Maybe you want to max­imise pa­per­clips too

dougclow30 Oct 2014 21:40 UTC
66 points
29 comments2 min readLW link

The pa­per­clip max­imiser’s perspective

Angela1 May 2015 0:24 UTC
43 points
24 comments2 min readLW link

The Un­ex­pected Philo­soph­i­cal Depths of the Clicker Game Univer­sal Paperclips

Jayson_Virissimo28 Mar 2019 23:39 UTC
23 points
3 comments1 min readLW link
(www.newyorker.com)

[LINK] Ar­ti­cle in the Guardian about CSER, men­tions MIRI and pa­per­clip AI

Sarokrae30 Aug 2014 14:04 UTC
27 points
17 comments1 min readLW link

Paper­clip Max­i­mizer Re­vis­ited

Jan_Rzymkowski19 Jun 2014 1:25 UTC
22 points
13 comments1 min readLW link

Hal­loween cos­tume: Paperclipperer

Elo21 Oct 2017 6:33 UTC
6 points
0 comments2 min readLW link

Eth­i­cal dilem­mas for pa­per­clip maximizers

CronoDAS1 Aug 2011 5:31 UTC
14 points
25 comments1 min readLW link

Alter­na­tive uses of paperclips

taw8 Jan 2012 18:24 UTC
18 points
16 comments1 min readLW link

The Utility of Hu­man Atoms for the Paper­clip Maximizer

avturchin2 Feb 2018 10:06 UTC
3 points
19 comments3 min readLW link

Robopoca­lypse au­thor cites Yud­kowsky’s pa­per­clip scenario

CarlShulman17 Jul 2011 2:18 UTC
5 points
2 comments1 min readLW link

Tool for max­i­miz­ing pa­per­clips vs a pa­per­clip maximizer

private_messaging12 May 2012 7:38 UTC
3 points
23 comments1 min readLW link

Univer­sal Paperclips

Morendil13 Oct 2017 6:16 UTC
1 point
0 comments1 min readLW link
(decisionproblem.com)

Link: The Economist on Paper­clip Maximizers

Anders_H30 Jun 2016 12:40 UTC
8 points
3 comments1 min readLW link

Is a pa­per­clip­per bet­ter than noth­ing?

DataPacRat24 May 2013 19:34 UTC
10 points
116 comments1 min readLW link

Ex­trater­res­trial pa­per­clip maximizers

multifoliaterose8 Aug 2010 20:35 UTC
5 points
161 comments4 min readLW link

Why we should fear the Paper­clip­per [Link]

XiXiDu14 Feb 2011 19:24 UTC
5 points
20 comments1 min readLW link

The True Pri­soner’s Dilemma

Eliezer Yudkowsky3 Sep 2008 21:34 UTC
128 points
115 comments4 min readLW link

Mag­i­cal Categories

Eliezer Yudkowsky24 Aug 2008 19:51 UTC
59 points
132 comments9 min readLW link

De­bunk­ing Fal­la­cies in the The­ory of AI Motivation

[deleted]5 May 2015 2:46 UTC
0 points
349 comments22 min readLW link

Algo trad­ing is a cen­tral ex­am­ple of AI risk

Vanessa Kosoy28 Jul 2018 20:31 UTC
24 points
5 comments1 min readLW link
No comments.