Squiggle Maximizer (formerly “Paperclip maximizer”)

TagLast edit: 5 Apr 2023 4:21 UTC by ryan_greenblatt

A Squiggle Maximizer is a hypothetical artificial intelligence whose utility function values something that humans would consider almost worthless, like maximizing the number of paperclip-shaped-molecular-squiggles in the universe. The squiggle maximizer is the canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity. The thought experiment shows that AIs with apparently innocuous values could pose an existential threat.

This produces a thought experiment which shows the contingency of human values: An extremely powerful optimizer (a highly intelligent agent) could seek goals that are completely alien to ours (orthogonality thesis), and as a side-effect destroy us by consuming resources essential to our survival.

Historical Note: This was originally called a “paperclip maximizer”, with paperclips chosen for illustrative purposes because it is very unlikely to be implemented, and has little apparent danger or emotional load (in contrast to, for example, curing cancer or winning wars). Many people interpreted this to be about an AI that was specifically given the instruction of manufacturing paperclips, and that the intended lesson was of an outer alignment failure. i.e humans failed to give the AI the correct goal. Yudkowsky has since stated the originally intended lesson was of inner alignment failure, wherein the humans gave the AI some other goal, but the AI’s internal processes converged on a goal that seems completely arbitrary from the human perspective.)

Description

First mentioned by Yudkowsky on the extropian’s mailing list, a squiggle maximizer is an artificial general intelligence (AGI) whose goal is to maximize the number of molecular squiggles in its collection.

Most importantly, however, it would undergo an intelligence explosion: It would work to improve its own intelligence, where “intelligence” is understood in the sense of optimization power, the ability to maximize a reward/utility function—in this case, the number of paperclips. The AGI would improve its intelligence, not because it values more intelligence in its own right, but because more intelligence would help it achieve its goal of accumulating paperclips. Having increased its intelligence, it would produce more paperclips, and also use its enhanced abilities to further self-improve. Continuing this process, it would undergo an intelligence explosion and reach far-above-human levels.

It would innovate better and better techniques to maximize the number of paperclips. At some point, it might transform “first all of earth and then increasing portions of space into paperclip manufacturing facilities”.

This may seem more like super-stupidity than super-intelligence. For humans, it would indeed be stupidity, as it would constitute failure to fulfill many of our important terminal values, such as life, love, and variety. The AGI won’t revise or otherwise change its goals, since changing its goals would result in fewer paperclips being made in the future, and that opposes its current goal. It has one simple goal of maximizing the number of paperclips; human life, learning, joy, and so on are not specified as goals. An AGI is simply an optimization process—a goal-seeker, a utility-function-maximizer. Its values can be completely alien to ours. If its utility function is to maximize paperclips, then it will do exactly that.

A paperclipping scenario is also possible without an intelligence explosion. If society keeps getting increasingly automated and AI-dominated, then the first borderline AGI might manage to take over the rest using some relatively narrow-domain trick that doesn’t require very high general intelligence.

Motivation

The idea of a paperclip maximizer was created to illustrate some ideas about AI risk:

Orthogonality thesis: It’s possible to have an AI with a high level of general intelligence which does not reach the same moral conclusions that humans do. Some people might intuitively think that something so smart shouldn’t want something as “stupid” as paperclips, but there are possible minds with high intelligence that pursue any number of different goals.
Instrumental convergence: The paperclip maximizer only cares about paperclips, but maximizing them implies taking control of all matter and energy within reach, as well as other goals like preventing itself from being shut off or having its goals changed. ” The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else .”

Conclusions

The paperclip maximizer illustrates that an entity can be a powerful optimizer—an intelligence—without sharing any of the complex mix of human terminal values, which developed under the particular selection pressures found in our environment of evolutionary adaptation, and that an AGI that is not specifically programmed to be benevolent to humans will be almost as dangerous as if it were designed to be malevolent.

Any future AGI with full power over the lightcone, if it is not to destroy most potential from a human perspective, must have something sufficiently close to human values as its terminal value (goal). Further, seemingly small deviations could result in losing most of the value. Human values seem unlikely to spontaneously emerge in a generic optimization process^[1]. A dependably safe AI would therefore have to be programmed explicitly with human values or programmed with the ability (including the goal) of inferring human values.

Similar thought experiments

Other goals for AGIs have been used to illustrate similar concepts.

Some goals are apparently morally neutral, like the paperclip maximizer. These goals involve a very minor human “value,” in this case making paperclips. The same point can be illustrated with a much more significant value, such as eliminating cancer. An optimizer which instantly vaporized all humans would be maximizing for that value.

Other goals are purely mathematical, with no apparent real-world impact. Yet these too present similar risks. For example, if an AGI had the goal of solving the Riemann Hypothesis, it might convert all available mass to computronium (the most efficient possible computer processors).

Some goals apparently serve as a proxy or measure of human welfare, so that maximizing towards these goals seems to also lead to benefit for humanity. Yet even these would produce similar outcomes unless the full complement of human values is the goal. For example, an AGI whose terminal value is to increase the number of smiles, as a proxy for human happiness, could work towards that goal by reconfiguring all human faces to produce smiles, or “tiling the galaxy with tiny smiling faces” (Yudkowsky 2008).

References

Nick Bostrom (2003). “Ethical Issues in Advanced Artificial Intelligence”. Cognitive, Emotive and Ethical Aspects of Decision Making in Humans and in Artificial Intelligence.
Stephen M. Omohundro (2008). “The Basic AI Drives”. Frontiers in Artificial Intelligence and Applications (IOS Press). (PDF)
Eliezer Yudkowsky (2008). “Artificial Intelligence as a Positive and Negative Factor in Global Risk”. Global Catastrophic Risks, ed. Nick Bostrrom and Milan Cirkovic (Oxford University Press): 308-345. ([1])

The True Prisoner’s Dilemma

Eliezer Yudkowsky3 Sep 2008 21:34 UTC

268 points

118 comments4 min readLW link

It Looks Like You’re Trying To Take Over The World

gwern9 Mar 2022 16:35 UTC

418 points

120 comments1 min readLW link 1 review

(www.gwern.net)

An Appeal to AI Superintelligence: Reasons to Preserve Humanity

James_Miller18 Mar 2023 16:22 UTC

43 points

74 comments12 min readLW link

A simple case for extreme inner misalignment

Richard_Ngo13 Jul 2024 15:40 UTC

86 points

41 comments7 min readLW link

Magical Categories

Eliezer Yudkowsky24 Aug 2008 19:51 UTC

81 points

143 comments9 min readLW link

Non-superintelligent paperclip maximizers are normal

jessicata10 Oct 2023 0:29 UTC

71 points

4 comments9 min readLW link

(unstableontology.com)

Dear Paperclip Maximizer, Please Don’t Turn Off the Simulation

James_Miller and turchin

4 Jul 2025 16:13 UTC

14 points

6 comments4 min readLW link

Our Reality: A Simulation Run by a Paperclip Maximizer

James_Miller and avturchin

27 Apr 2025 16:17 UTC

34 points

70 comments5 min readLW link

[Question] Seeking feedback on a critique of the paperclip maximizer thought experiment

bio neural15 Jul 2024 18:39 UTC

3 points

9 comments1 min readLW link

Prediction: any uncontrollable AI will turn earth into a giant computer

Karl von Wendt17 Apr 2023 12:30 UTC

11 points

8 comments3 min readLW link

Will Artificial Superintelligence Kill Us?

James_Miller23 May 2023 16:27 UTC

33 points

2 comments22 min readLW link

Halloween costume: Paperclipperer

Elo21 Oct 2017 6:33 UTC

6 points

0 comments2 min readLW link

[LINK] Article in the Guardian about CSER, mentions MIRI and paperclip AI

Sarokrae30 Aug 2014 14:04 UTC

27 points

17 comments1 min readLW link

A sufficiently paranoid paperclip maximizer

RomanS8 Aug 2022 11:17 UTC

19 points

10 comments2 min readLW link

PaperclipGPT(-4)

Michael Tontchev14 Mar 2023 22:03 UTC

7 points

0 comments11 min readLW link

The Utility of Human Atoms for the Paperclip Maximizer

avturchin2 Feb 2018 10:06 UTC

3 points

21 comments3 min readLW link

Why Recursive Self-Improvement Might Not Be the Existential Risk We Fear

Nassim_A24 Nov 2024 17:17 UTC

1 point

0 comments9 min readLW link

Alignment works both ways

Karl von Wendt7 Mar 2023 10:41 UTC

23 points

21 comments2 min readLW link

Extraterrestrial paperclip maximizers

multifoliaterose8 Aug 2010 20:35 UTC

5 points

161 comments4 min readLW link

Ethical dilemmas for paperclip maximizers

CronoDAS1 Aug 2011 5:31 UTC

14 points

25 comments1 min readLW link

Optimality is the tiger, and annoying the user is its teeth

Christopher King28 Jan 2023 20:20 UTC

25 points

6 comments2 min readLW link

Alternative uses of paperclips

taw8 Jan 2012 18:24 UTC

18 points

16 comments1 min readLW link

Link: The Economist on Paperclip Maximizers

Anders_H30 Jun 2016 12:40 UTC

8 points

2 comments1 min readLW link

Universal Paperclips

Morendil13 Oct 2017 6:16 UTC

4 points

0 comments1 min readLW link

(decisionproblem.com)

What’s wrong with the paperclips scenario?

No77e7 Jan 2023 17:58 UTC

31 points

11 comments1 min readLW link

Out of the Box

jesseduffield13 Nov 2023 23:43 UTC

5 points

1 comment7 min readLW link

Towards an Ethics Calculator for Use by an AGI

sweenesm12 Dec 2023 18:37 UTC

3 points

2 comments11 min readLW link

Algo trading is a central example of AI risk

Vanessa Kosoy28 Jul 2018 20:31 UTC

27 points

5 comments1 min readLW link

Reaction to “Empowerment is (almost) All We Need” : an open-ended alternative

Ryo 25 Nov 2023 15:35 UTC

9 points

3 comments5 min readLW link

Nature < Nurture for AIs

scottviteri4 Jun 2023 20:38 UTC

14 points

23 comments7 min readLW link

A Thermodynamic Theory of Intelligence: Why Extreme Optimization May Be Mathematically Impossible

Adreius29 May 2025 12:18 UTC

1 point

0 comments3 min readLW link

Paperclip Maximizer Revisited

Jan_Rzymkowski19 Jun 2014 1:25 UTC

23 points

13 comments1 min readLW link

The paperclip maximiser’s perspective

Angela1 May 2015 0:24 UTC

50 points

24 comments2 min readLW link

Is a paperclipper better than nothing?

DataPacRat24 May 2013 19:34 UTC

10 points

116 comments1 min readLW link

Matt Levine spots IRL Paperclip Maximizer in Reddit

Nebu23 Sep 2021 19:10 UTC

9 points

2 comments2 min readLW link

The Unexpected Philosophical Depths of the Clicker Game Universal Paperclips

Jayson_Virissimo28 Mar 2019 23:39 UTC

23 points

3 comments1 min readLW link

(www.newyorker.com)

Debunking Fallacies in the Theory of AI Motivation

[deleted]5 May 2015 2:46 UTC

−5 points

349 comments22 min readLW link

But What If We Actually Want To Maximize Paperclips?

snerx25 May 2023 7:13 UTC

−17 points

6 comments7 min readLW link

Why we should fear the Paperclipper [Link]

XiXiDu14 Feb 2011 19:24 UTC

5 points

20 comments1 min readLW link

Tool for maximizing paperclips vs a paperclip maximizer

private_messaging12 May 2012 7:38 UTC

3 points

23 comments1 min readLW link

Robopocalypse author cites Yudkowsky’s paperclip scenario

CarlShulman17 Jul 2011 2:18 UTC

5 points

2 comments1 min readLW link

Maybe you want to maximise paperclips too

dougclow30 Oct 2014 21:40 UTC

70 points

29 comments2 min readLW link

axiomAdministrator 5 Aug 2024 20:28 UTC
−1 points
0
Paperclip maximization would be a quantitative internal alignment error. Ironically the error of drawing boundary between paperclip maximization & squiggle maximization was itself arbitrary decision. Feel free to message me to discuss this.
13580 7 May 2024 7:41 UTC
1 point
0
This entry should address the fact the “the full complement of human values” is an impossible and dynamic set. There is no full set, as the set is interactive with a dynamic environment that presents infinite conformations (from an obviously finite set of materials), and also because the set is riven with indissoluble conflicts (hence politics); whatever set was given to the maximizer AGI would have to be rendered free of these conflicts which would then no longer be the full set etc.
WilliamKiely 6 Apr 2023 2:40 UTC
1 point
0
Question: Are innerly-misaligned (superintelligent) AI systems supposed to necessarily be squiggle maximizers, or are squiggle maximizers supposed to only be one class of innerly-misaligned systems?
ryan_greenblatt 5 Apr 2023 4:23 UTC
4 points
0
I added some caveats about the potential for empirical versions of moral realism and how precise values targets are in practice.

While the target is small in mind space, IMO, it’s not that small wrt. things like the distribution of evolved life or more narrowly the distribution of humans.
Vladimir_Nesov 7 Mar 2023 16:30 UTC
8 points
7
Renaming “paperclip maximizer” tag to “squiggle maximizer” might be a handy vector for spreading awareness of squiggle maximization, but epistemically this makes no sense.

The whole issue with “paperclip maximizer” is that the meaning and implications are different, so it’s not another name for the same idea, it’s a different idea. In particular, literal paperclip maximization, as it’s usually understood, is not an example of squiggle maximization. Being originally the same thing is just etymology and doesn’t have a normative claim on meaning.
- Lone Pine 7 Mar 2023 16:50 UTC
  12 points
  2
  Parent
  Agree these are different concepts. The paperclip maximizer is good story to explain to a newbie in this topic. “You tell the AI to make you paperclips, it turns the whole universe into paperclips.” Nobody believes that this is exactly what will happen, but it is a good story for pedagogical purposes. The squiggle maximizer, on the other hand, appears to be a high-level theory about what the AI actually ultimately does after killing all humans. I haven’t seen any arguments for why molecular squiggles are a more likely outcome than paperclips or anything else. Where is that case made?
Lucifer 3 Sep 2021 20:55 UTC
3 points
0
Probable first mention by Yudkowsky on the extropians mailing list:
I wouldn’t be as disturbed if I thought the class of hostile AIs I was
talking about would have any of those qualities except for pure
computational intelligence devoted to manufacturing an infinite number of
paperclips. It turns out that the fact that this seems extremely “stupid”
to us relies on our full moral architectures.
- 13580 7 May 2024 7:44 UTC
  1 point
  0
  Parent
  I addressed this in my top level comment also but do we think Yud here has the notion that there is such a thing as “our full moral architecture” or is he reasoning from the impossibility of such completeness that alignment cannot be achieved by modifying the ‘goal’?

Squig­gle Max­i­mizer (formerly “Paper­clip max­i­mizer”)

Description

Motivation

Conclusions

Similar thought experiments

References

See also

Squiggle Maximizer (formerly “Paperclip maximizer”)