Superintelligence 8: Cognitive superpowers

This is part of a weekly reading group on Nick Bostrom’s book, Superintelligence. For more information about the group and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI’s reading guide.


Welcome. This week we discuss the eighth section in the reading guide: Cognitive Superpowers. This corresponds to Chapter 6.

This post summarizes the section, offers a few relevant notes, and suggests ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable (and where I remember), page numbers indicate the rough part of the chapter that is most relevant (not necessarily the part being cited for a specific claim).

Reading: Chapter 6


Summary

  1. AI agents might have very different skill profiles.

  2. An AI with a few narrow skills could use them to produce a variety of other skills, e.g. strong AI research skills might allow an AI to build its own social skills.

  3. ‘Superpowers’ that might be particularly important for an AI that wants to take control of the world include:

    1. Intelligence amplification: for bootstrapping its own intelligence

    2. Strategizing: for achieving distant goals and overcoming opposition

    3. Social manipulation: for escaping human control, getting support, and encouraging desired courses of action

    4. Hacking: for stealing hardware, money and infrastructure; for escaping human control

    5. Technology research: for creating military force, surveillance, or space transport

    6. Economic productivity: for making money to spend on taking over the world

  4. These ‘superpowers’ are relative to other nearby agents; Bostrom means them to be super only if they substantially exceed the combined capabilities of the rest of the global civilization.

  5. A takeover scenario might go like this:

    1. Pre-criticality: researchers make a seed-AI, which becomes increasingly helpful at improving itself

    2. Recursive self-improvement: the seed-AI becomes the main force improving itself and brings about an intelligence explosion, perhaps developing all of the superpowers it didn’t already have.

    3. Covert preparation: the AI devises a robust long-term plan, pretends to be nice, and escapes from human control if need be.

    4. Overt implementation: the AI goes ahead with its plan, perhaps killing the humans at the outset to remove opposition.

  6. Wise Singleton Sustainability Threshold (WSST): a capability set exceeds this iff a wise singleton with that capability set would be able to take over much of the accessible universe. ‘Wise’ here means being patient and savvy about existential risks, ‘singleton’ means being internally coordinated and having no opponents.

  7. The WSST appears to be low; e.g. our own intelligence is sufficient, as would be various skill sets that are strong in only a few narrow areas.

  8. The cosmic endowment (what we could do with the matter and energy that might ultimately be available if we colonized space) is at least about 10^85 computational operations, equivalent to roughly 10^58 emulated human lives (a quick check of this arithmetic is just below).
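
To see how those two figures fit together, here is the arithmetic in a couple of lines of Python, using only the numbers quoted above (the division, and hence the implied cost per emulated life, is my own inference rather than a figure from the book):

```python
# Back-of-envelope check using only the two figures quoted in the summary.
cosmic_endowment_ops = 1e85   # lower-bound estimate of usable computational operations
emulated_lives = 1e58         # the equivalent number of emulated human lives quoted above

ops_per_life = cosmic_endowment_ops / emulated_lives
print(f"Implied cost per emulated life: {ops_per_life:.0e} operations")  # ~1e27
```

That is, the two figures are consistent with an emulated human life costing on the order of 10^27 operations.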

Another view

Bostrom starts the chapter claiming that humans’ dominant position comes from their slightly expanded set of cognitive functions relative to other animals. Computer scientist Ernest Davis criticizes this claim in a recent review of Superintelligence:

The assumption that a large gain in intelligence would necessarily entail a correspondingly large increase in power. Bostrom points out that what he calls a comparatively small increase in brain size and complexity resulted in mankind’s spectacular gain in physical power. But he ignores the fact that the much larger increase in brain size and complexity that preceded the appearance in man had no such effect. He says that the relation of a supercomputer to man will be like the relation of a man to a mouse, rather than like the relation of Einstein to the rest of us; but what if it is like the relation of an elephant to a mouse?

Notes

1. How does this model of AIs with unique bundles of ‘superpowers’ fit with the story we have heard so far?
Earlier it seemed we were just talking about a single level of intelligence that was growing, whereas now it seems there are potentially many distinct cognitive skills that might need to be developed. Does our argument so far still work if an agent has a variety of different sorts of intelligence to improve?
If you recall, the main argument so far was that AI might be easy (have ‘low recalcitrance’), mostly because there is a lot of hardware and content sitting around and the algorithms might happen to be easy. More effort (‘optimization power’) would then be spent on AI as it became evidently important, and much more effort again once AI became a large source of labor itself. This was all taken to suggest that AI might progress very fast from human-level to superhuman-level, so that one AI agent might get far ahead before anyone else caught up, and thus be in a position to seize power.
It seems to me that this argument works a bit less well with a cluster of skills than one central important skill, though it is a matter of degree and the argument was only qualitative to begin with.
It is less likely that AI algorithms will happen to be especially easy if a lot of different algorithms are needed. Also, if different cognitive skills are developed at somewhat different times, then it’s harder to imagine a sudden jump when a fully capable AI suddenly reads the whole internet or suddenly becomes a hugely more valuable use for hardware than anything being run already. Then if there are many different projects needed to make an AI smarter in different ways, the extra effort (brought first by human optimism and then by self-improving AI) must be divided between those projects. If a giant AI could dedicate its efforts to improving some central feature that would improve all of its future efforts (like ‘intelligence’), it would do much better than if it had to devote one thousandth of its efforts to each of a thousand different sub-skills, each of which is only relevant in a few niche circumstances (a toy model of this contrast is sketched below). Overall, it seems AI should progress more slowly if its success is driven by many distinct dedicated skills.
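Here is a minimal toy model of that contrast (my own illustration, not a model from the book): total optimization power each round is proportional to current average capability, and is either reinvested in one central skill or split evenly across many independent sub-skills.

```python
def average_capability(n_skills, steps=100, k=0.1):
    """Average skill level after `steps` rounds of self-improvement.

    Each round, total optimization power is k times the current average
    capability; it is split evenly across n_skills sub-skills, and improving
    one sub-skill does nothing for the others.
    """
    skills = [1.0] * n_skills
    for _ in range(steps):
        power = k * sum(skills) / n_skills   # effort available this round
        share = power / n_skills             # each sub-skill's equal slice
        skills = [s + share for s in skills]
    return sum(skills) / n_skills

print(average_capability(1))     # one central skill: ~1.1**100, about 13,800
print(average_capability(1000))  # a thousand niche sub-skills: about 1.01
```

In this toy version growth is still exponential either way, but the growth rate is divided by the number of sub-skills, which is roughly the qualitative point made above.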
2. The ‘intelligence amplification’ superpower seems much more important than the others. It directly leads to an intelligence explosion—a key reason we have seen so far to expect anything exciting to happen with AI—while several others just allow one-off grabbing of resources (e.g. social manipulation and hacking). Note that this suggests an intelligence explosion could happen with only this superpower, well before an AI appeared to be human-level.
3. Box 6 outlines a specific AI takeover scenario. A bunch of LessWrongers thought about other possibilities in this post.
4. Bostrom mentions that social manipulation could allow a ‘boxed’ AI to persuade its gatekeepers to let it out. Some humans have tried to demonstrate that this is a serious hazard by simulating the interaction using only an intelligent human in the place of the AI, in the ‘AI box experiment’. Apparently in both ‘official’ efforts the AI escaped, though there have been other trials where the human won.
5. How to measure intelligence
Bostrom pointed to some efforts to design more general intelligence metrics:
Legg: intelligence is measured in terms of reward earned across all reward-summable environments, weighted by the complexity of the environment (written out more formally below).
Hibbard: intelligence is measured in terms of the hardest environment you can pass, in a hierarchy of increasingly hard environments.
Dowe and Hernández-Orallo have several papers on the topic, and survey some other efforts. I haven’t looked at them closely enough to summarize.
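For reference, Legg’s measure (developed with Hutter) is, as I understand it, usually written along these lines:

$$\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}$$

where E is the set of computable reward-summable environments, K(μ) is the Kolmogorov complexity of the environment μ, and V^π_μ is the expected total reward the agent π earns in μ; simpler environments therefore get exponentially more weight.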
The Turing Test is the most famous test of machine intelligence. However, it only tests whether a machine is at a specific level, so it isn’t great for fine-grained measurement at other levels of intelligence. It is also often misunderstood to measure merely whether a machine can conduct a normal chat like a human, rather than whether it can respond as capably as a human to anything you might ask it.
For some specific cognitive skills, there are other measures already, e.g. ‘economic productivity’ can be measured crudely in terms of profits made. Others seem like they could be developed without too much difficulty, e.g. social manipulation could be measured in terms of the probability of succeeding at manipulation tasks. Such a test doesn’t exist as far as I know, but it doesn’t seem prohibitively difficult to make (a rough sketch of what scoring one might look like is below).
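To be concrete, here is one hypothetical way such a test could be scored (the task names, weights, and numbers are all made up for illustration): run a battery of standardized manipulation tasks against human subjects and report a (possibly difficulty-weighted) success rate.

```python
def manipulation_score(results, weights=None):
    """Estimate manipulation ability from a battery of tasks.

    results: {task_name: (successes, attempts)}
    weights: optional {task_name: difficulty_weight}; equal weights by default.
    """
    weights = weights or {name: 1.0 for name in results}
    total_weight = sum(weights[name] for name in results)
    weighted = sum(weights[name] * successes / attempts
                   for name, (successes, attempts) in results.items())
    return weighted / total_weight

# Entirely made-up example numbers, purely to show the shape of the test.
example = {
    "persuade_gatekeeper_to_open_box": (2, 5),
    "solicit_donation": (30, 100),
    "extract_password_by_pretexting": (10, 40),
}
print(f"Estimated success probability: {manipulation_score(example):.2f}")
```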
6. Will we be able to colonize the stars?
Nick Beckstead looked into it recently. Summary: probably.

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, almost entirely taken from Luke Muehlhauser’s list, without my looking into them further.

  1. Try to develop metrics for specific important cognitive abilities, including general intelligence. Build on the ideas of Legg, Yudkowsky, Goertzel, Hernández-Orallo & Dowe, etc.

  2. What is the construct validity of non-anthropomorphic intelligence measures? In other words, are there convergently instrumental prediction and planning algorithms? E.g. does one tend to get agents that are good at predicting economies but not astronomical events? Or do self-modifying agents in a competitive environment tend to converge toward a specific stable attractor in general intelligence space?

  3. Scenario analysis: What are some concrete AI paths to influence over world affairs? See project guide here.

  4. How much of humanity’s cosmic endowment can we plausibly make productive use of given AGI? One way to explore this question is via various follow-ups to Armstrong & Sandberg (2013). Sandberg lists several potential follow-up studies in this interview, for example (1) get more precise measurements of the distribution of large particles in interstellar and intergalactic space, and (2) analyze how well different long-term storable energy sources scale. See Beckstead (2014).

If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about the orthogonality of intelligence and goals, section 9. To prepare, read The relation between intelligence and motivation from Chapter 7. The discussion will go live at 6pm Pacific time next Monday November 10. Sign up to be notified here.