Strategic implications of AIs’ ability to coordinate at low cost, for example by merging
It seems likely to me that AIs will be able to coordinate with each other much more easily (i.e., at lower cost and greater scale) than humans currently can, for example by merging into coherent unified agents by combining their utility functions. This has been discussed at least since 2009, but I’m not sure its implications have been widely recognized. In this post I talk about two such implications that occurred to me relatively recently.
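To make "combining their utility functions" more concrete, here is a minimal toy sketch (my own illustration, not from the post or the 2009 discussion): two agents with utilities over a shared set of outcomes merge into a single agent that maximizes a weighted combination of their utilities, with the weights standing in for relative bargaining power. The outcome names and numbers are made up for illustration.

# Toy sketch of "merging by combining utility functions" (illustrative only).
# Two agents agree to act as one agent that maximizes a weighted sum of
# their individual utilities; the weights reflect relative bargaining power.

outcomes = ["expand_compute", "trade_with_humans", "go_to_war"]

u_a = {"expand_compute": 10, "trade_with_humans": 6, "go_to_war": 2}
u_b = {"expand_compute": 3, "trade_with_humans": 8, "go_to_war": 7}

def merged_utility(outcome, w_a=0.5, w_b=0.5):
    """Utility of the merged agent: a weighted combination of both parents'."""
    return w_a * u_a[outcome] + w_b * u_b[outcome]

# The merged agent simply picks whatever maximizes the combined utility,
# so neither parent needs laws or external enforcement against the other.
best = max(outcomes, key=merged_utility)
print(best)  # -> "trade_with_humans" with equal weights

The point of the sketch is only that, once the merge is agreed, the conflict between the two original agents disappears by construction; how real AIs would negotiate the weights (or whether merging would really be this clean) is exactly the open question.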
I was recently reminded of this quote from Robin Hanson’s Prefer Law To Values:
The later era when robots are vastly more capable than people should be much like the case of choosing a nation in which to retire. In this case we don’t expect to have much in the way of skills to offer, so we mostly care that they are law-abiding enough to respect our property rights. If they use the same law to keep the peace among themselves as they use to keep the peace with us, we could have a long and prosperous future in whatever weird world they conjure. In such a vast rich universe our “retirement income” should buy a comfortable if not central place for humans to watch it all in wonder.
Robin argued that this implies we should work to make it more likely that our current institutions, like laws, will survive into the AI era. But (aside from the problem that we're most likely still incurring astronomical waste even if many humans survive "in retirement"), assuming that AIs will have the ability to coordinate amongst themselves by doing something like merging their utility functions, they will have no reason to use laws (much less "the same laws") to keep the peace among themselves. So the first implication is that, to the extent that AIs are likely to have this ability, working in the direction Robin suggested would likely be futile.
The second implication is that AI safety/alignment approaches that aim to preserve an AI’s competitiveness must also preserve its ability to coordinate with other AIs, since that is likely an important part of its competitiveness. For example, making an AI corrigible in the sense of allowing a human to shut it (and its successors/subagents) down or change how it functions would seemingly make it impossible for this AI to merge with another AI that is not corrigible, or not corrigible in the same way. (I’ve mentioned this a number of times in previous comments, as a reason why I’m pessimistic about specific approaches, but I’m not sure if others have picked up on it, or agree with it, as a general concern, which partly motivates this post.)
Questions: Do you agree AIs are likely to have the ability to coordinate with each other at low cost? What other implications does this have, especially for our strategies for reducing x-risk?
Nominating along with its sequel. Makes an important point about competitiveness and coordination.
While I think this post isn’t the best writeup of this topic I can imagine, I think it makes a really important point quite succinctly, and is one that I have brought up many times in arguments around takeoff speeds and risk scenarios since this post came out.