Aligned AI is dual-use technology

Humans are mostly selfish most of the time. Yes, many of us dislike hurting others, are reliable friends and trading partners, and care genuinely about those we have personal relationships with. Despite this, spontaneous strategic altruism towards strangers is extremely rare. The median American directs exactly $0 to global poverty interventions, and that remains true whether you limit the sample to Americans who make ten, fifty, a hundred, or a thousand times as much money as Nigerians.

Some people hope that with enough tech development we will reach a “post-scarcity” regime where people have so much money that there is a global commons of resources everyone can access largely to their heart’s content. But this has always sounded to me like a 1023 AD peasant hoping that in 2023, the French will be so rich that no one outside France will die of a preventable disease. There will always be more for people with money to consume; even in the limit of global wealth, the free energy and resources a person could devote to helping poor people or defending them from abuse could also be devoted to extending their own lifespan before heat death.

So, in keeping with this long tradition of human selfishness, it seems likely that if we succeed at aligning AI, the vast, vast majority of its output will get directed toward satisfying the preferences and values of the people controlling it (or possessing leverage over its continued operation) - not the “CEV of all humans”, let alone the “CEV of all extant moral persons”. A person deciding to use their GPUs to optimize for humanity’s betterment would be like someone hiring a maid for humanity instead of for their own home; it’s simply not what you expect people to do in practice, effective altruists aside. In a “polytheistic” future where at least a dozen people share large amounts of control, I expect wielding that control will involve:

  • Extracting any significant remaining resources from the rest of humanity, wherever people are vulnerable to manipulation or coercion.

  • Creating new people of moral value to serve as romantic partners, friends, and social subordinates.

  • Getting admiration, prestige, and respect from legacy humans, possibly to extreme degrees, possibly in ways we would dislike upon reflection.

  • Engineering new worlds where they can “help” or “save” others, depending on the operational details of their ethics.

In this scenario, the vast majority of beings of moral worth spread across the galaxy are not the people the AIs are working to help. They’re the things that surround those people, existing because the oligarchs in charge enjoy their company. And it doesn’t take a genius to see why that might be worse overall than just paperclipping this corner of the cosmos, depending on who’s in charge, what their preferences for “company” are, how they react to extreme power, and how much they care about the internal psychology of their peers.