AI, Alignment, and Ethics

Posts on a topic I’ve been thinking about for fifteen years: the intersection of AI, Alignment, and Ethics.

We spend a lot of time on Less Wrong/the Alignment Forum discussing "how can we manage to aim an AI at a goal?" That is a vital question, one we as a civilization urgently need to answer, but as soon as we have answered it, we will need to pick what to aim the AI at. Unfortunately, in some plausible scenarios we only get one shot at that choice, or at least at some important framing aspects of it. Our default answer, since about 2010, has been "something like Utilitarianism or Coherent Extrapolated Volition; let's not start arguing over political details". I think those are actually fairly good answers as far as they go, but they don't settle some very basic, not-exactly-political decisions, such as "the utility or CEV of what set of entities, exactly?" In Ethics, figuring out how to make a decision like this in anything like a rational, principled way is surprisingly challenging: you can't use any ethical system or guideline to direct your reasoning without arguing in a circle, since the choice of ethical system is exactly what is at issue. So we need to ground this process in something outside Ethical Philosophy.

I’m concerned that this is a decision we may urgently need to make more progress on, yet it has been deliberately set aside on Less Wrong/the Alignment Forum for well over a decade. If we don’t make good progress on it, sooner or later some existing human power structure or political process may be in a position to impose its current answer via AGI, quite possibly for the rest of human history, and likely in a less-than-rational way. I believe I have made some progress on these questions: on deconfusing how one can make such decisions and what to ground them on (such as evolutionary psychology and sociology), and then on applying that grounding to the definitional question of which set of entities AI should be optimizing on behalf of. Some of my results were rather counterintuitive to me when I first reached them.

Even if I’m wrong or still confused (sadly an all-too-common situation when trying to think about Ethics), I sincerely hope that this will at least restart the discussion.

Table of Contents

1. A Sense of Fairness: Deconfusing Ethics: How to make reasoned design decisions between ethical systems in the context of a particular human society, the importance of evolution and existential risks, and an analysis of what this means for the role of aligned AIs in society.

2. AIs as Economic Agents: Some further consequences for the role of aligned AIs, this time in the economic sphere, and why both corporations and non-profits present a potential problem.

3. Uploading: Humans aren’t well aligned to other humans, and this has dangerous implications for uploaded digital humans.

4. A Moral Case for Evolved-Sapience-Chauvinism: A possible principled answer a society could use for “which set of entities should AI be optimizing on behalf of?”

5. Moral Value for Sentient Animals? Alas, Not Yet: Extending the set of entities AI is optimizing on behalf of to include all sentient animals turns out to be extremely hard, so, sadly, for now we need to take a more pragmatic approach to animal rights.

6. The Mutable Values Problem in Value Learning and CEV: The human values of a society can change over time, dramatically so given genetic engineering and cyborging, and our AI will inevitably have a great deal of influence on this process. That makes the evolution of the entire system unstably under-determined. I explore this problem in detail, showing that the combined system is an inherently computationally intractable dynamical system, and thus unpredictable over the long term. I outline and critique various obvious approaches to reducing this instability (with illustrative explorations of the ethics of psychopathy and war), and tentatively sketch a possible stabilizing solution, one that doesn’t simply impose our current values for the rest of history. Interestingly, this has rather different consequences for genetic engineering than for cyborging.

7. Evolution and Ethics: How evolution solves the “Ought-from-Is” problem in ethical philosophy, and how that justifies the privileged role given to evolutionary arguments and to evolved beings in several of the preceding posts.
