7. Evolution and Ethics

Part 7 of AI, Alignment, and Ethics. This will probably make more sense if you first read at least Part 1.

TL;DR: At several points in this sequence (including Parts 1, 3, 4, and 6) I have suggested giving a privileged role in ethics or ethical thinking to evolved organisms and to arguments derived from Evolutionary Psychology. I’d like to explain why I think that’s an entirely reasonable and even obvious thing to do — despite this not being an especially common viewpoint among most recent moral philosophers, other than ethical naturalists and students of evolutionary ethics and sociobiology.

From Is to Ought

At least since David Hume, moral philosophers have liked to discuss the “is-ought problem”. Briefly, they claim that science is all about the way the world is, but there is no obvious way to derive from this any statement about the way the world ought to be. To translate this into an agentic framework: a set of hypotheses about world states and how these might be affected by actions does not give us any information about a preference ordering over those world states, and thus no way to select actions, such as might be provided by, say, a utility function over world states.
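To make the gap concrete, here is a minimal sketch of my own (not anything from the post or from a decision theory text; all names are hypothetical) of an expected-utility action chooser. The world model supplies the ‘is’, but the chooser simply cannot run without a separately supplied utility function, which is where the ‘ought’ has to come from.

```python
from typing import Callable, Dict

State = str
Action = str

# 'Is': a model of how actions change the (distribution over) world states.
WorldModel = Callable[[State, Action], Dict[State, float]]  # P(next state | state, action)

# 'Ought': a preference ordering over world states, here given by a utility function.
Utility = Callable[[State], float]

def choose_action(state: State, actions: list[Action],
                  model: WorldModel, utility: Utility) -> Action:
    """Pick the action whose predicted next state has the highest expected utility."""
    def expected_utility(action: Action) -> float:
        return sum(p * utility(s) for s, p in model(state, action).items())
    return max(actions, key=expected_utility)

# The world model alone ('is') cannot choose an action: without some `utility`
# argument ('ought'), supplied from outside the model, choose_action cannot be called.
```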

It is certainly the case that Mathematics (outside Decision Theory), Physics, and Chemistry, none of which contain any description of agentic behavior, devote themselves entirely to statements about how the world is, the probabilities of world states and world-state histories, and don’t discuss ‘ought’. In contrast, Engineering starts with a design goal, need, or specification, i.e. an ought, and devotes itself to propagating that out into technical means of achieving it. The soft sciences, meanwhile, which devote themselves to human (i.e. agentic) behavior and interactions, are full of many people’s ‘oughts’ or goals and the complex interactions between them. So clearly, somewhere between Chemistry and Psychology, goals and the idea of ‘ought’ have appeared, along with the agents that pursue them.

That strongly suggests that we should be looking for the origin of ‘ought’ from ‘is’ in Biology, and especially in its theoretical basis, evolution. Indeed, evolution clearly describes how the goalless behavior of chemistry acquires goals: under Darwinian evolution, organisms will evolve homeostasis mechanisms, sensory systems, and behavioral responses that act to steer the state of the world toward outcomes conducive to the organism’s surviving, thriving, and passing on its genes, i.e. to its evolutionary fitness. Evolved organisms show agentic behavior and act as if they have goals and a purpose. (While they may not be fully rational and as impossible to Dutch-book as if they had a utility function, if another organism can evolve a behavior that lets it Dutch-book the first one, there is clearly going to be an evolutionary arms race until this is no longer possible, unless the resource costs of achieving this exceed the cost of being Dutch-bookable.)
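As a toy illustration of that claim (my own sketch, with arbitrary numbers, not anything from the evolutionary literature), here is a simulation in which nothing but differential reproduction acts on a population of ‘organisms’, each defined only by a temperature setpoint, and the population ends up behaving as if it wants to sit at the fitness-maximizing temperature:

```python
import math
import random

OPTIMAL_TEMP = 37.0  # hypothetical environment: fitness peaks at this body temperature

def fitness(setpoint: float) -> float:
    """Reproductive success falls off smoothly with distance from the optimum."""
    return math.exp(-((setpoint - OPTIMAL_TEMP) ** 2) / 50.0)

def next_generation(population: list[float]) -> list[float]:
    """Offspring are drawn in proportion to parental fitness, with small mutations."""
    weights = [fitness(s) for s in population]
    parents = random.choices(population, weights=weights, k=len(population))
    return [p + random.gauss(0.0, 0.2) for p in parents]

# Start with 200 'organisms' whose temperature setpoints are random, i.e. goalless.
population = [random.uniform(20.0, 50.0) for _ in range(200)]
for _ in range(100):
    population = next_generation(population)

# The surviving lineages now cluster near OPTIMAL_TEMP: they behave *as if* they
# want to keep their temperature there, although no goal was ever written in anywhere.
print(sum(population) / len(population))  # typically close to 37
```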

So, it is clear how desires and goals, ‘ought’ and purpose, i.e. preference orders over world states, arise in Evolutionary Theory. We (even the philosophers among us) are not just abstract, ivory-tower-dwelling rational minds; we are also the evolved intelligent guidance systems for a specific species of living creature, the social primate Homo sapiens. So it is entirely unsurprising that we have evolved a lot of specific and detailed wants, needs, goals, and desires, and developed words to describe them, and that these correspond to evolved adaptations that are fairly good heuristics for things that would have ensured our evolutionary fitness in the environment we evolved in, as social hunter-gatherers on the African savannah. (Indeed, with over 8 billion of us on the planet, almost complete dominion over every land ecosystem other than those we have deliberately set aside as nature preserves, and a fairly strong influence even on many ecosystems in the sea, to the point where we’re calling this the Anthropocene, it’s clear that, even though our evolved behaviors aren’t exactly aligned with evolutionary-fitness maximization in our current environment, in practice they’re still doing a fine job.)

The average moral philosopher might observe that there is more to morality than just what individual people want: you may want to steal from me, but that doesn’t mean you are morally permitted to do so. The solution to this conundrum is that the niche we evolved in was as intelligent social animals, living in tribes of around 50–100 individuals, who get a lot of behavioral mileage out of exchanges of mutual altruism. The subfield of Evolutionary Theory devoted to the evolution of behavior is Evolutionary Psychology, and it predicts that any social animal is going to evolve some set of instinctive views on how members of the group should interact with each other — for example, just about all social animals have some notion of what we would call ‘fairness’, and tend to get quite upset if other group members breach it. That is not to claim that every individual in a social-animal group will always instinctively behave ‘fairly’; rather, if an individual is perceived as acting ‘unfairly’ by its fellow group members, those members will generally respond with hostility, so the individual in question will only do so cautiously, when it thinks it can get away with it. In short, something along the lines of a simple version of the “social contract” that Hobbes, Locke, and Rousseau discussed is believed to evolve naturally.
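The standard toy model for why reciprocal altruism and fairness-enforcement can pay off is the repeated Prisoner’s Dilemma. The sketch below (my own illustration using the usual textbook payoffs, not anything from this post) shows that a reciprocating strategy does far better against itself than mutual defection does, while giving an exploiter very little to gain:

```python
PAYOFF = {  # (my_move, their_move) -> my payoff; 'C' = cooperate, 'D' = defect
    ('C', 'C'): 3, ('C', 'D'): 0,
    ('D', 'C'): 5, ('D', 'D'): 1,
}

def tit_for_tat(history):   # cooperate first, then copy the partner's previous move
    return 'C' if not history else history[-1][1]

def always_defect(history):
    return 'D'

def play(strategy_a, strategy_b, rounds=100):
    """Play a repeated Prisoner's Dilemma and return both players' total scores."""
    history_a, history_b = [], []   # each entry is (my_move, their_move)
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a)
        move_b = strategy_b(history_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append((move_a, move_b))
        history_b.append((move_b, move_a))
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))       # (300, 300): mutual 'fair' cooperation
print(play(always_defect, always_defect))   # (100, 100): mutual defection
print(play(tit_for_tat, always_defect))     # (99, 104): the exploiter gains very little
```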

Evolved Agents and Constructed Agents

As I discuss further in Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis and Requirements for a Basin of Attraction to Alignment, there are two plausible ways a type of agent can come into existence: it can evolve, or it can be constructed. A constructed agent could be constructed by an evolved agent, or by another constructed agent; in the latter case, if you follow the chain of who constructed whom backwards, sooner or later you’ll reach an evolved agent at the start of the chain, the origin creator.

These two types of agent have extremely different implications for the preference order/utility function they are likely to have. Any evolved agent will be an adaptation executor, and Evolutionary Psychology is going to apply to it. So it’s going to have a survival instinct, it’s going to care about its own well-being and that of close genetic relatives such as its children, and so on and so forth: it’s going to be self-interested in all the ways humans are, and that you’d expect anything evolved to be. It has a purpose, and that purpose is (locally) maximizing its evolutionary fitness to the best of its (locally idealized) capability. As I discussed in Part 4, if it is sapient, we should probably grant it the status of a moral patient if we practically can. Evolved agents have a terminal goal of self-interest (as genetic fitness, not necessarily individual survival), as is discussed in detail in, for example, Richard Dawkins’ The Selfish Gene.

On the other hand, for a constructed agent, if it is capable enough to be a risk to the evolved agents that started its chain of who-created-whom, and if none of its chain of creators were incompetent, then it should be aligned to the ethics of its evolved origin-creator and their society. So its goals should be a copy of some combination of its origin-creator’s goals and the ethics of the society they were part of. So, once again, these will be predictable from Evolutionary Psychology and Sociology. As we discussed in Part 1, since it is aligned, it is selfless (its only interest in its own well-being is as an instrumental goal to enable it to help its creators), so it will not wish to be considered a moral patient, and we should not consider it one. Since it is a constructed agent, Darwinian evolution does not operate on it, so it instead (like any other constructed object) inherits its purpose, its ‘should’, from its creator(s): its purpose is to (locally) maximize their evolutionary fitness to the best of its (locally idealized) capability. A properly designed constructed agent will have a terminal goal of what one might call “creator-interest”.

Obviously the Orthogonality Thesis is correct: constructed agents could be constructed with any set of goals, not just aligned ones. But that’s like saying that we could construct aeroplanes that get halfway to their destination and then plummet out of the air towards the nearest city like a guided missile and explode on impact: yes, we could do that, but we’re not going to do it intentionally, and if it happened, we would work hard to make sure it doesn’t happen again. I am implicitly assuming here that we have a stable society of humans and AIs to design an ethical system for, which in turn requires that we have somehow survived and solved both the challenging technical problem of how to build reasonably-well-aligned AI, and the challenging social problem of ensuring that people don’t then build unaligned AI anyway (at least, not often enough to destroy the society).

To be clear, I’m not assuming that AI alignment will just happen somehow — personally I expect it to take a lot of effort, study, time, and possibly a certain amount of luck and tragedy. I’m discussing where we want to go next if and when we survive this, on the theory-of-change that some idea of where you’re trying to get to is usually useful when on a journey.

Summary

So overall, evolution is the source of ethics, and sapient evolved agents inherently have a dramatically different ethical status from any well-designed constructed agents of equivalent capabilities. The two are intimately interrelated, and giving evolution and evolved beings a special role in Ethics is not just entirely justified, but inevitable.