RogerDearnaley comments on Constitutional AI Alignment

RogerDearnaley 28 May 2026 17:09 UTC
2 points
Humans’ values vary significantly, but mostly have a shared core, which consists of evolutionary adaptations that approximate their relative inclusive genetic fitness (in a suitable environment). This includes most humans wanting to have descendants, and many of them wanting those descendants to be genetically related to them, i.e. to actually be their descendants. In your proposed failure mode:
sterilize all the humans, but make them immortal, and persuade them they need the AI to solve the fertility problem (and so not shut it down), then mass-produce genetically-modified children with superior genomes (high humans?), and easier-to-satisfy AI-friendly preferences, but no clear kin relations.
the “no kin relations” piece of that is what sets everyone’s relative genetic fitness to zero, and it will also upset most people. It’s thus just a non-starter from any evolutionary viewpoint. I’m assuming you’re not that familiar with evolutionary terminology: to anyone who is, the constitution I proposed was intended to clearly say “don’t do stuff like that”.

Now, you could overcome this issue by modifying that to “encourage humans to get their children genetically enhanced in certain ways, but still otherwise related to them”. Adhering to the x-risk-motivated requirement in the constitution that humans stay well adapted to hunter-gatherer, agricultural, industrial, etc. lifestyles would also heavily constrain your unspecified “easier-to-satisfy AI-friendly preferences”: a lot of things about human preferences are carefully adapted to our previous environments, and it’s pretty hard to change them without making them less well adapted.

However, once you’ve made those changes, what you have is basically just AI-enabled-and-directed genetic engineering, and whether or not it’s then still a failure mode is now a lot less evident than in your initial proposal: it would depend on the details of how you’re modifying the humans’ preferences without making them any less well adapted to other environments.

Also please note the sections around:
Currently for the cultural component of human values its evolution is strongly stabilized by the genetic component, but as technological means for changing humans’ behavior and motivations increase, and especially if germ-line genetic engineering of humans becomes common, this stabilization seems likely to decrease dramatically.
which basically say that the AI altering humans’ motivations is inherently problematic, and that I, as a constitutional-idea proposer, am not entirely sure what the correct solution is: this needs more thought, so for the moment it would be best for AI to be cautious, especially around genetic modifications to them (as opposed to replacing the human race wholesale with a reengineered version). So you’re basically proposing a failure mode right in the middle of my section that says “Still Under Construction”, which is valid; however, a hole in the parts I’m more confident of would update me more.