It indeed seems deeply unnatural for a very smart AI to look at the human world from the outside, be able to replace it with whatever, and be like: “no, I’m not going to use these atoms and this negentropy/energy for anything else — this human world that is here by default is the best thing that could be here; in fact, I will make sure it has a lot of resources to flourish in the future”. It seems [deeply unnatural]/[extremely sharp] for anyone to have values like this. I think it’s unlikely that even humanity-after-developing-correctly-for-a-million-years would think like this if it encountered another Earth with a current-humanity-level alternate humanity on it.[1]
One approach to tackling this difficulty is to try to somehow make an AI that does this imo deeply unnatural thing anyway. But there is also the following alternative approach: to try to make it so that no one is judging the human world from the outside like this — i.e., so that it’s just the human world judging itself. The judgment “we are cool, we have lots of cool projects going on, and we definitely should avoid killing ourselves” is very natural; in particular, it is much more natural than the judgment the AI looking at the human world from the outside needs to make. I think this alternative path requires banning AGI.
One more alternative approach (that overlaps with the previous one): one can also hope to have humans flourish for a long time without any judgment that humans are very cool directly controlling local decision-making. Instead, we can try to set up local incentives so that goodness/humanness is promoted. This way, humans might be able to flourish even in a “hot mess” world. For this, it is crucial that humans and human institutions remain useful. So, this also requires banning AGI.
[1] Indeed, human civilizations have historically not treated less developed civilizations with much kindness.