Sanctuary for Humans

TL;DR: If we succeed at AI safety, humans will probably decide the future of the universe, and we currently have strong self-preservation incentives to choose a future populated by humans. Committing to provide sanctuary for all currently living humans would make our decision-making process less biased, at the cost of one planet or solar system, which I think is a good trade.

I think there are two main ways the future could go:

  1. The universe is populated by biological humans very similar to the ones alive today, and not much else (a “Star Trek”-type future)

  2. The universe is populated by something very weird (digital humans, posthumans, computronium, hedonium, “paperclips”)

It is possible that a specific version of the second scenario is “better” (whatever that means) than the first scenario. Getting this right is extremely important, as the fate of the entire universe literally depends on it.

I think we, the humans currently alive, are unlikely to let ourselves be replaced (or modified) even if on some level we think (or think we should think) that this would be better, because we have strong incentives to converge on value systems that lead to our survival. It’s hard to reason with a gun pointed at your head. Additionally, the people who get to shape the future might be eager to choose a future without humans for no good reason, and awareness of that possibility makes us averse to this whole line of reasoning.

One way to remove most of the self-preservation constraint is to commit to providing sanctuary to currently living humans indefinitely. In a post-AGI world, we might wall off the Earth, or the Solar System, and reserve it forever for humans and humans only. This way, we could reason about how to shape the rest of the universe without our self-preservation mechanisms kicking in as much. Without the metaphorical gun pointed at humanity’s head, we are likely to reason more clearly about how to shape the future.

In the worst case, preserving the Earth or the Solar System as a sanctuary would make an extremely small fraction of the universe suboptimal according to whatever values shape the rest of it. I think this is a worthwhile tradeoff: being “closer to the bullseye” on what happens to the rest of the universe is worth far more than one planet or star system. The lightcone could contain a septillion star systems; reserving one of them for humans is a small sacrifice.
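To make that fraction concrete, here is a one-line back-of-envelope calculation, taking the septillion figure at face value (the count itself is an order-of-magnitude assumption, not an estimate):

```python
# Back-of-envelope: what share of the lightcone the sanctuary would occupy.
# "Septillion" = 1e24; the true number of reachable star systems is highly
# uncertain, so treat this as an order-of-magnitude assumption.
reachable_star_systems = 1e24
sanctuary_star_systems = 1   # one system (or just Earth) reserved for humans
print(sanctuary_star_systems / reachable_star_systems)  # -> 1e-24
```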

Appendix: Biological human future vs weird future

We can imagine a world very similar to the one inhabited by biological humans, but with digital human minds living in a simulated environment, where the humans live lives that are equally good or better (assuming the lives are net-positive in both worlds).

If the following are true:

  1. Substrate independence (a human in a virtual world is as valuable, per person, as a human in the physical world)

  2. Humans housed in virtual worlds are less resource-intensive per person

  3. The “goodness of the universe” scales with the number of people in it

Then the future inhabited by virtual humans could be thousands or millions of times better than the one inhabited by biological humans, depending on how much “cheaper” it is for a human to live digitally than biologically. It’s extremely important to pick the right one.
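As a toy sketch of how the multiplier falls out of the three premises above (the 1000x cost ratio is an illustrative assumption, not an estimate):

```python
# Toy model of the appendix's argument. All numbers are illustrative
# assumptions; the point is only how the multiplier falls out.
resources_available = 1.0                # total resources, arbitrary units
cost_biological = 1.0                    # resources per biological human life
cost_digital = cost_biological / 1_000   # assume a digital life is 1000x cheaper

biological_population = resources_available / cost_biological  # 1.0
digital_population = resources_available / cost_digital        # 1000.0

# Premise 1 (substrate independence) says digital lives count equally;
# premise 3 says total value tracks population. The advantage of the
# digital future is then just the cost ratio:
print(digital_population / biological_population)  # -> 1000.0
```

Whatever the true cost ratio is, under these premises it translates directly into the factor by which the digital future outvalues the biological one.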