If one surviving civilization can rescue others, shouldn’t civilizations randomize?

In the comments section of "You can, in fact, bamboozle an unaligned AI into sparing your life", both supporters and critics of the idea seemed to agree on two assumptions:

  • Surviving planetary civilizations have some hope of rescuing the civilizations killed by misaligned AI, though supporters and critics disagree on the best method of rescue.

  • The big worry is that almost no planetary civilizations survive, because if we’re unlucky, they all die the same way.

What if, to ensure at least some planetary civilizations survive (and hopefully rescue others), each civilization picked a random strategy?

Maybe if every planetary civilization follows a random strategy, they increase the chance that at least some survive the singularity, and also the chance that the average sentient life in all of existence is happy rather than miserable. If every civilization deterministically converges on the same strategy and that strategy turns out to fail, they all die together; randomizing decorrelates these failures, reducing logical risk.
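A toy Monte Carlo sketch of the decorrelation effect (every number below is invented: three candidate strategies, only one of which actually works):

```python
import random

N_CIVS = 1000                  # planetary civilizations (invented number)
STRATEGIES = [0.9, 0.0, 0.0]   # survival probability of each strategy;
                               # only strategy 0 actually works (invented)
TRIALS = 10_000

def chance_anyone_survives(randomize: bool) -> float:
    """Fraction of trials in which at least one civilization survives."""
    wins = 0
    for _ in range(TRIALS):
        if randomize:
            # Each civilization independently picks a random strategy.
            alive = any(random.random() < random.choice(STRATEGIES)
                        for _ in range(N_CIVS))
        else:
            # Every civilization converges on the same (a priori unknown) strategy.
            shared = random.choice(STRATEGIES)
            alive = any(random.random() < shared for _ in range(N_CIVS))
        wins += alive
    return wins / TRIALS

print("same strategy everywhere: ", chance_anyone_survives(randomize=False))  # ~0.33
print("independent randomization:", chance_anyone_survives(randomize=True))   # ~1.00
```

With correlated choice, everyone lives or dies together, so the chance that anyone survives is just the chance that the shared strategy happened to be the good one; with independent randomization, it is nearly certain that some civilizations drew the good strategy.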

History is already random, but perhaps we could further randomize the strategy we pick.

For example, if the random number generated using Dawson et al.’s method (at some prearranged date) exceeds the 95th percentile, we could all adopt MIRI’s extremely pessimistic strategy and do whatever Eliezer Yudkowsky and Nate Soares suggest, with less arguing and more urgency. If they tell you that your AI lab, which works on both capabilities and alignment, is a net negative, then you quit and work on something else. If you are more reluctant, you might insist on the 99th percentile instead.
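A minimal sketch of the pre-commitment, assuming the prearranged source publishes some value on the agreed date (the beacon_value below is a placeholder; Dawson et al.’s method would supply the real one). Hashing maps that value to a uniform number everyone can independently check against the agreed percentile:

```python
import hashlib

PERCENTILE = 0.95  # the pre-committed threshold; raise to 0.99 if reluctant

def uniform_from_beacon(beacon_value: bytes) -> float:
    """Map a public random value to a uniform number in [0, 1)."""
    digest = hashlib.sha256(beacon_value).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

# Placeholder for whatever the prearranged randomness source outputs.
beacon_value = b"output of the prearranged randomness source"

if uniform_from_beacon(beacon_value) > PERCENTILE:
    print("Adopt the extremely pessimistic strategy, with urgency.")
else:
    print("Keep following your current best-guess strategy.")
```

Because the input is public and the hash is deterministic, no one can fudge the outcome after the fact: everyone computes the same number and sees whether the switch was triggered.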

Does this make sense, or am I going insane again?

Total utilitarianism objections

If you are a total utilitarian who doesn’t care how happy the average life is, only about the total number of happy lives, then you might call this a bad idea: it increases the chance that at least some planetary civilizations survive, but reduces the expected total number of happy lives (since randomizing means some civilizations deliberately follow strategies that look worse in expectation).

However, it also reduces the expected total number of miserable lives. If zero planetary civilizations survive, the number of miserable lives may be huge, because misaligned AIs simulate all possible histories. If even a few civilizations survive, they may trade with these misaligned AIs (causally or acausally) to greatly reduce that suffering: the misaligned AIs gain only a tiny, tiny bit by causing astronomical suffering, so they lose only a tiny bit of simulation accuracy if they cut the suffering in half.
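A toy expected-value sketch of why the trade should go through (every number below is invented for illustration):

```python
# Toy numbers, purely illustrative.
suffering_if_no_survivors = 1e12   # miserable simulated lives if nobody survives
accuracy_lost_by_halving  = 0.001  # tiny fidelity cost to the AI of halving suffering
payment_from_survivors    = 0.01   # what surviving civilizations offer in trade

# The AI accepts whenever the payment exceeds its accuracy loss.
ai_accepts = payment_from_survivors > accuracy_lost_by_halving
suffering = (suffering_if_no_survivors / 2 if ai_accepts
             else suffering_if_no_survivors)

print(f"AI accepts trade: {ai_accepts}")             # True
print(f"Expected miserable lives: {suffering:.0e}")  # 5e+11 instead of 1e+12
```

The point is only the asymmetry: halving astronomical suffering costs the AI almost nothing, so even a small payment from survivors should buy it.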

This idea is only morally bad if you are a total utilitarian who counts happiness but assigns no weight to suffering. But really, we should have moral uncertainty and give weight to more than one philosophy (total utilitarianism, average utilitarianism, etc.).