Other entities that care about human survival trade with the AI (or with entities that the AI trades with) to keep humans alive. This includes acausal trade, ECL, simulations/anthropic capture (to be able to effectively acausally trade with decision-theory-naive AIs), and causal trade with aliens that the AI ends up encountering. It’s unclear how simulations/anthropic capture works out; it seems plausible that this happens in a way which doesn’t result in all relevant (acausal) trades happening. It also seems plausible that other entities (including humans in other Everett branches) pretty universally don’t want to bail humans (in this branch) out because they have better things to spend their resources on, but even a tiny fraction of entities spending some non-trivial fraction of resources could suffice. This depends on there being some beings with power who care about things like human survival despite alignment difficulties, but that feels very likely to me (even if alignment is very hard, there may well be more competent aliens or AIs who care a small amount about this sort of thing). Note that this requires the AI to care about at least some of these mechanisms, which isn’t obvious.
How much of your plausibility-of-caring-slightly routes through this, vs the “actually just slightly intrinsically nice” thing?
In the previous discussion, you said:
I’m happy to consider non-consensual uploading to be death and I’m certainly happy to consider “the humans are modified in some way they would find horrifying (at least on reflection)” to be death. I think “the humans are alive in the normal sense of alive” is totally plausible and I expect some humans to be alive in the normal sense of alive in the majority of worlds where AIs take over.
Making uploads is, I think, barely cheaper than literally keeping physical humans alive after AIs have fully solidified their power (maybe 0-3 OOMs more expensive or something), so I don’t think non-consensual uploads are that much of the action. (I do think rounding humans up into shelters is relevant.)
I’m surprised at the “only 3 OOMs more expensive.” I haven’t done a calculation but that seems really implausible. Maybe I am not properly imagining how big an OOM is.
In particular, comparing “keep humans alive indefinitely, multigenerationally” vs “store them on ice without even running them, turn them back on when/if you trade them.”
The sort of entity that might pay extra for “keep them actually alive instead of on-ice”… probably also cares about them being alive and getting some kind of standard of living or something. (fyi it’s not even obvious most people would prefer being alive in minimally-satisfying-shelters vs stored, or run in simulation at a higher standard of living)
Insofar as this option is load-bearing for “you put lowish odds on them killing everyone”, I think a crux is that I don’t find it very plausible that the AI would choose “keep fully alive in a way that is clearly better than death” over “store digitally” or “upload.” It’s just a very narrow target for the game theory to work out such that this exact thing is what turns out to matter.
I hadn’t seen this footnote yet when writing the above:
Consensual uploads, or uploads people are fine with on reflection, don’t count as death.
There’s a range of stuff like “the AI maneuvered us into a situation where we either die, or live in a world with X standard of living, or get uploaded and get 1000x (or whatever) standard of living.”
Most people on reflection choose uploading. I think it’s reasonable to disagree about whether this counts as “death”, but this seems like pretty ambiguous consent at best to me. While I think most fully informed people wouldn’t count it as “death” per se, most people with their current set of beliefs/values would count it as more like death than not-death, or at least not be very impressed by arguments that it doesn’t count as death.
(Curious if there is any kind of Mechanical Turk poll we could run that would change either of our minds about this? Not sure that it matters that much.)
I think this sort of thing is also why I don’t think the “AI may be very slightly nice” consideration is likely to result in something that’s clearly “not death”. It seems really unlikely to me that very slightly nice things would go the “preserve parochial humans exactly as is” route; it’s just such a narrow target to hit even within niceness.
It’s kinda messy how to think about non-consensual uploads that the person is totally fine with after some reflection (especially if the reason the AI did the upload is because it knew the person would be fine with this on reflection). I also don’t think considering uploads without informed consent to be death makes a big difference to my numbers.
I think this sort of thing is also why I don’t think the “AI may be very slightly nice” consideration is likely to result in something that’s clearly “not death”. It seems really unlikely to me that very slightly nice things would go the “preserve parochial humans exactly as is” route; it’s just such a narrow target to hit even within niceness.
I don’t really agree? It seems like “don’t upload people without their consent when they consider this to be death, unless there are no better options” is pretty natural, and the additional resources needed to keep people physically alive are pretty small.
(FYI I think I am sold by your other reply that the main question is “how much is it slowed down initially, and how much does it value the resources that it loses by doing so?”. I agree leaving us with one star isn’t that big a deal, modulo making sure we can’t somehow mess up the rest of its plans, which doesn’t seem that hard.)
How much of your plausibility-of-caring-slightly routes through this, vs the “actually just slightly intrinsically nice” thing?
Maybe 60% trade, 40% intrinsic? Idk though.
I’m surprised at the “only 3 OOMs more expensive.” I haven’t done a calculation but that seems really implausible. Maybe I am not properly imagining how big an OOM is.
In particular, comparing “keep humans alive indefinitely, multigenerationally” vs “store them on ice without even running them, turn them back on when/if you trade them.”
The dominant cost of “keeping humans alive physically” from a linear-returns + patient industrial expansion perspective is a small delay from rounding humans up into shelters (that you have to build and supply, etc.) or avoiding boiling the oceans (and other catastrophic environmental damage). The dominant cost of uploads is the small delay from rounding up and scanning people before they die. Both costs seem small, and the delays seem comparable.
Giving humans an entire star long term (e.g. after a few decades) is negligible in cost (like, <<10^-20 of all galactic resources) relative to this delay (edit: for patient AIs which don’t prefer resources closer to earth), so I think keeping humans physically alive longer term is totally fine and just has higher upfront costs.
There’s also the cost of not killing humans as part of a takeover attempt and not using them for some other purpose (reasons (1) and (3)), but these are equal between uploads and physically keeping humans alive. I wasn’t intending to include this as I said “after AIs have fully solidified their power”, but if I did, then this makes the gap much smaller depending on how important reason (1) is.
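As a rough sanity check on the “<<10^-20 of all galactic resources” figure and on the ongoing upkeep cost of physical humans, here is a back-of-envelope sketch. All of the inputs (galaxy count, stars per galaxy, population, per-person power budget) are my own round assumptions, not numbers from this discussion:

```python
# Back-of-envelope check on the magnitudes above (rough assumed inputs, not
# figures from the discussion).

reachable_galaxies = 4e9       # assumed: a few billion galaxies in the affectable universe
stars_per_galaxy = 1e11        # assumed: order-of-magnitude stars per galaxy
total_stars = reachable_galaxies * stars_per_galaxy

# One star as a fraction of all reachable stars: on the order of 1e-21,
# consistent with "<<10^-20 of all galactic resources".
one_star_fraction = 1 / total_stars
print(f"one star / all reachable stars ~ {one_star_fraction:.1e}")

# Ongoing upkeep of a physical human population vs. the output of a single star.
solar_luminosity_w = 3.8e26    # watts emitted by a Sun-like star
population = 1e10              # assumed: ~10 billion people
power_per_person_w = 1e4       # assumed: generous per-person life-support budget
upkeep_fraction_of_star = population * power_per_person_w / solar_luminosity_w
print(f"human upkeep / one star's output ~ {upkeep_fraction_of_star:.1e}")
```

On these assumed numbers the long-run cost is dwarfed by the upfront delay, which matches the framing above that the delay, not the star, is the real price.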
Mmm, nod. I maybe see it.
RE: “how long are you delaying?”
It seems like a major early choice is “is the AI basically using the Earth to bootstrap the intergalactic probe process” or “is the AI only using the amount of Earth resources that leaves it mostly functional from humans’ perspective, and then mostly using the rest of the planets.”
You note “seems like it’d only slow down the AI a bit, to first round up humans”. I haven’t done a calculation about it, but it seemed potentially like a very big deal to decide between “full steam ahead on beginning the Dyson sphere using Earth parts” vs “start the process from the Moon and then Mercury”, since there are harder-to-circumvent delays there.
Can’t the AI proceed full steam ahead until it has enough industrial capacity to build shelters, then build shelters and put humans in the shelters (while pausing as needed at this point to avoid fatalities from environmental damage while humans are being rounded up), and then finish the industrial expansion (possibly upgrading shelters along the way as needed as the AI gets more resources)? Seems like naively this only delays you for as long as is needed to round up humans and put them in shelters, which seems probably <1 year and probably <1 month. (At least if takeoff is pretty fast.)
Separately, not destroying the earth (and instead doing more of the growth in space) seems like it should cost <3 years of delay and probably <1 year of delay which is still pretty small as an absolute fraction of resources (for patient AIs). Like we’re talking 1 / billion or something.
I agree it’s small as a fraction of resources, but it still seems very expensive in terms of total resources since that’s a lotta galaxies falling outside the lightcone.
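To make the “1 / billion” and “lotta galaxies” framings concrete, here is a minimal sketch translating a per-year delay cost into galaxies lost. The per-year fraction is taken from the exchange above; the reachable-galaxy count is my own rough assumption:

```python
# Translate "1 / billion of reachable resources per year of delay" into galaxies.
# fraction_lost_per_year comes from the discussion above; the galaxy count is an
# assumed round number, not a figure from the discussion.

reachable_galaxies = 4e9          # assumed: a few billion galaxies in the affectable universe
fraction_lost_per_year = 1e-9     # "1 / billion or something" per year of delay

galaxies_lost_per_year = reachable_galaxies * fraction_lost_per_year
print(f"galaxies slipping out of reach per year of delay ~ {galaxies_lost_per_year:.0f}")

# A <3 year delay on this model costs a ~3e-9 fraction of reachable resources:
# tiny as a fraction, but still several galaxies in absolute terms -- which is
# exactly the "small fraction, lotta galaxies" tension in the exchange above.
three_year_fraction = 3 * fraction_lost_per_year
print(f"fraction lost for a 3-year delay ~ {three_year_fraction:.1e}")
```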