I currently think “the AIs intrinsically care enough to spend >1/10 billion of their resources on keeping humans alive” is like (idk) 35% likely and that “acausal/causal trade would incentivize AIs to spend >1/10 billion of their resources on keeping humans alive if the AIs care about acausal trade and there isn’t some other entity paying even more for some other outcome” is like 75% likely.
(I think this is similar to what my view was when I wrote this post, but maybe a bit less optimistic after further reflection. I now think the chance that >50% of humans die (or have something happen to them that is about as bad as death) due to rapid industrial expansion is higher, maybe 35%, and the chance of something which is effectively extinction is maybe 25% (though the details of what counts as extinction might matter a bunch, and uncertainty about this is driving a lot of my change in views).)
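To make it concrete how these two estimates could relate, here is a minimal toy sketch. The values for p_trade_motivated and p_no_outbid (whether the AIs care about acausal/causal trade at all, and whether anyone outbids) are hypothetical placeholders not given anywhere in this thread, and the independence assumption is only approximate given the correlation discussed further down.

```python
# Toy combination of the two estimates above. All numbers are illustrative;
# p_trade_motivated and p_no_outbid are hypothetical placeholders, not claims
# made anywhere in this thread.
p_intrinsic = 0.35       # AIs intrinsically care enough to spend >1/10 billion of resources
p_trade_works = 0.75     # trade incentivizes that spending, conditional on the stated caveats
p_trade_motivated = 0.8  # hypothetical: the AIs care about acausal/causal trade at all
p_no_outbid = 0.9        # hypothetical: no other entity pays more for some other outcome

# Humans get the resources if either mechanism fires. Treating the two as
# independent is a simplification; the thread later notes they share a crux.
p_via_trade = p_trade_works * p_trade_motivated * p_no_outbid
p_kept_alive = 1 - (1 - p_intrinsic) * (1 - p_via_trade)
print(f"toy P(AIs spend >1/10 billion of resources on humans) ~= {p_kept_alive:.2f}")
```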
Agreed there’s some game theoretic reason to preserve humanity just-in-case
[...]
Pascal’s Wager has to account for multiple gods wanting multiple things.
Isn’t it kinda surprising if the highest bidder wants to do something to humans which is as bad or close to as bad as killing them? (As bad from the perspective of typical humans.)
Part of this is that I don’t see this as “just in case”: I’d say it seems likely that someone is willing to compensate AIs for keeping humans alive, and it’s pretty plausible the AIs have the smarts/compute to do actual acausal trade prior to otherwise killing humans (due to rapid industrial expansion at least). This is messier if the best takeover strategies involve killing humans. E.g., in the AI 2027 race scenario, I think the AIs probably would have been able to do acausal trade reasonably prior to killing off the humans.
It occurs to me:
You are more optimistic than I am that our current AIs will care enough to spend 1/billionth of their resources on keeping us alive.
You are separately more optimistic than I am that one could expect the high bidders for trading “we saved the humans” to care about not merely our well-being but our agency.
It seems like those maybe share a crux about how natural niceness is (which is… not exactly double-counting, but, if you were to change your mind about that, probably both of those numbers drop). Is that right?
Yes, there is an underlying correlation. E.g., if I thought that humans on reflection wouldn’t care at all about bailing out other humans and satisfying their preferences to remain physically alive, this would be evidence both about trade and about the AIs.
Seems high (factoring in the unknown unknowns of other things to care about), but, not crazy.
Isn’t it kinda surprising if the highest bidder wants to do something to humans which is as bad or close to as bad as killing them? (As bad from the perspective of typical humans.)
I don’t even think it’s obvious most humans would/should prefer the non-upload route once they actually understood the situation (like, it seems super reasonable to consider that “not death”), and it’s just a pretty reasonable thing for an AI to say “okay, I do think I just know better than you what you will want after you think about it for ~~5 minutes~~ like a month.”
I also think plenty of high-bidders would have some motivation to help humans, but their goal isn’t obviously “give them exactly what they want” as opposed to “give them a pretty nice zoo that is also optimized for some other stuff.”
highest bidder
At the time the AI is making this call, it hasn’t yet built Jupiter brains, and it will have uncertainty about the bid spread, who’s out there, whether aliens or acausal trade are even real, and which ones are easier to contact.
The upload version gives you a lot more option value – it’s tradeable to the widest variety of beings, and at the very least you can always reconstruct the solar system, so the only people you’re losing bargaining power with are the few aliens who strongly prefer “unmodified solar system continued” vs “reconstructing original unmodified solar system after the fact”, which seems like a very weirdly specific thing to care that strongly about.
(Also, you might get higher bids if you’re actually able to get multiple gods bidding for it. If you only did the non-stasis’d solar system version, you only get to trade with the Very Specific Altruists, and even if they are the highest bidders, you lose the ability to get the Weird Curious Zookeepers bidding the price up.)
Hmm, it seems that from your perspective “do non-consensual uploads (which humans would probably later be fine with) count as death” is actually a crux for fatality questions. I feel like this is a surprising place to end up because I think keeping humans physically alive isn’t much more expensive and I expect a bunch of the effort to keep humans alive to be motivated by fulfilling their preferences (in a non-bastardized form) rather than by something else.
Intuitively, I feel tempted to call it not death if people would be fine with it on reflection but it seems like a mess and either way not that important.
the only people you’re losing bargaining power with are the few aliens who strongly prefer “unmodified solar system continued” vs “reconstructing original unmodified solar system after the fact”
What about people who want you to not do things to the humans that they consider as bad as death (at least without further reflection)?
Intuitively, I feel tempted to call it not death if people would be fine with it on reflection but it seems like a mess and either way not that important.
Nod, I think this is fine, and, also, resolving it the other way would be fine.
“do non-consensual uploads (which humans would probably later be fine with) count as death” is actually a crux for fatality questions.
On my end the crux is more like “the space of things aliens could care about is so vast, it just seems so unlikely for it to line up exactly with the preferences of currently living humans.” (I agree “respect boundaries” is a Schelling value that probably has disproportionate weight, but there’s still a lot of degrees of freedom in how to implement that, and how to trade for it, and whether acausal economies have a lot of Very Oddly Specific Trades (i.e. saving a very specific group) going on that would cover it.)
The question of whether “nonconsensual uploads that you maybe endorse later” count as death is one I end up focused on mostly because you’re rejecting the previous paragraph.
What about people who want you to not do things to the humans that they consider as bad as death (at least without further reflection)?
I agree that’s a thing, just, there’s lots of other things aliens could want.
(Not sure if cruxy, but, I think the aliens will care about respecting our agency more like the way we care about respecting trees’ agency than the way we care about respecting dogs’ agency.)
Or: “we will be more like trees than like dogs to them.” Seems quite plausible they might be more wisely benevolent towards us than humans are towards trees currently.
But, it seems like an important intuition pump for how they’d be engaging with us and what sort of moral reflection they’d have to be doing.
i.e. on the “bacteria → trees → cats → humans → weakly superhuman LLM → … ??? … → Jupiter Brain that does acausal trades” spectrum of coherent agency and intelligence, it’s not obvious we’re more like Jupiter Brains or like trees.
(somewhere there’s a nice Alex Flint post about how you would try to help a tree if you were vaguely aligned to it)