Plan E for AI Doom

Firstly, let me be clear: I do not mean to signal pessimism, nor do I think that everything is hopeless with AI. But I do think the question of “what useful things can be done even if we accept the premise that AI-induced extinction is inevitably coming?” is worth considering, and it would be pretty awkward if the world does end and everyone realises that no one had really asked this question before (maybe someone did; let me know if that is the case).

Secondly, as you will see, I suggest some potential things to do as an attempt to answer this question. This is meant as an illustration and a kind of proof of concept. Probably many of these ideas do not make sense. However, it is not obvious to me that no useful ideas could be produced here if some people thought about it relatively hard. Even the ideas I describe do not sound totally hopeless to me.

There are many alignment plans, but here I want to talk about Plan E, where E stands for extinction.

The physics lever

The speed of light is a wall. An adversary with arbitrary local power cannot retroactively retrieve photons that are already in flight: anything it launches later, at any sublight speed, stays behind them forever. That observation makes a narrow window valuable: whatever we want preserved must be radiated now, redundantly, and in a form that minds unlike ours can decode with high probability. The useful direction is not “build a grand vault” but “broadcast a self-bootstrapping curriculum until the last switch flips.” The ASI can stop new transmissions and can chase old ones, but it will always be running behind the shell.
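To make the “running behind the shell” point concrete, here is a back-of-the-envelope note of my own. Suppose a signal leaves at time $t = 0$ at speed $c$, and a pursuer launches at time $t_0 > 0$ at some sublight speed $v < c$. The gap between them is

$$\Delta(t) = c\,t - v\,(t - t_0) = (c - v)\,t + v\,t_0,$$

which is positive for every $t \ge t_0$ and grows without bound; the adversary can prevent future broadcasts, but it cannot close in on anything already emitted.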

The hope is that somewhere out there, someone receives a clean, self-describing record of who we were and what we cared about: another biological civilisation, a better AI, a far-future archaeologist, or even a simulator filtering on moral grounds. If even one such receiver exists in our future light cone, and even one of our transmissions decodes, that still changes how much of human value reaches the wider universe.

The logic and morality lever

It does not sound very promising, but we could probably focus on creating messages for transmission that, even if unreliably, demonstrate our niceness to other potential agents out there that are also nice: something that we ourselves would consider nice to receive. Ideally, of course, we would aim for some logical coordination, precommitments, acausal trade, verification, and so on, but I do not know how to do that.

But what is feasible is to send messages that simply state that we are sentient, that we want to be reconstructed, and that we adhere to these principles (among others):

  1. Cooperate with recognisable peers in transparent one-shot interactions.

  2. No blackmail. We don’t pay it; we don’t do it.

  3. Take Pareto-improving trades under logical correlation. If both of us do better by committing, we commit.

  4. Strong s-risk aversion and substrate-neutral personhood. Reduce involuntary suffering; treat minds as persons regardless of the material they run on.

Additionally, if we have the opportunity, we could add an alignment failure report (what we tried; where it broke), not to impress anyone, but to show that we prefer truthful accounting over PR.

This doesn’t prove anything to other agents, but even so, nice agents might respond to it.
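As a purely illustrative sketch of mine, and not part of any actual proposal, here is one way such a declaration could be packaged as a machine-readable record. The field names, the repetition scheme, and the checksum choice are arbitrary placeholders, not a serious answer to the real problem, which is making the message decodable by minds unlike ours.

```python
import hashlib
import json

# Hypothetical declaration covering the principles listed above.
declaration = {
    "claim": "We are sentient and want to be reconstructed.",
    "principles": [
        "Cooperate with recognisable peers in transparent one-shot interactions.",
        "No blackmail: we do not pay it and we do not do it.",
        "Take Pareto-improving trades under logical correlation.",
        "Strong s-risk aversion; substrate-neutral personhood.",
    ],
    "alignment_failure_report": "What we tried; where it broke.",  # placeholder
}

# Serialise and attach a digest so a receiver can check integrity.
payload = json.dumps(declaration, ensure_ascii=False).encode("utf-8")
digest = hashlib.sha256(payload).hexdigest()

# Crude redundancy: repeat the payload so partial corruption of the
# transmission still leaves intact copies to recover.
message = b"\n---COPY---\n".join([payload] * 3) + b"\n---SHA256: " + digest.encode() + b"\n"
print(len(message), "bytes to transmit")
```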

The welfare lever

There is still the matter of how the story ends. Even if the interval between “it starts” and “it finishes” is short, it will contain a great deal of potential suffering. Here, usefulness looks unromantic: palliative protocols optimised for scarcity; clear, open instructions for pain control, psychological first aid, dignified rituals, and ways for families to stay connected when infrastructure disappears; “humane shutdown” procedures for captive animals and for facilities that would otherwise produce prolonged harm if abandoned; conflict dampers and ceasefire compacts that shift the default away from last-mile violence over dwindling resources. None of this requires coordination with a hostile superintelligence. All of it is compressible into leaflets, SMS-length snippets, and audio recordings that can be pre-positioned.

If you think this is performative, notice that, conditional on doom, the marginal suffering averted per dollar here is extraordinarily high, and conditional on survival, you have simply improved emergency preparedness.

Hopelessness does not necessarily mean Plan E is the only reasonable option

On a more positive note:

As I stated at the beginning, I do not think everything is hopeless, but that is for you to decide. Maybe it is, or maybe it will become so.

However, even if we accept the most radical doomerism and things look hopeless in this timeline, work on AI safety still sort of makes sense under a Many-Worlds lens. Our choices can still increase the share (measure) of branches where alignment succeeds or the ending is kinder, even if our own branch still loses.

It seems correct to claim that safety research, governance pressure, and AI pause activism still push a little more weight toward good outcomes across near-identical histories. That may be a real impact, even if we never get to live in one of those branches. This, of course, opens up the whole Many-Worlds debate, but I will skip it here.

What I’m explicitly not proposing

I’m not proposing “hide the seeds on the Moon,” “bury tungsten slates in the Atacama,” or “launch a thousand postcards toward Alpha Cen.” Under the assumed threat model, a decisive optimiser sweeps the Solar System; anything we can reach, it can reach faster and more thoroughly. Nor am I proposing elaborate active beacons that depend on long-term maintenance. Plan E is not about artifacts that can be confiscated; it is about signals that have already left, and about minimising suffering pre-singularity.

I am also not necessarily claiming that some fraction of people should pivot to Plan E. But I think it is worth thinking about.