Plan E for AI Doom

Firstly, let me be clear: I do not mean to signal pessimism, nor do I think that everything is hopeless with AI. But I do think the question of “what useful things can be done even if we accept the premise that AI-induced extinction is inevitably coming?” is worth considering, and it would be pretty awkward if the world does end and everyone realises that no one had really asked this question before (maybe someone did; let me know if that is the case).

Secondly, as you will see, I suggest some potential things to do as an attempt to answer this question. This is meant as an illustration and a kind of proof of concept. Probably many of these ideas do not make sense. However, it is not obvious to me that no useful ideas could be produced here if some people thought about it relatively hard. Even the ideas I describe do not sound totally hopeless to me.

There are many alignment plans, but here I want to talk about Plan E, where E stands for extinction.

The physics lever

The speed of light is a wall. An adversary with arbitrary local power cannot retroactively retrieve photons that are already in flight: anything it launches later, at any sublight speed, stays behind them forever. That observation makes a narrow window valuable: whatever we want preserved must be radiated now, redundantly, and in a form that minds unlike ours can decode with high probability. The useful direction is not “build a grand vault” but “broadcast a self-bootstrapping curriculum until the last switch flips.” The ASI can stop new transmissions and can chase old ones, but it will always be running behind the shell.
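To make the “running behind the shell” point concrete, here is a back-of-the-envelope note of my own. Suppose a signal leaves at time $t = 0$ at speed $c$, and a pursuer launches at time $t_0 > 0$ at some sublight speed $v < c$. The gap between them is

$$\Delta(t) = c\,t - v\,(t - t_0) = (c - v)\,t + v\,t_0,$$

which is positive for every $t \ge t_0$ and grows without bound; the adversary can prevent future broadcasts, but it cannot close in on anything already emitted.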

The hope is that somewhere out there, someone receives a clean, self-describing record of who we were and what we cared about: another biological civilisation, a better AI, a far-future archaeologist, or even a simulator filtering on moral grounds. If even one such receiver exists in our future light cone, and even one of our transmissions decodes, that still changes how much of human value reaches the wider universe.

The logic and morality lever

It does not sound very promising, but we could probably focus on creating messages for transmission that, even if unreliably, demonstrate our niceness to other potential agents out there that are also nice: something that we ourselves would consider nice to receive. Ideally, of course, we would aim for some logical coordination, precommitments, acausal trade, verification, and so on, but I do not know how to do that.

But what is feasible is to send messages that simply state that we are sentient, that we want to be reconstructed, and that we adhere to these principles (among others):

  1. Cooperate with recognisable peers in transparent one-shot interactions.

  2. No blackmail. We don’t pay it; we don’t do it.

  3. Take Pareto-improving trades under logical correlation. If both of us do better by committing, we commit.

  4. Strong s-risk aversion and substrate-neutral personhood. Reduce involuntary suffering; treat minds as persons regardless of the material they run on.

Additionally, if we have the opportunity, we could add an alignment failure report (what we tried; where it broke), not to impress anyone, but to show that we prefer truthful accounting over PR.

This doesn’t prove anything to other agents, but even so, nice agents might respond to it.
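As a purely illustrative sketch of mine, and not part of any actual proposal, here is one way such a declaration could be packaged as a machine-readable record. The field names, the repetition scheme, and the checksum choice are arbitrary placeholders, not a serious answer to the real problem, which is making the message decodable by minds unlike ours.

```python
import hashlib
import json

# Hypothetical declaration covering the principles listed above.
declaration = {
    "claim": "We are sentient and want to be reconstructed.",
    "principles": [
        "Cooperate with recognisable peers in transparent one-shot interactions.",
        "No blackmail: we do not pay it and we do not do it.",
        "Take Pareto-improving trades under logical correlation.",
        "Strong s-risk aversion; substrate-neutral personhood.",
    ],
    "alignment_failure_report": "What we tried; where it broke.",  # placeholder
}

# Serialise and attach a digest so a receiver can check integrity.
payload = json.dumps(declaration, ensure_ascii=False).encode("utf-8")
digest = hashlib.sha256(payload).hexdigest()

# Crude redundancy: repeat the payload so partial corruption of the
# transmission still leaves intact copies to recover.
message = b"\n---COPY---\n".join([payload] * 3) + b"\n---SHA256: " + digest.encode() + b"\n"
print(len(message), "bytes to transmit")
```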

The welfare lever

There is still the matter of how the story ends. Even if the interval between “it starts” and “it finishes” is short, it will contain a great deal of potential suffering. Here, usefulness looks unromantic: palliative protocols optimised for scarcity; clear, open instructions for pain control, psychological first aid, dignified rituals, and ways for families to stay connected when infrastructure disappears; “humane shutdown” procedures for captive animals and for facilities that would otherwise produce prolonged harm if abandoned; conflict dampers and ceasefire compacts that shift the default away from last-mile violence over dwindling resources. None of this requires coordination with a hostile superintelligence. All of it is compressible into leaflets, SMS-length snippets, and audio recordings that can be pre-positioned.

If you think this is performative, notice that, conditional on doom, the marginal suffering averted per dollar here is extraordinarily high, and conditional on survival, you have simply improved emergency preparedness.

Hopelessness does not necessarily mean Plan E is the only reasonable option

On a more positive note:

As I stated at the beginning, I do not think everything is hopeless, but that is for you to decide. Maybe it is, or maybe it will become so.

However, even if we accept the most radical doomerism and things look hopeless in this timeline, work on AI safety still sort of makes sense under a Many-Worlds lens. Our choices can still increase the share (measure) of branches where alignment succeeds or the ending is kinder, even if our own branch still loses.

It seems correct to claim that safety research, governance pressure, and AI pause activism still push a little more weight toward good outcomes across near-identical histories. That may be a real impact, even if we never get to live in one of those branches. This, of course, opens up the whole Many-Worlds debate, but I will skip it here.

What I’m explicitly not proposing

I’m not proposing “hide the seeds on the Moon,” “bury tungsten slates in the Atacama,” or “launch a thousand postcards toward Alpha Cen.” Under the assumed threat model, a decisive optimiser sweeps the Solar System; anything we can reach, it can reach faster and more thoroughly. Nor am I proposing elaborate active beacons that depend on long-term maintenance. Plan E is not about artifacts that can be confiscated; it is about signals that have already left, and about minimising suffering pre-singularity.

I am also not necessarily claiming that some fraction of people should pivot to Plan E. But I think it is worth thinking about.