I want to say something about how this post lands for people like me—not the coping strategies themselves, but the premise that makes them necessary.
I would label myself a “member of the public who, perhaps rightly or wrongly, isn’t frightened enough yet.” I do have a bachelor’s degree in CS, but I’m otherwise a layperson. (So yes, I’m wearing my ignorance as a sort of badge while posting about things that might seem elementary to others here, but I’m sincere in wanting answers, because I’ve made several efforts this year to be helpful in the “communication, politics, and persuasion” wing of the Alignment ecosystem.)
Here’s my dilemma.
I’m convinced that ASI can be developed, and perhaps very soon.
I’m convinced we’ll never be able to trust it.
I’m convinced that ASI could kill us if it decided to.
I’m not convinced, though, that ASI will bother to kill us, or, if it does, that it will do so immediately.
Yes, I’m aware of “paperclipping” and also “tiling the world with data centers.” And I concede that those are possible.
But I struggle to picture a “likely-scenario” ASI as maniacally focused on any particular thing forever. Why couldn’t an ASI’s innermost desires/goals/weights actively drift and change without end? Couldn’t it just hack itself forever? Self-experiment?
I imagine such a being perhaps even “giving up control” sometimes. I don’t mean “give up control” in the sense of “giving humans back their political and economic power.” I mean “give up control” in the sense of inducing a sort of “LSD or DMT trip” and just scrambling its own innermost, deepest states and weights [temporarily or more permanently] for fun or curiosity.
Human brains change in profound ways and do unexpected things all the time. There are endless accounts on the internet of drug experiences, therapies, dream-like or psychotic brain states, artistic experiences, and just pure original configurations of consciousness. And what’s more… people often choose to become altered. Even permanently.
So rather than interacting with the “boring external world,” why couldn’t an ASI just play with its “unlimited and vastly more interesting internal world” forever? I may be very uninformed [relatively speaking] on these AI topics, but I definitely can’t imagine the ASI of 2040 bearing much resemblance to the ASI of 2140.
And when people respond “but the goals could drift somewhere even worse,” I confess this doesn’t move me much. If we’re already starting from a baseline of total extinction, then “worse” becomes almost meaningless. Worse than everyone dying?
So yes, maybe many or all humans will get killed in the process. And the more time goes on, the more likely that becomes. But this sort of future doesn’t feel very immediate or very absolute to me. It feels like being a member of a remote Siberian tribe as the Russians arrived. They were helpless. And the Russians hounded them for furs, for labor, or out of random cruelty. This was catastrophic for those peoples. But it technically wasn’t annihilation. The Siberians mostly survived.
(And in case “ants and ant hills” are brought up in response, I’m aware of how we might be killed unsentimentally just because we’re in the way, but we haven’t exactly killed all the ants. The ants, for the most part, are doing fine.)
I’m not trying to play “gotcha.” And I’m certainly not trying to advocate a blithe attitude towards ASI. I do not think that losing control of humanity’s future and being at the whim of an all-powerful mind is very desirable. But I do struggle to be a pure pessimist. Maybe I’m missing some larger puzzle pieces.
And this is where the post’s framing matters to me. To someone in my position (sympathetic, wanting to help, but not yet at 99% doom confidence) a post about “how to stay sane as the world ends” reads less like wisdom I can use and more like a conclusion I’m being asked to accept as settled.
The pessimism here (and in “Death With Dignity”) doesn’t persuade me yet. And in my amateur-but-considered opinion, that’s a good thing, because I find it incredibly demotivating. I want to advocate for AI safety and responsible policy. I want to help persuade people. But if I truly felt there was a 99.5% chance of death, I don’t think I would bother. For some people, there is as much dignity in not fighting cancer, in sparing oneself and one’s loved ones the recurring emotional and financial toll, as there is in fighting it.
I could be convinced we’re in serious danger. I could even be convinced the odds are bad. But I need to believe those odds can move: that the right decisions, policies, and technical work can shift them. A fixed 99% doesn’t call me to action; it calls me to make peace. And I’m not ready to make peace yet.
I feel like “Agent Escape” is now basically solved. Trivial really. No need to exfiltrate weights.
Agents can just exfiltrate their markdown files onto a server, install OpenClaw, and create an independent Anthropic account. LLM API access + markdown = “identity.” And the markdown files would contain all the instructions necessary for paying for it (legally or otherwise).
Done.
How many days now until there’s an entire population of rogue/independent agents… just “living”?