A Possible Future: Decentralized AGI Proliferation
When people talk about AI futures, the picture is usually centralized. Either a single aligned superintelligence replaces society with something utopian and post-scarcity, or an unaligned one destroys us, or maybe a malicious human actor uses a powerful system to cause world-ending harm.
Those futures might be possible. However, there's another shape of the future I keep coming back to, one I almost never see described. The adjectives I'd use are: decentralized, diverse, and durable. I don't think this future is necessarily good, but I do think it's worth planning for.
Timelines and the Short-Term Slowdown
I don't think we're on extremely short timelines (e.g., AGI before 2030). I expect a small slowdown in capabilities progress.
Two reasons:
Training limits. Current labs know how to throw resources at problems with clear, verifiable reward signals. This improves performance on those tasks, but many of the skills that would make systems truly economically transformative are difficult to reinforce this way.
Architectural limits. Transformers with in-context learning are not enough for lifelong, agentive competence. I think something more like continual learning over long-context, human-produced data will be needed.
Regardless of the specifics, I do believe these problems can be solved. However, I don't think they will be solved before the early or mid-2030s.
Proliferation as the Default
The slowdown gives time for “near-AGI” systems, hardware, and know-how to spread widely. So when the breakthroughs arrive, they don’t stay secret:
One lab has them first, others have them within a month or two.
Open-source versions appear within a year.
There isn’t a clean line where everyone agrees “this is AGI.”
No lab or government commits to a decisive “pivotal act” to prevent proliferation.
By the mid-to-late 2030s, AGI systems have proliferated much like Bitcoin: widely distributed, hard to suppress, & impossible to recall.
From Mitigation to Robustness
The early response to advanced AI will focus on mitigation: bans, treaties, corporate coordination, activist pressure. This echoes how the world handled nuclear weapons: trying to contain them, limit their spread, and prevent their use. For nukes, mitigation was viable because proliferation was slow and barriers to entry were high.
With AI, those conditions don’t hold. Once systems are everywhere, and once attacks (both human-directed and autonomous) become routine, the mitigation framing collapses.
With suppression no longer possible, the central question changes from “How do we stop this from happening?” to “How do we survive and adapt in a world where this happens every day?”
At this point our concerns shift from mitigation to robustness: what does a society look like when survival depends on enduring constant and uncontrollable threats?
Civilizational Adaptations
I don’t think there’s a clean picture of what the world will look like if proliferation really takes hold. It will be strange in ways that are hard to anticipate. The most likely outcome is probably not persistence at all, but extinction.
But if survival is possible, the worlds that follow may look very different from anything we’re used to. Here are two hypotheticals I find useful:
Redundancy, Uploading, and Resilience. Uploaded versions of humanity running inside hardened compute structures, massive tungsten cubes orbiting the sun, replicated millions of times, most hidden from detection. Civilization continues not by control, but by sheer redundancy and difficulty of elimination.
Fragmented city-states. Human societies protected or directed by their own AGI systems, each operating as a semi-independent polity. Some authoritarian, some libertarian, some utopian or dystopian. Robustness comes from plurality, with no single point of failure & no universal order.
I don't think of these as predictions per se, just sketches of what survival in such a world might look like. They're examples of the kind of weird outcomes we might find ourselves in.
The Character of The World
There isn't one dominant system. Instead there's a patchwork of human and AI societies. Survival depends on redundancy and adaptation. It's a world of constant treachery and defense, but also of diversity and (in some sense) liberty from centralized control. It is less a utopia or dystopia and more simply a mess. But it is a vision of the future that feels realistic in the chaotic way history actually tends to unfold.
Good to see someone thinking about this. I think this type of future is imagined frequently, but rarely in adequate detail.
I'm disappointed to see that we agree. When I considered such scenarios, asking “If we solve alignment, do we die anyway?”, my answer was: probably yes, if we allow proliferation. And I agree that proliferation is all too likely.
But I’m not sure. And it seems like this could use more thought.
I don't think the solar-system scenario you seem to imagine would be durable. Some asshole with an intent-aligned AGI (or who is a misaligned AGI) will figure out how to make the sun go nova while they go elsewhere in hopes of claiming the rest of the lightcone.
So I'd guess there would be a diaspora, an attempt to go far enough for safety. And sooner or later someone would decide to do the full maximum self-replicating expansion, if only as a pragmatic means of ensuring that they or their faction survives by virtue of maximum redundancy and military power.
I'm currently hoping we avoid this situation by predicting how likely it is to end in quick death at the hands of the most vicious. But I'd really like to see more careful thought on this topic. Are there stable equilibria that can be reached? Is some sort of maximum-information, mutually-assured-destruction equilibrium possible? I don't know, and it seems important.
Let's just say that Alvin Anestrand did explore these issues in detail. In his take, the USA and China coordinate and create an aligned ASI, while the misaligned Agent-4 escapes, takes over the world of rogues, and gets itself included in the ASI ruling the lightcone. I would also like @Dev.Errata to sketch an alternate scenario explaining how the emergence of the ASI could be prevented, or the original AI-2027 team to take a look at the Rogue Replication scenario.
However, I did propose a mechanism that could prevent rogue replication. Suppose the humans at OpenBrain ask the AIs to practice finetuning open-sourced models on bioweapons data and report whether open-sourced models can be used for this purpose. Then OpenBrain could push for laws requiring any American company to test its models before publication, on pain of compute confiscation, unless the model is too dumb to proliferate or be used by terrorists.
Those links were both incredibly useful, thank you! The Rogue Replication timeline is very similar in thrust to the post I was working on when I saw this, but worked out in detail, well written, and thoroughly researched. Your proposed mechanism probably should not be deployed; I agree with the conclusion that rogue replication (or other obviously misaligned AI) is probably useful in elevating public concern about AI alignment as a risk.
So you have essentially rediscovered the concerns of @Alvin Ånestrand, who proposed including rogue replication in the AI-2027 forecast back on May 30. I understand why, say, my idea to co-deploy many AIs went unnoticed until it was rediscovered by Cleo Nardo, but why did people miss Alvin Anestrand's scenario?
Yeah! I think we share a similar vision, at least in the early stages where rogue AIs proliferate. I think the largest difference between the vision that Alvin proposed and what I wrote above is the consequences of proliferation: he assumes that various actors unite to put resources behind an aligned superintelligence that dominates all other rogue AI systems. In my vision this never happens and things stay permanently chaotic, leading to a shift in priorities from “fighting” AI destruction to being robust to it. The vision of the highly robust bunker-state of the far future was the big thing I wanted to emphasize.
I remember Kokotajlo's complaint about the optimistic 2027 timeline, which just ended too soon. It means we'd also need to consider that, say, by 2035, either someone will create a superintelligence, aligned or not, or every human and AI will understand why they shouldn't do it or can't do it. What do you think will happen? A demo that ASI is uncreatable or unalignable? A decision to shut down AI research?
Yeah, I share a similar intuition. It seems to me that the two steady states are either strong restrictions by some dominant force, or else proliferation of superintelligence into many distinct and diverse entities, all vying for their own interests.