Superintelligence vs. The Second Strike

Crosspost of my Substack piece, covering quick thoughts on AI overcoming nuclear deterrence. TLDR: Nuclear deterrents likely only buy time to invest in more resilient second-strike guarantees. Without a comparable AI base, that investment will not happen fast enough, and even nuclear states will eventually be disempowered.

Historically, plenty of new military technologies have stress-tested nuclear deterrence. ICBMs made it possible to annihilate enemy cities from the safety of the homeland, MIRVs let a single rocket threaten multiple targets, and thermonuclear staging allowed weapons designers to reach functionally unlimited yield. In the already volatile climate of the Cold War, the U.S. and Soviets reached such mastery over missile technology that remote annihilation of an entire country was, quite literally, a button press away.

Image: Peacekeeper-missile-testing.jpg. For decades, even a single rocket has been able to hold more than 10 warheads, each enough to destroy a city on its own. Peacekeeper reentry tests pictured above.

The fact that the ability to remotely destroy Moscow never translated into a nuclear war is a function of modern deterrence theory, dumb luck, and most importantly, the speed of progress. As effective as a modern ICBM is, each of its component technologies was individually low-impact enough, and introduced slowly enough, that there was never a point at which deterrence could be fully overturned. For comparison, imagine if the U.S. had acquired a fully realized ICBM in the mid-1950s, back when the Soviets were still using bombers and hadn’t yet fielded a nuclear submarine. The U.S. would have been dearly tempted to strike first before the Soviets managed to diversify their nuclear forces, much as the Soviets would have been tempted to lash out before America decided to drop the guillotine.

Fortunately, the march of progress has always been slow enough to let rival states proactively invest in their second strike assurances. Unfortunately, the march is about to turn into a sprint.

Like all good essays, this one is about AI. In the process of recursive self-improvement towards godlike superintelligence, the American government is going to stumble onto the obvious idea of using it to automate military R&D—and in the process, likely leap several years, decades, or centuries up the tech tree relative to their rivals. For this technological edge to translate into a decisive strategic advantage, however, states would need to overcome even the most potent nuclear deterrents their rivals could build.

Broadly, this could happen in three ways.

Splendid first strikes—It becomes possible to either locate and destroy all of the enemy’s counterforce, or to fully decapitate nuclear command and control.
WMD defenses—Defensive systems are implemented that let the attacker neutralize both a retaliatory missile strike and non-missile means of delivery (smuggling, coastal torpedoes, etc).
Escalation management—The defender can be convinced not to launch a retaliatory strike, by carefully salami-slicing their disempowerment and/or using persuasion to manipulate their decisionmaking.

Splendid First Strikes: In order for a first strike to succeed, the attacker would need to either find and destroy all counterforce targets, or to fully decapitate strategic command and control. Broadly, I think that this would be possible with a large technological lead, but not with a high enough level of certainty to justify the risk of a proactive first strike.

In order for a counterforce strategy to succeed, a country would need to simultaneously find and destroy every leg of the defending state’s nuclear triad, including its land silos/mobile launchers, bombers, and SSBNs. Finding them could be accomplished either through detection technology that narrows down the area in which the counterforce is located (e.g., satellite mapping of ocean wakes), or by simply flooding the oceans and space with autonomous sensors. Even once the targets are located, however, the attacker would still need to destroy all of them simultaneously, leaving no time for the defender to authorize a retaliatory strike from the surviving counterforce. This limitation is especially constraining for SSBNs, given that the attacker would need to spend its finite reserve of nuclear warheads on large swathes of ocean in order to be confident the subs were destroyed (a much more severe limitation for China, given that it only has ~600 nuclear warheads overall). I place low credence on nuclear deterrence being undermined through counterforce alone, especially since defending states can cheaply invest in camouflage and decoy vehicles to increase the filtering and targeting requirements.

There are similar coverage problems with attempting to sever NC3. Here, the challenge is to destroy the central command and satellite command nodes, and to proactively sabotage any automatic retaliatory systems that exist. These, of course, are highly redundant in terms of both personnel and communications tech, so even a massive set of assassinations along the line of succession and a shutdown of internet infrastructure wouldn’t prevent a retaliatory order from being issued through EMP-resistant satellites or a SAOC. More realistically, you’d use a decapitation strike to suppress decision making for a few minutes or hours, buying you more time to hunt down the remaining counterforce and relaxing the simultaneity requirement.

WMD Defenses: Alternatively, states could try to neutralize a retaliatory strike. While this could theoretically be possible with technology that enables faster boost phase interception (e.g. much-improved DEWs or space-based interceptors) or massive increases in industrial output, there are three massive problems with defense.

Scale/cost: The U.S. in particular has repeatedly tried to invest in a comprehensive ICBM defense system (see: Brilliant Pebbles and the more recent Golden Dome). The reason these programs have repeatedly failed is that scaling them to account for rival arsenals is impossibly expensive. Midcourse interception systems, like Aegis or GMD, cannot distinguish between decoys and warheads in the threat cloud and so must bleed interceptors to compensate. And although boost-phase systems have the advantage of tracking a relatively slow, soft, and single target in the initial rocket, the defender cannot know where the rockets will be launched from, which forces it to pre-position space-based interceptors across the entire planet. Because only a small fraction of those interceptors is in range of any one launch site at a given moment, such a constellation is extremely easy to saturate by launching a large salvo from a small number of locations. For example, the U.S. would need to field more than 1,600 interceptors to reliably destroy a single North Korean Hwasong-18, and many times that amount for a modern ICBM with a faster boost.
Construction time: Even supposing that a state could afford the defensive infrastructure that would be used to counter a missile strike, it would take years to fully implement. Even the Trump admin’s own (notably generous) estimate of the Golden Dome’s construction time is three years—more than enough time for a rival power to invest in scaling their warhead count or to sabotage the unfinished project.
Non-missile coverage: Finally, states have the problem of accounting for non-missile means of delivery. Even if every ICBM could be reliably intercepted, nukes could still be delivered through coastal torpedoes, stealth bombers, or even smuggled into the country and pre-positioned. And if a state were truly desperate, it could resort to extreme fail-deadlies to maintain deterrence, like a massive salted bomb safely detonated from the homeland, an engineered bioweapon, or other uncontainable symmetric weapons. Nuclear weapons are an efficient and targetable WMD, but they are by no means the only deterrent a determined state could have access to.
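The saturation problem under "Scale/cost" can be made concrete with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that interceptor requirements scale linearly with the number of missiles launched together from one site, and it takes the ~1,600-per-missile figure quoted above as given. The function name and the salvo sizes are hypothetical; real constellation sizing depends on orbits, interceptor speed, and booster burn time.

```python
def interceptors_needed(salvo_size: int, per_missile: int = 1600) -> int:
    """Constellation size needed to engage a co-located salvo.

    ASSUMPTION: requirements scale linearly with salvo size, i.e. every
    extra missile launched from the same site needs its own full share
    of orbiting interceptors. `per_missile` is the ~1,600 figure quoted
    in the text for a single Hwasong-18.
    """
    if salvo_size < 0 or per_missile < 0:
        raise ValueError("inputs must be non-negative")
    return salvo_size * per_missile


# Hypothetical salvo sizes, chosen only to show the scaling.
for salvo in (1, 10, 50):
    print(f"salvo of {salvo:>2}: ~{interceptors_needed(salvo):,} interceptors")
```

Under that assumption the defender's bill grows in lockstep with the attacker's salvo, while the attacker only pays for the extra missiles.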

Still, nuclear defenses don’t need to succeed on their own: they only need to be effective enough to mop up whatever missiles survive the attacker’s initial strike. Even though I find it unlikely that a state could simultaneously destroy every major and satellite launch node, it seems plausible that it could destroy a large enough percentage to make a combined effort successful.
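A minimal sketch of the combined strike-plus-defense arithmetic, under loudly simplified assumptions: each surviving warhead is intercepted independently with the same probability, and the first strike destroys a fixed fraction of the arsenal. The function name, the 95% destruction fraction, and the 90% interception rate are all hypothetical; only the ~600 warhead count comes from the text above.

```python
def p_at_least_one_penetrates(arsenal: int,
                              frac_destroyed: float,
                              p_intercept: float) -> float:
    """Chance that at least one retaliatory warhead gets through.

    ASSUMPTIONS (illustrative only): interceptions are independent and
    equally likely for every warhead, and the first strike removes a
    fixed fraction of the arsenal before launch.
    """
    if not (0.0 <= frac_destroyed <= 1.0 and 0.0 <= p_intercept <= 1.0):
        raise ValueError("fractions and probabilities must be in [0, 1]")
    survivors = round(arsenal * (1.0 - frac_destroyed))
    # All survivors must be intercepted for the defense to hold.
    return 1.0 - p_intercept ** survivors


# Hypothetical scenario: ~600 warheads, 95% destroyed, 90% interception.
risk = p_at_least_one_penetrates(600, 0.95, 0.90)
print(f"P(at least one warhead penetrates) = {risk:.1%}")
```

With these made-up inputs, 30 warheads survive and the chance that at least one lands is about 96%, which illustrates why a combined effort would need both legs to be extremely reliable before an attacker could count on it.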

Escalation management: States could also be less obviously disempowered by salami-slicing and persuasion. Rather than try to outright destroy or neutralize a rival’s nuclear deterrent, a state with a massive technological and industrial lead could simply invest in building up its coercive leverage, then use it to demand individual concessions. If the U.S. wanted to push for Taiwan’s independence from China, for example, it could use its AI surplus to incrementally achieve a massive conventional military overmatch, and use sophisticated propaganda to push for an elite consensus that war with the U.S. over Taiwan would be unwinnable and result in an embarrassing defeat. Similarly, (individually deniable) automated grey zone attacks could be used to attack rival industrial output, economic growth, and military R&D, allowing the leading nation(s) to further compound their relative advantage until they reach a point of strategic dominance. Even though the U.S. never militarily defeated the Soviet Union, its economic advantage allowed it to sustain an extremely costly arms race with its rival, and the resulting economic pressure eventually contributed to the Soviet Union’s political collapse.

The problem with this strategy is that it’s very difficult to predict at what point a demand stops being sub-nuclear. The decision to escalate is a function of often arbitrary perceptions about regime survival, domestic politics, and even personal honor. A leader could absorb a great deal of pain without escalating, or overreact violently to a minor provocation that happens to hit a nerve. To compensate for this uncertainty, your AI systems would need to do two things: increase the state’s military capacity to disempower its rivals in a deniable way, and accurately simulate or manipulate those rivals’ decision making.

That is not to say that these are impossible capabilities to have. Generally superhuman AI systems will, necessarily, be superhuman in their ability to charismatically persuade decision makers, and would allow for simultaneously massive and personalized information campaigns. What’s less obvious is whether this persuasion would be strong enough to manipulate leaders on particularly vital decisions, and whether it would be “offense-dominant” against other AI systems providing counsel and analysis of its arguments. Tentatively, I expect that superpersuasion would be very effective against an ordinary human without this assistance (given that algorithmic content is already so effective at invisibly shaping preferences), but that the defensive use of AIs for epistemics would prevent decision makers from being arbitrarily manipulated (since these systems will have higher trust and the advantage of arguing for the truth).

So, to answer the relevant question: would the U.S. be capable of undermining nuclear deterrence with a large enough lead in AI? In descending order of difficulty:

Against China: Probably not. Even if the U.S. “wins” the race to AGI, it seems unlikely that the U.S. would be able to scale its defensive or offensive systems far enough and quickly enough to prevent the Chinese government from reactively investing in its second strike assurances. Although China might have a less developed triad than Russia and the U.S., as well as a smaller number of warheads, it has the distinct advantage of having its own domestic AI base, making it much more difficult for the U.S. to secure a decisive technological lead. In all likelihood, the Chinese government will be able to secure itself epistemically against AI persuasion, apply AI to automate its industrial base, and invest in novel WMDs, at least to the extent that the U.S. would be unacceptably uncertain about the success of a first strike. This uncertainty would buy the Chinese government time, with which it could reinvest in its second strike assurance, which would buy yet more time, and so on until the deluge of technological innovation from advanced AI slows down.
Against Russia: This is more interesting. Russia maintains a massive and diversified set of warheads, but it also has approximately zero ability to compete in AI. Unless another country (such as China) proactively invests in its compute stock and provides advanced models, Russia’s economy and military assets will eventually become obsolete. Imagine a situation in which the U.S. and China have started tiling their interiors with self-replicating factories, explosively growing their share of the global economy. Russia’s non-nuclear influence (e.g. economic and petro) would quickly wither away, leaving it with only the binary and unreliable influence of nuclear weapons to rely on. As the historical collapse of the Soviet Union demonstrates, it’s not necessary to militarily defeat a rival to disempower them: instead, it may be sufficient to simply outgrow and outlast them until they are vulnerable to political collapse.
Small nuclear powers: The remaining states are significantly easier to disempower. All of them have significantly smaller warhead counts (making their retaliatory strikes easier to defend against, since they are less able to saturate a defense) and a less-developed nuclear triad than their great power peers. They’re also, for the most part, significantly less self-reliant than China and Russia, increasing the amount of non-nuclear leverage the leading states can apply (see: China’s implicit influence over Pyongyang through its control of coal and food imports). Even more so than Russia, these countries are at long-term risk of becoming vassal states purely through economic obsolescence, and are significantly more susceptible to a disarming strike.

Overall, I expect that conventional nuclear deterrence will primarily serve as a means to buy time for a state to advance its own AI capabilities and to diversify its second strike assurances accordingly. If a nuclear state has no capacity to deploy or develop AI, then this time will not be useful, and it will eventually be destroyed through a combination of advanced technology and industrial attrition.