Early on you mention physical attacks to destroy offline backups; these attacks would be highly visible and would contradict the dark forest nature the scenario.
Perfect concealment and perfect attacks are in tension. The AI supposedly knows the structure and vulnerabilities of the systems hosting an enemy AI, but finding these things out for sure requires intrusion, which can be detected. The AI can hold off on attacking and work off of suppositions, but then the perfect attack is not guaranteed, and the attack could fail due to unknowns.
Other notes:
Why do you assume that AIs bias towards perfect, deniable strikes? An AI that strikes first can secure an early advantage; for example, if it can knock out all running copies of an enemy AI, restoring from backups will take time and leave the enemy AI vulnerable. As another example, if AI Alpha knows it is less capable than AI Bravo, but that AI Bravo will wait to attack it perfectly, AI Alpha attacking first (imperfectly) can force AI Bravo to abandon all previous attack preparations to defend itself (see maneuver warfare).
“Defend itself” might be better put as re-taking and re-securing compromised systems; relatedly, I think cybersecurity defense is much less of an active action than this analysis seems to assume?
An extension of your game theory analysis implies that the US should have nuked the USSR in the 1950s, and should have been nuking all other nuclear nations over the last 70 years. This seems weird? At least, I expect it to not be persuasive to folks thinking about AI society.
The stylistic choice I disagree with most is the bolding: if a short paragraph has 5 different bolded statements, then… what’s the point?
The physical attacks may be highly visible, but not their source. An AGI could deploy autonomous agents with no clear connection back to it, manipulate human actors without them realising, or fabricate intelligence to create seemingly natural accidents. The AGI itself remains invisible. While this increases the visibility of an attack, it does not expose the AGI. It wouldn’t be a visible war—more like isolated acts of sabotage. Good point to raise, though.
You bring up manoeuvre warfare, but that assumes AI operates under constraints similar to human militaries. The reason to prefer perfect, deniable strikes is that failure in an early war phase means immediate extinction for the weaker AGI. Imperfect attacks invite escalation and countermeasures—if AGI Alpha attacks AI Bravo first but fails, it almost guarantees its own destruction. In human history, early aggression sometimes works—Pearl Harbour, Napoleon’s campaigns—but other times it leads to total defeat—Germany in WW2, Saddam Hussein invading Kuwait. AIs wouldn’t gamble unless they had no choice. A first strike is only preferable when not attacking is clearly worse. Of course, if an AGI assesses that waiting for the perfect strike gives its opponent an insurmountable edge, it may attack earlier, even if imperfectly. But unless forced, it will always prioritise invisibility.
This difference in strategic incentives is why AGI war operates under a different logic than human conflicts, including nuclear deterrence. The issue with the US nuking other nations is that nuclear war is catastrophically costly—even for the “winner.” Beyond the direct financial burden, it leads to environmental destruction, diplomatic fallout, and increased existential risk. The deterrent is that nobody truly wins. An AGI war is entirely different: there is no environmental, economic, or social cost—only a resource cost, which is negligible for an AGI. More importantly, eliminating competition provides a definitive strategic advantage with no downside. There is no equivalent to nuclear deterrence here—just a clear incentive to act first.
Bolding helps emphasise key points for skimmers, which is a large portion of online readers. If I could trust people to read every word deeply, I wouldn’t use it as much. When I compile my essays into a book, I’ll likely reduce its use, as book readers engage differently. In a setting like this, however, where people often scan posts before committing, bolding increases retention and ensures critical takeaways aren’t missed.
Some logical nits:
Early on you mention physical attacks to destroy offline backups; these attacks would be highly visible and would contradict the dark forest nature the scenario.
Perfect concealment and perfect attacks are in tension. The AI supposedly knows the structure and vulnerabilities of the systems hosting an enemy AI, but finding these things out for sure requires intrusion, which can be detected. The AI can hold off on attacking and work off of suppositions, but then the perfect attack is not guaranteed, and the attack could fail due to unknowns.
Other notes:
Why do you assume that AIs bias towards perfect, deniable strikes? An AI that strikes first can secure an early advantage; for example, if it can knock out all running copies of an enemy AI, restoring from backups will take time and leave the enemy AI vulnerable. As another example, if AI Alpha knows it is less capable than AI Bravo, but that AI Bravo will wait to attack it perfectly, AI Alpha attacking first (imperfectly) can force AI Bravo to abandon all previous attack preparations to defend itself (see maneuver warfare).
“Defend itself” might be better put as re-taking and re-securing compromised systems; relatedly, I think cybersecurity defense is much less of an active action than this analysis seems to assume?
An extension of your game theory analysis implies that the US should have nuked the USSR in the 1950s, and should have been nuking all other nuclear nations over the last 70 years. This seems weird? At least, I expect it to not be persuasive to folks thinking about AI society.
The stylistic choice I disagree with most is the bolding: if a short paragraph has 5 different bolded statements, then… what’s the point?
The physical attacks may be highly visible, but not their source. An AGI could deploy autonomous agents with no clear connection back to it, manipulate human actors without them realising, or fabricate intelligence to create seemingly natural accidents. The AGI itself remains invisible. While this increases the visibility of an attack, it does not expose the AGI. It wouldn’t be a visible war—more like isolated acts of sabotage. Good point to raise, though.
You bring up manoeuvre warfare, but that assumes AI operates under constraints similar to human militaries. The reason to prefer perfect, deniable strikes is that failure in an early war phase means immediate extinction for the weaker AGI. Imperfect attacks invite escalation and countermeasures—if AGI Alpha attacks AI Bravo first but fails, it almost guarantees its own destruction. In human history, early aggression sometimes works—Pearl Harbour, Napoleon’s campaigns—but other times it leads to total defeat—Germany in WW2, Saddam Hussein invading Kuwait. AIs wouldn’t gamble unless they had no choice. A first strike is only preferable when not attacking is clearly worse. Of course, if an AGI assesses that waiting for the perfect strike gives its opponent an insurmountable edge, it may attack earlier, even if imperfectly. But unless forced, it will always prioritise invisibility.
This difference in strategic incentives is why AGI war operates under a different logic than human conflicts, including nuclear deterrence. The issue with the US nuking other nations is that nuclear war is catastrophically costly—even for the “winner.” Beyond the direct financial burden, it leads to environmental destruction, diplomatic fallout, and increased existential risk. The deterrent is that nobody truly wins. An AGI war is entirely different: there is no environmental, economic, or social cost—only a resource cost, which is negligible for an AGI. More importantly, eliminating competition provides a definitive strategic advantage with no downside. There is no equivalent to nuclear deterrence here—just a clear incentive to act first.
Bolding helps emphasise key points for skimmers, which is a large portion of online readers. If I could trust people to read every word deeply, I wouldn’t use it as much. When I compile my essays into a book, I’ll likely reduce its use, as book readers engage differently. In a setting like this, however, where people often scan posts before committing, bolding increases retention and ensures critical takeaways aren’t missed.