I listened to the books Arms and Influence (Schelling, 1966) and Command and Control (Schlosser, 2013). They describe dynamics around nuclear war and the safety of nuclear weapons. I think what happened with nukes can maybe help us anticipate what may happen with AGI:
Humanity can be extremely unserious about doom: it is frightening how many gambles were taken during the Cold War. At one point the US had such a breakdown in communication that it planned to defend Europe with massive nuclear strikes when it only had a few barely operational nukes; there were many near misses; and hierarchies often hid how bad the security of nukes was, resulting in inadequate systems and lost nukes, etc.
I was most surprised to see how we almost didn't have a nuclear taboo: according to both books, this is something that was actively debated post-WW2!
But how nukes are handled can also help us see what it looks like to be actually serious:
It is possible to spend billions building security systems, e.g. applying the 2-person rule and installing codes in hundreds of silos
even when these reduce how efficient the nuclear arsenal is, e.g. because of the tradeoff between how reliably a weapon fires when you decide to trigger it and how reliably it stays inert when you decide not to (similar to usefulness vs safety tradeoffs in control scaffolds; a toy numeric sketch follows below)
(the deployment of safety measures was slower than would be ideal and was in part driven by incidents, but was more consequential than current investments in securing AIs)
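As a toy numeric sketch of this "always/never" tradeoff (my framing; all numbers below are made-up assumptions, not figures from the books): a design knob that adds safety features makes the weapon less likely to fire when commanded and less likely to detonate by accident.

```python
# Toy model of the "always/never" tradeoff in weapon (or AI scaffold) design.
# All numbers are illustrative assumptions, not real figures.

def arsenal_outcomes(safety: float, n_weapons: int = 1000, years: int = 30):
    """safety in [0, 1]: more safety features -> less likely to fire at all."""
    p_fires_on_command = 0.99 - 0.20 * safety          # "always" side degrades
    p_accident_per_weapon_year = 1e-4 * (1 - safety)   # "never" side improves
    expected_accidents = n_weapons * years * p_accident_per_weapon_year
    return p_fires_on_command, expected_accidents

for s in (0.0, 0.5, 0.9):
    rel, acc = arsenal_outcomes(s)
    print(f"safety={s:.1f}: fires on command {rel:.0%}, "
          f"expected accidental detonations over 30y = {acc:.2f}")
```

With these made-up numbers, going from no safety features to heavy ones cuts expected accidents by 10x at the cost of roughly a fifth of on-command reliability; the real design question is where on that curve to sit.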
It is possible to escalate concerns about the risk of certain deployments (like airborne alerts) up to the President and get them cancelled (though this might require the urgency of the deployment to not be too high)
It is possible to have major international agreements (e.g. test ban treaties)
Technical safety is contingent and probably matters: technical measures like one-point safety (which was almost measured incorrectly!) or strong-link/weak-link design probably avoided some accidental nuclear explosions (which might have triggered a nuclear war), and they required non-trivial technical insights and quiet heroes to push them through
And you'll probably never hear about technical safety measures! (I had never heard of these before listening to Command and Control)
I suspect it might be similar for AI x-risk safety
Fiction, individual responsibility, and public pressure are powerful forces.
Red Alert (the 1958 novel that inspired Dr. Strangelove) was one of the key elements that made people more willing to spend on avoiding accidental nuclear war.
Some of the mitigations might not have been implemented if the person pushing for them hadn't sent a written report to a higher-up, such that the higher-up would have been blamed if something bad had happened.
Pressure from journalists around nuclear accidents also helped greatly. Some nuclear scientists regretted not sharing more about the risk of nuclear accidents with the public.
The game theory of war is complex and contingent on
random aspects of the available technology:
Mass mobilization is a technology that makes conflicts like WW1 hard to avoid: if you mobilize, de-mobilizing leaves you weak against counter-mobilization, and if your adversary does not mobilize in time you can crush them before they have a chance to mobilize. This makes the situation very explosive, with few chances to back down (see the toy game after this sub-list).
Nuclear weapons are a technology that makes negotiations during a war more difficult and negotiations before the war more important, as they enable inflicting large amounts of damage extremely quickly and with relatively low uncertainty. They do not easily enable inflicting damage slowly (you can't easily launch a nuke that causes a small amount of damage over time, or lots of damage later unless some agreement is reached).
Things could change a lot when the AI race becomes an important aspect of military strategy; induction from the last 80 years will likely not be very useful, similar to how naive induction from 1865-1945 is not very helpful for anticipating the dynamics of the Cold War.
Dynamics might be especially bad because the AI race involves large amounts of uncertainty and quick evolution of relative power, and might inflict damage too quickly for human negotiations. There is no theorem that says we'll get some Pareto-optimal outcome; we failed to get those many times in history.
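A minimal payoff-matrix sketch of the mobilization dynamic above (the payoffs are my own illustrative assumptions): mobilizing is the best response no matter what the adversary does, even though mutual restraint is better for both sides.

```python
# Toy 2x2 mobilization game; payoffs are illustrative assumptions.
# Rows: our choice; columns: adversary's choice. Entries: (our payoff, theirs).
HOLD, MOBILIZE = 0, 1
payoffs = {
    (HOLD, HOLD):         (  0,   0),   # peace
    (HOLD, MOBILIZE):     (-10,   5),   # we get crushed before mobilizing
    (MOBILIZE, HOLD):     (  5, -10),   # we can crush them
    (MOBILIZE, MOBILIZE): ( -3,  -3),   # explosive standoff, likely war
}

def best_response(their_move):
    return max((HOLD, MOBILIZE), key=lambda my: payoffs[(my, their_move)][0])

for their in (HOLD, MOBILIZE):
    print(f"adversary={'mobilize' if their else 'hold'} -> "
          f"best response: {'mobilize' if best_response(their) else 'hold'}")
# Mobilizing dominates, so both sides mobilize and land on (-3, -3),
# even though (hold, hold) = (0, 0) is better for both.
```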
norms and available commitment mechanisms:
Brinksmanship: if you can reliably commit to bringing yourself and your opponent close to the brink, and make it sufficiently obvious that only your opponent can move you away from the brink (reducing the ongoing risk of mass damage) by backing down, you can get concessions as long as both sides suffer sufficiently from a war (a toy expected-cost version is sketched after this sub-list).
Similar dynamics might exist around AGI, where there might be incentives to push the capabilities frontier forward (with some ongoing risk from the unknown point where capabilities turn deadly) unless the opponent makes some kind of concession
There is a big distinction in current norms between "preventing someone from doing something you don't want" (deterrence) and "making someone do something you want" (compellence): given current norms, nuclear deterrence is quite effective while nuclear compellence is basically non-existent.
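And a minimal expected-cost sketch of the brinksmanship logic above (my rendering of Schelling's "threat that leaves something to chance"; all numbers are illustrative assumptions): standing at the brink imposes an ongoing risk of disaster per period, so the opponent should concede whenever the concession is cheaper than the expected loss from waiting you out.

```python
# Brinksmanship as a "threat that leaves something to chance".
# Illustrative numbers, not from the source.
risk_per_period = 0.02      # chance per period that the standoff blows up
cost_of_war = 1000          # opponent's loss if it does
periods_committed = 20      # how long we can credibly hold the position
cost_of_concession = 300    # what we are demanding from the opponent

# Expected loss from waiting us out: P(at least one blow-up) * cost of war.
p_blowup = 1 - (1 - risk_per_period) ** periods_committed
expected_loss_at_brink = p_blowup * cost_of_war

print(f"P(disaster over {periods_committed} periods) = {p_blowup:.1%}")
print(f"expected loss at the brink = {expected_loss_at_brink:.0f}")
print("concede" if cost_of_concession < expected_loss_at_brink else "hold out")
# With these numbers the opponent should concede; the whole trick is making
# the commitment to stay at the brink credible.
```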
There are important forces pushing for automating some important war decisions. The USSR built a system (never activated) that would automatically nuke the US if a nuclear attack was detected. Obviously this is very risky, like giving AIs the nuclear button, since the system might trigger even when it is not supposed to; but there were arguably good reasons for it, and similar reasons may bite us in the case of AI in the military:
Distributing power to many humans is very dangerous when it comes to things like nuclear weapons, as a single crazy individual might start a nuclear war (similar to how some plane crashes are intentionally caused by depressed pilots).
Concentrating power in the hands of a few individuals is scary both from a concentration-of-power perspective and because, when victim of a nuclear first strike, the people in power might be killed or their communication with troops might be cut.
Automation can give you a way to avoid disloyalty, variance from unstable individuals, and arbitrary decisions made by a few individuals in a few minutes, while being distributed and robust to first strikes. But it can also increase the risk of accidents that no human wants (the toy calculation below illustrates this tradeoff).
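A toy calculation of the tradeoff this list describes (all rates are made-up assumptions): distributing launch authority multiplies the chance that some individual goes rogue, while automation swaps that risk for a false-trigger risk.

```python
# Toy comparison: unauthorized-launch risk under distributed human authority
# vs. an automated system. All probabilities are illustrative assumptions.
p_rogue_per_person_year = 1e-5   # chance a given key-holder goes rogue in a year

def p_any_rogue(n_people: int, years: int = 30) -> float:
    """P(at least one rogue launch) with n independent key-holders."""
    return 1 - (1 - p_rogue_per_person_year) ** (n_people * years)

for n in (2, 100, 5000):
    print(f"{n:5d} key-holders: P(rogue launch over 30y) = {p_any_rogue(n):.2%}")

p_auto_false_trigger_per_year = 1e-4  # assumed false-positive rate of automation
p_auto = 1 - (1 - p_auto_false_trigger_per_year) ** 30
print(f"automated system: P(false trigger over 30y) = {p_auto:.2%}")
# Whether automation helps depends entirely on these rates, which is the point:
# it trades human unreliability for machine unreliability.
```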
Just because the situation looks very brittle doesn't mean it's doomed. I think I would have been very doomy about the risk of intentional or accidental nuclear war if I had known everything I now know about the security of nuclear weapons and the dynamics of nuclear war, without knowing that we did not have a nuclear war in 80 years.
Though with AI the dynamics are different because we are facing a new intelligent species, so I am not sure the absence of nuclear war is a very useful reference class (a back-of-the-envelope on what those 80 years buy us follows below).
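As a back-of-the-envelope on how much 80 war-free years should reassure us (this uses Laplace's rule of succession, which is my own choice of tool, not something from the books, and it assumes years are independent, identically distributed trials, which is exactly what is questionable here):

```python
# What does 80 years without nuclear war suggest about the annual risk?
# Laplace's rule of succession: (k + 1) / (n + 2), with k = 0 war-years
# observed out of n = 80. The i.i.d. assumption is the weak point for AI.
n_years, wars_observed = 80, 0
annual_risk = (wars_observed + 1) / (n_years + 2)
p_war_next_50y = 1 - (1 - annual_risk) ** 50
print(f"Laplace estimate of annual risk: {annual_risk:.2%}")        # ~1.22%
print(f"implied P(war in the next 50 years): {p_war_next_50y:.0%}")  # ~46%
```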
I'd be keen to talk to people who worked on technical nuclear safety; I would guess that some of their knowledge of how to handle uncertain risk and prioritization might transfer!
It gets much worse than this. I've been reading through Ellsberg's recollections of being a nuclear war planner for the Kennedy administration, and it's striking just how many people had effectively unilateral launch authority. The idea that the president is the only person who can launch a nuke has never really been true, but it was especially untrue back in the 50s and 60s, when we used to routinely delegate launch authority to commanders in the field. Hell, MacArthur's plan to win in Korea would have involved nuking the north so severely that it would be impossible for China to send reinforcements, since they'd have to cross through hundreds of miles of irradiated soil.
And this is just in America. Every nuclear state has had (and likely continues to have) its own version of this emergency delegation. What’s to prevent a high ranking Pakistani or North Korean general from taking advantage of the same weaknesses?
My takeaway from this vis-a-vis ASI is that a) having a transparent, distributed chain of command with lots of friction is important, and b) the fewer such chains of command have to exist, the better.
Dominic Cummings (former Chief Adviser to the UK PM) has written some things about nuclear strategy and how it's implemented in practice. IIUC, he's critical of (among other things) how Schelling et al.'s game-theoretic models are often naively or blindly applied to the real world.