Going point by point:
I: It was 21 years between when North Korea signed the nonproliferation treaty and their first nuclear test. And they were very motivated. Seems to me like the treaty actually did something?
The international community was limited to “expressing concern” only because Russia had nukes. In the current war, its interventions have gone far beyond blocking some bank cards. Large amounts of material support for Ukraine’s war effort don’t seem like “no military intervention” to me. Also, nobody believes the Ukraine war represents an existential threat to humanity; if they did, I think you’d see quite a lot more intervention.
II: Different payoff structures “as most people perceive it”? Most people thinking about AI and national security see it only as an issue of getting a military edge via autonomous weapons and mass surveillance. They do not actually think AI progress could lead to something that can function as a successor species. If they did, I think they would be acting very differently. Getting to the point where people believe that seems like a major precursor to any treaty. Also, even if decision makers are AGI-pilled, they may be far less inclined toward “And then we will take over the world with AI! Or kill everyone trying!” thinking than the worst fears of rationalists would suggest.
III: A political consensus is different from a scientific consensus. National security types may have a considerably different reaction to possibilities of doom than most ML researchers.
IV: I don’t think TACO is a sufficient reason to argue that we are actually incapable of enforcing treaties. Treaties still exist, and many are still at least partially enforced. Sanctions on Iran have been maintained for decades and are still in force. But I do agree that a treaty between nuclear powers with adversarial incentives to defect is a hard problem.
V: Are you sure it’s actually replaceable by geographically diffuse CPU clusters? Isn’t part of the reason you need datacenters in the first place that training requires low-latency interconnect? And what % of the world’s CPUs would you need to replicate a frontier training run? If it’s anywhere close to 1%, that seems like a very detectable operation to me.
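For a rough sense of scale, here is a back-of-envelope sketch in Python. Every number in it is an assumption I am making up for illustration (total FLOP for a frontier run, per-CPU throughput, global CPU count, a six-month deadline), and it ignores interconnect entirely, which is exactly the latency problem above:

```python
# Back-of-envelope: what fraction of the world's CPUs would a frontier
# training run need? All constants below are illustrative assumptions,
# not measurements, and interconnect/latency is ignored entirely.

TRAINING_FLOP = 1e26                         # assumed total compute for one frontier run
CPU_FLOPS = 1e11                             # assumed sustained FLOP/s per commodity CPU
GLOBAL_CPUS = 2e9                            # assumed number of CPUs installed worldwide
WALL_CLOCK_SECONDS = 0.5 * 365 * 24 * 3600   # assume the run must finish in six months

cpus_needed = TRAINING_FLOP / (CPU_FLOPS * WALL_CLOCK_SECONDS)
fraction_of_global = cpus_needed / GLOBAL_CPUS

print(f"CPUs needed: {cpus_needed:.2e}")                           # ~6e7 under these assumptions
print(f"Fraction of all CPUs on Earth: {fraction_of_global:.1%}")  # ~3%
```

Under those made-up numbers you land in the low single-digit percent range of every CPU on the planet, which is the kind of coordinated effort I would expect to get noticed.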
Even if it is somehow replaceable, OK, should that be a thought-stopper? Is there no way to gain traction on the problem?
Though I do think that limiting algorithmic efficiency improvements by treaty is something we should be concerned about. Are there bottlenecks there? Do the relevant improvements require large amounts of compute to discover, or to validate? Could a cultural norm against algorithmic improvement be inculcated among scientists? Again, just because something seems hard doesn’t mean it’s impossible.
There’s also the danger that one of the various AGI efforts outside of LLMs, the kind that don’t need large amounts of resources, might pay off. That seems pretty scary.
VI: I don’t think a flawed treaty buys us only two years. Given how long it would take China to catch up to us if we stopped, it buys us at least a decade, unless a different state (Russia, India, the UAE?) is willing to buy up all our scientists, acquire a whole lot of compute, and start and sustain its own program for a decade while nobody does anything about it.
This also seems to assume that such programs would be undetectable, but the whole point of the treaty proposal is that massive quantities of compute would be traceable. I suppose if a state-sponsored AGI research program were to look for much less compute-intensive methods, that would be pretty scary, yes.
It also depends on what you think the relative outcomes are. What if a treaty increases the chance of s-risk from autocracies by 1%, but decreases the chance of s-risk from unaligned ASI from the current AI race by 5%?
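To spell out that arithmetic (assuming, purely for illustration, that the two outcomes are comparably bad and roughly independent):

$$\Delta P_{\text{bad outcome}} \approx (+1\%) + (-5\%) = -4\%,$$

a net four-percentage-point reduction, in which case the treaty comes out ahead on that axis.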
Re: your “better way”: this basically rounds to “massively increase technical safety research.” Yet is safety research safe? This is a whole separate issue, but most safety research tends to scale capabilities as much as or more than it scales safety. RLHF was safety research, and we got ChatGPT out of it. If we’re giving compute and money to anyone who can put the word “safety” in a grant proposal, I don’t think we’re going to actually get differentially safe safety research.
Meanwhile, policy and advocacy have gotten 10-50x less funding than technical safety research has. If you think the solutions we have available for governance are so unworkable, maybe we just have not tried hard enough?