Plan C: The leading AI company has a 2-9 month lead (relative to AI companies which aren’t willing to spend as much on misalignment concerns) and is sufficiently institutionally functional to actually spend this lead in a basically reasonable way (perhaps subject to some constraints from outside investors), so some decent fraction of it will be spent on safety.
TLDR: I expect it will be pretty difficult for a “Plan C Leading Lab” to stop scaling, even conditional on having a 2-9 month lead. There are enough uncertainties & Forces of Inertia that it will be difficult for a Plan C Leading Lab to know when to stop scaling, be confident enough to stop scaling, and fight the overall cultural/competitive forces that had been guiding it all along.
I expect the World C situation to be pretty complicated. Even conditional on having a 2-9 month lead, you need the leading company to (a) know that they are about to hit the Danger Threshold and (b) know how much lead they have over the 2nd place company.
In practice, this means that even if you have a 6 month lead in reality, you probably don’t get to use all of it, and there’s a good chance you don’t get to use any of it.
There is some probability that you don’t realize you’re about to hit the Danger Threshold (so you just keep scaling because of the race dynamics and the inertia, and you don’t spend any “extra” time on alignment). But even if you do stop before the Danger Threshold, you don’t have full visibility into the other labs. So maybe you’re like “idk how big our lead is, uh, probably 2 months, maybe 8?”
Arguably even more important: the uncertainty makes it harder for you to stop the forces of inertia. The forces of inertia are strongly in favor of “keep scaling to beat your competitors/outcompete the other guys/go full capitalism mode”, and it requires a substantial effort to steer the ship away from that inertia.
So once you factor in both sources of uncertainty (where is the Danger Threshold, where are my competitors), there’s a good chance that you just keep scaling because you’re not confident enough to initiate a massive shift away from scaling/competition/capitalism mode.