This mostly seems to be an argument for: “It’d be nice if no pivotal act is necessary”, but I don’t think anyone disagrees with that.
As for “Should an AGI company be doing this?”, the obvious answer is “It depends on the situation”. It’s clearly nice if it’s not necessary. Similarly, if [the world does the enforcement] has higher odds of success than [the AGI org does the enforcement], then it’s clearly preferable—but it’s not clear that would be the case.
I think it’s rather missing the point to call it a “pivotal act philosophy” as if anyone values pivotal acts for their own sake. Some people just think they’re plausibly necessary—as are many unpleasant and undesirable acts. Obviously this doesn’t imply they should be treated lightly, or that the full range of more palatable options shouldn’t be carefully considered.
I don’t buy that an intention to perform pivotal acts is a significant race-dynamic factor: incentives to race seem over-determined already. If we could stop the existing race, I imagine most pivotal-act advocates would think a pivotal act much less likely to be necessary.
Depending on the form an aligned AGI takes, it’s also not clear that the developing organisation gets to decide/control what it does. Given that special-casing avoidance of every negative side-effect is a non-starter, an aligned AGI will likely need a very general avoids-negative-side-effects mechanism. It’s not clear to me that an aligned AGI that knowingly permits significant avoidable existential risk (without some huge compensatory upside) is a coherent concept.
If you’re allowing a [the end of the world] side-effect, what exactly are you avoiding, and on what basis? As soon as your AGI takes on any large-scale long-term task, then [the end of the world] is likely to lead to a poor outcome on that task, and [prevent the end of the world] becomes an instrumental goal.
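To make the instrumental-goal point concrete, here’s a minimal toy sketch (my own illustrative numbers, not anything from the original discussion): for any long-horizon task, the task’s expected value collapses if the world ends first, so even a costly reduction in existential risk comes out ahead.

```python
# Toy expected-value sketch (illustrative numbers only): a long-term task is worth
# nothing if the world ends before it completes, so reducing that probability is
# instrumentally valuable even when prevention has a direct cost.

def expected_task_value(p_catastrophe, value_if_completed, prevention_cost=0.0):
    """Expected value of a long-horizon task, treating catastrophe as total loss."""
    return (1 - p_catastrophe) * value_if_completed - prevention_cost

# Ignore existential risk: 10% chance the world ends over the task horizon.
ignore = expected_task_value(p_catastrophe=0.10, value_if_completed=100)

# Spend 5 units of value on prevention that cuts the risk to 1%.
prevent = expected_task_value(p_catastrophe=0.01, value_if_completed=100, prevention_cost=5)

print(ignore, prevent)  # 90.0 vs 94.0 -> prevention wins despite its cost
```

The exact numbers don’t matter; the point is just that [prevent the end of the world] falls out of almost any sufficiently long-term objective.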
Forms of AGI that just do the pivotal act, whatever the creators might think about it, are at least plausible. I assume this will be an obvious possibility for other labs to consider in planning.
This mostly seems to be an argument for: “It’d be nice if no pivotal act is necessary”, but I don’t think anyone disagrees with that.
It’s arguing that, given that your organization has scary (near) AGI capabilities, it is not so much harder (to get a legitimate authority to impose an off-switch on the world’s compute) than (to ‘manufacture your own authority’ to impose that off-switch) such that it’s worth avoiding the cost of (developing those capabilities while planning to manufacture authority). Obviously there can be civilizations where that’s true, and civilizations where that’s not true.