I can’t see how “publishing papers with alignment techniques” or “encouraging safe development with industry groups and policy standards” could be pivotal acts. To prevent anyone from building unaligned AI, you also have to prevent someone from building an unaligned AI in their garage. That requires stopping people who don’t read the alignment papers or policy standards, and who aren’t members of the industry groups, from building unaligned AI.
That, in turn, appears to me to require at least one of the following: 1) limiting garage hackers’ access to computational resources, 2) limiting garage hackers’ knowledge of the techniques needed to build unaligned AI, 3) somehow convincing all garage hackers not to build unaligned AI even though they could, or 4) surveillance and intervention to stop anyone from actually building an unaligned AI even when they have the compute and the knowledge to do it. The surveillance under option 4 could (in theory; I’m not claiming all of these possibilities are practical) be carried out by humans, by too-weak-to-be-dangerous AI, or by aligned AI.
“Publishing papers with alignment techniques” and “encouraging safe development with industry groups and policy standards” might well be useful actions. It doesn’t seem to me that anything like that can ever be pivotal. Building an actual aligned AI, of course, would be a pivotal act.
“Building an actual aligned AI, of course, would be a pivotal act.” What would an aligned AI do that would prevent anybody from ever building an unaligned AI?
I mostly agree with what you wrote. Preventing all unaligned AIs forever seems very difficult and can’t be guaranteed by soft influence and governance methods. Those would only achieve a lower degree of reliability: they might constrain governments and corporations via access to compute and critical algorithms, but they remain susceptible to bad actors who find loopholes in the system. I guess what I’m poking at is: does everyone here believe that the only way to prevent AI catastrophe is through power-grab pivotal acts that are way outside the Overton Window, e.g. burning all GPUs?
“Building an actual aligned AI, of course, would be a pivotal act.” What would an aligned AI do that would prevent anybody from ever building an unaligned AI?
My guess is that it would implement universal surveillance and intervene, when necessary, to directly stop people from building unaligned AI. Sorry, I should’ve been clearer that I was talking about an aligned superintelligent AI. Since an unaligned AI killing everyone seems pretty obviously extremely bad according to the vast majority of humans’ preferences, preventing that would be a very high priority for any sufficiently powerful aligned AI.
Thanks, that really clarifies things. Frankly I’m not on board with any plan to “save the world” that calls for developing AGI in order to implement universal surveillance or otherwise take over the world. Global totalitarianism dictated by a small group of all-powerful individuals is just so terrible in expectation that I’d want to take my chances on other paths to AI safety.
I’m surprised that these kinds of pivotal acts are not more openly debated as a source of s-risk and x-risk. Publish your plans, open yourselves to critique, and perhaps you’ll revise your goals. If not, you’ll still be in a position to follow your original plan. Better yet, you might convince the eventual decision makers of it.