The most common assumption on Less Wrong has been that once AGI is created, it will swiftly become superintelligent, and humans will lose all power over it. Under this paradigm, for an aligned future to occur, it is necessary and sufficient that the first self-enhancing AGI is aligned and retains that alignment as it becomes superintelligent. If it is aligned and remains aligned, it will build an aligned future, no matter what troubles humans might create. Conversely, if it is unaligned or becomes unaligned, no amount of human effort or ingenuity will be able to compensate for that.
Under this paradigm, AGI alignment has a chance of being achieved in worlds that are arbitrarily misguided or dystopian, so long as the first self-enhancing AGI turns out to be properly aligned.
This is the paradigm associated with Eliezer Yudkowsky. A contrasting paradigm, associated with Robin Hanson, holds that social structures, such as politics, economics, and culture, will always have more power than any individual intelligence. On that view, social arrangements continue to matter for alignment even after AGI is created, not just before.