A society that relies on AI to manage its power grids, financial markets, or supply chains is inherently fragile if the human operators lack the embodied skills to manage a crisis without the AI’s assistance.
I don’t understand this. If the AI becomes truly general, then the AI, and not the humans, will have the skills necessary to avert the crisis. Did you mean something like titotal’s Slopworld? Or that a society dependent on the AGIs will have less value than a pure human society? If the latter, then I would fully agree.
I also don’t understand how degradation of human expertise could lead to subverting RL. There are domains where one can use RL with no human feedback at all, like math, coding or technical sciences.
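To illustrate the point (a minimal, hypothetical sketch, not anyone's actual training setup): in such domains the reward can be computed programmatically against a known solution or test suite, with no human labeler in the loop.

```python
# Hypothetical sketch: a verifiable reward for RL on math problems,
# requiring no human feedback -- the environment checks the answer directly.

def math_reward(problem: dict, model_answer: str) -> float:
    """Return 1.0 if the model's final answer matches the known solution, else 0.0."""
    try:
        return 1.0 if float(model_answer.strip()) == float(problem["solution"]) else 0.0
    except ValueError:
        return 0.0  # unparseable answer earns no reward

# Example usage:
print(math_reward({"solution": "42"}, " 42 "))  # -> 1.0
```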
I do somewhat agree that “a society dependent on the AGIs will have less value than a pure human society”.
With respect to subverting RL, I meant the models subverting RL so as to prevent us from aligning them, without us even realizing it. This is plausible because it is ultimately humans (rather than another model) whom we can trust to label safety-alignment data: we have the broadest coverage (since we are more creative) and we are the most alignment-oriented (since we are the ones at risk).