“Alignment” is badly defined. It’s ambiguous between “making AI safe, in any way whatsoever” and “making agentive AI safe by making it share human values” (as opposed to some other approach, such as Control, or building only non-agentive AIs).
Alignment with “human values” is probably difficult, not least because “human values” isn’t a well-defined target. We can’t define it as the output of CEV (Coherent Extrapolated Volition), because CEV isn’t a mechanically specified procedure.