“Alignment” is badly defined. It’s ambiguous between “making AI safe, in any way whatsoever” and “making agentive AI safe by making it share human values” (as opposed to some other approach, such as Control, or building only non-agentive AIs).
Alignment with “human values” is probably difficult, not least because “human values” isn’t a well-defined target. We can’t define it as the output of CEV (Coherent Extrapolated Volition), because CEV isn’t a mechanically specified procedure.