Nice overview :) One point: the introductory sentences don’t seem to match the content.
> It is clear to most AI safety researchers that the idea of “human values” is underdefined, and this concept should be additionally formalized before it can be used in (mostly mathematical) models of AI alignment.
In particular, I don’t interpret most of the researchers you listed as claiming that “[human values] should be formalized”. I think that’s a significantly stronger claim than, for example, the claim that we should try to understand human values better.