Nice overview :) One point: the introductory sentences don’t seem to match the content.
> It is clear to most AI safety researchers that the idea of “human values” is underdefined, and this concept should be additionally formalized before it can be used in (mostly mathematical) models of AI alignment.
In particular, I don’t interpret most of the researchers you listed as claiming that “[human values] should be formalized”. I think that’s a significantly stronger claim than, for example, the claim that we should try to understand human values better.