I think a slightly more general version of this question, about human values rather than CEV specifically, touches on a fairly important point.
If we want a system to fulfill our best wishes, it needs to learn what those wishes are from its models of us, and if too few of us spend time working out what we would want in an ideal world, the dataset it's working from will be impoverished, perhaps to the point of causing problems.
I think addressing this is less pressing than other parts of the alignment problem, since it's plausible we can punt it to after the intelligence explosion, but it would still be nice to start some project to collect information about idealized human values.