Other (fun?) cases to think about are whether you’d rather take your chances on the individual CEV of:
someone high on sociopathy, dark triad, or other non-neurotypical traits
someone who was severely traumatized as a child or adult, e.g. a prisoner of war or abuse victim
a Buddhist monk or nun who has lived a life of extreme asceticism
an enthusiastic meth or heroin addict
someone very committed / deep into woo or spirituality (as part of an organized religion or not)
I currently think that being neurotypical along certain dimensions, non-traumatized, and not in some extreme[1] corner of spirituality / woo is probably somewhat more important (to producing good outcomes for other humans) than how morally good the extrapolee currently is (according to my values), as revealed by their words and actions.
Extrapolating the preferences of an underdeveloped entity (e.g. an animal or a baby) likely leaves most of the important bits unspecified and thus up to the extrapolator, and / or results in nothing of recognizable value to adult humans. And I agree that extrapolating the preferences of an LLM is much more likely to produce something very weird and likely valueless to humans.
[1] i.e. most ordinary religious people of any religion would be fine, if not totally optimal