For a single person, the concept is called “extrapolated volition”. The “coherent” part is for when you combine the extrapolated volitions of many people.
All of the cohering that an individual has to do is fully resolved by the extrapolation part (e.g., by pointing out any incoherencies to them or their idealized selves and asking how those should be resolved).
As an example (I think from Arbital?) of where there can be multiple reflectively consistent extrapolations: suppose someone valued the feeling of heat in their mouth without knowing that it corresponds to either spiciness or warmth. Upon learning that heat is not ontologically basic, they can value any of {temperature-hotness, spiciness, both, neither}. They might go through motions like “which value would I have acquired instead, had I known this when things led to me valuing heat in my mouth?”; they might end up wanting to express their preferences as some combination of those, running different extrapolations and assigning some % to each. But all of this is determined by the part where we’re asking how they want to be extrapolated and how their wishes should be interpreted; the process of cohering them is a choice that’s not ours to make.
So I think it’s quite an important distinction, and I also feel like extrapolated volition and CEV are terms reserved for their original use by Yudkowsky.
“If you go back and check, you will find that I never said that extrapolating human morality gives you a single outcome. Be very careful about attributing ideas to me on the basis that others attack me as having them. The “Coherent” in “Coherent Extrapolated Volition” does not indicate the idea that an extrapolated volition is necessarily coherent. The “Coherent” part indicates the idea that if you build an FAI and run it on an extrapolated human, the FAI should only act on the coherent parts. Where there are multiple attractors, the FAI should hold satisficing avenues open, not try to decide itself.” (Eliezer Yudkowsky)
(Crossposting from Twitter.)
While the “coherent” part is predominantly about combining EVs, it’s not solely about that, according to Yudkowsky. Quoted via the Coherent Extrapolated Volition page; the original source is this comment from August 2008.