No one else seems to be giving what is IMO the correct answer: I want the values of a created FAI to match my own, extrapolated. I.e., moral selfishness.
I would actually prefer that the extrapolation seed be drawn only from SI supporters (or ideally just me, but that’s unlikely to fly), because I’m uneasy about what happens if some of my values turn out to be memetic, and they get swamped/outvoted by a coherent extrapolated deathist or hedonist memplex. Or if you include, for example, uplifted sharks in the process.
I too would prefer super AI to look to my values when deciding what to implement.
But, given the existence of moral disagreement, I don’t see why that deserves to be labeled Friendly. The whole point of CEV, or any similar process, is to figure out what is awesome for humanity. Implementing something other than what is awesome for all of humanity is not Friendly.
If deathism really is what is awesome for all humanity, I expect an FAI to implement deathism. But there’s no particular reason to believe that deathism is what is awesome for humanity.
Tim, your comment highlights the potential conflict between CEV and FAI that I also mentioned previously. FAI is by definition not hostile to human beings, whereas CEV might permit, or even require, the extinction of all humanity. This may happen, for instance, if the process of coherent extrapolation shows that humans value certain superior beings more than they value themselves, and if the coexistence of humans and these beings is impossible.
When I pointed out this problem, both Kaj Sotala and Michael Anissimov replied that CEV can never condone hostile actions towards humanity because FAI is “defined as ‘human-benefiting, non-human harming’”. However, this reply just proves my point, namely that there is a potential internal inconsistency between CEV and FAI.
Don’t look at me to resolve that conflict. I think moral extrapolation is unlikely to output anything coherent if the reference class is large enough to avoid the objections I raised above, and I can’t think of any other plausible candidate process for producing Friendly instructions for an AI.