Hm, I’ve noticed before that the term ‘Friendly’ is sort of vague. What would I call an AI that optimizes strictly for my goals (and if I care about others’ goals, so be it)? A Will-AI? I’ve said a few times ‘your Friendly is not my Friendly’ but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
What would I call an AI that optimizes strictly for my goals...A Will-AI?
One could say “Friendly towards Will.”
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
Agreed. And asking the question of what is preference of a specific person, represented in some formal language, seems to be a natural simplification of the problem statement, something that needs to be understood before the problem of preference aggregation can be approached.
but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
Beware of the urge to censor thoughts that disagree with authority. I personally agree that there is a serious issue here—the issue of moral antirealism, which implies that there is no “canonical human notion of goodness”, so the terminology “Friendly AI” is actually somewhat misleading, and it might be better to say “average human extrapolated morality AGI” when that’s what we want to talk about, e.g.
“an average human extrapolated morality AGI would oppose a paperclip maximizer”.
Then it sounds less onerous to say that you disagree with what an average human extrapolated morality AGI would do than that you disagree with what a “Friendly AI” would do, because most people on this forum disagree with averaged-out human morality (for example, the average human is a theist). Contrast:
“What, you disagree with the FAI? Are you a bad guy then?”
“Friendly AI” is about as specific/ambiguous as “morality”—something humans mostly have in common, allowing for normal variation, not referring to details about specific people. As with preference (morality) of specific people, we can speak of FAI optimizing the world to preference of specific people. Naturally, for each given person it’s preferable to launch a personal-FAI to a consensus-FAI.
Hm, I’ve noticed before that the term ‘Friendly’ is sort of vague. What would I call an AI that optimizes strictly for my goals (and if I care about others’ goals, so be it)? A Will-AI? I’ve said a few times ‘your Friendly is not my Friendly’ but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
One could say “Friendly towards Will.”
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
Agreed. And asking the question of what is preference of a specific person, represented in some formal language, seems to be a natural simplification of the problem statement, something that needs to be understood before the problem of preference aggregation can be approached.
Beware of the urge to censor thoughts that disagree with authority. I personally agree that there is a serious issue here—the issue of moral antirealism, which implies that there is no “canonical human notion of goodness”, so the terminology “Friendly AI” is actually somewhat misleading, and it might be better to say “average human extrapolated morality AGI” when that’s what we want to talk about, e.g.
Then it sounds less onerous to say that you disagree with what an average human extrapolated morality AGI would do than that you disagree with what a “Friendly AI” would do, because most people on this forum disagree with averaged-out human morality (for example, the average human is a theist). Contrast:
“Friendly AI” is about as specific/ambiguous as “morality”—something humans mostly have in common, allowing for normal variation, not referring to details about specific people. As with preference (morality) of specific people, we can speak of FAI optimizing the world to preference of specific people. Naturally, for each given person it’s preferable to launch a personal-FAI to a consensus-FAI.