Good explanation. Thank you. I think the remaining disagreement might boil down to semantics. But what exactly is the categorical difference between paperclip maximizers and power maximizers or pain maximizers? Clippy seems to be an intelligent agent with intentions and values; what ingredient is missing from evil pie?
I suppose I think of the missing ingredients like this:
If a Paperclipper has certain non-paperclip-related underlying desires, believes in paperclip maximization as an ideal and sometimes has to consciously override those baser desires in order to pursue it, and judges other agents negatively for not sharing this ideal, then I would say its morality is badly miscalibrated or malfunctioning. If it was built from a design characterized by a base desire to maximize paperclips combined with a higher-level value-acquisition mechanism that normally overrides this desire with more pro-social values, but somehow this Paperclipper unit fails to do so and therefore falls back on that instinctive drive, then I would say its morality mechanism is disabled. I could describe either as “evil”. (The former is comparable to a genocidal dictator who sincerely believes in the goodness of their actions. The latter is comparable to a sociopath, who has no emotional understanding of morality despite belonging to a class of beings who mostly do and are expected to.)
But, as I understand it, neither of those is the conventional description of Clippy. We tend to use “values” as a shortcut for referring to whatever drives some powerful optimization process, but to avoid anthropomorphism, we should distinguish between moral values — the kind we humans are used to: values associated with emotions, values that we judge others for not sharing, values we can violate and then feel guilty about violating — and utility-function values, which just are. I’ve never seen it implied that Clippy feels happy about creating paperclips, or sad when something gets in the way, or that it cares how other people feel about its actions, or that it judges other agents for not caring about paperclips, or that it judges itself if it strays from its goal (or that it even could choose to stray from its goal). Those differences suggest to me that there’s nothing in its nature enough like morality to be immoral.
I think it comes down to the same ‘accepting him as a person’ thing that Kevin was talking about. My position is that if it talks like a person and generally interacts like a person, then it is a person. People can be evil. This Clippy is an evil person.
(That said, I don’t usually have much time for using labels like ‘evil’ except for illustrative purposes. ‘Evil’ is mostly a symbol used to make other people do what we want, after all.)