One idea that I have been toying with since I read Eliezer’s various posts on the complexity of value is that the best moral system might not turn out to be about maximizing satisfaction of any and all preferences, regardless of what those preferences are. Rather, it would be about increasing the satisfaction of various complex, positive human values, such as “Life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.” If this is the case, then it may well be that horribly malevolent preferences, such as those the Nazis in this thought experiment exhibit, are simply not the sort of preferences that it is morally good to satisfy. Obviously, judging which values are “worthy” of being increased is a difficult problem that creates huge moral hazards for whoever is doing the judging, but that is an implementation problem, not a problem with the general principle.
If this line of reasoning is correct, then the reason preference utilitarianism seems so intuitively persuasive is that, since most of our moral reasoning deals with humans, maximizing whatever a human being prefers is, most of the time, pretty much guaranteed to achieve a great many of those human values. For this reason, “human values utilitarianism” and preference utilitarianism generate the same answers in most real-world scenarios. However, there might be a few preferences that human beings are theoretically capable of having but that aren’t morally good to maximize. I think one of these morally bad preferences is sheer, total malevolence: hating someone and wanting to hurt them as an end in itself.
This theory would also explain why people feel that it would be a bad thing if an AI were to kill/sterilize all the human beings in existence and replace them with creatures whose preferences are easier to satisfy. Such an action would result in increased preference satisfaction, but they’d be the wrong kind of preferences: they wouldn’t be positive human values. (Please note that though I refer to these as “human values,” I am not advocating speciesism. A nonhuman creature who had similar values would be just as morally significant as a human.)
This gets a little more complicated if we change the thought experiment slightly and assume that the Nazis’ Jew-hatred is ego-dystonic rather than ego-syntonic. That is, the conscious, rational “approving” parts of their brains don’t want to hurt Jews, but the subconscious “liking” parts of their brains feel incredible pain and psychological distress from knowing that Jews exist. We assume that this part of their brains cannot be changed.
If this is the case then the Nazis are not attempting to satisfy some immoral, malevolent preference. They are simply trying to satisfy the common human preference to not feel psychological pain. Killing the Jews to save the Nazis from such agony would be equivalent to killing a small group of people to save a larger group from being horribly tortured. I don’t think the fact that the agent doing the torturing is the Nazis’ own subconscious mind, instead of an outside agent, is important.
However, since the Nazis’ preference is “I don’t want my subconscious to hurt me because it perceives Jews,” rather than “I want to make the statement ‘all Jews are dead’ true,” there is an obvious solution to this dilemma: trick the Nazis into thinking the Jews are dead without actually killing them. That would remove their psychological torment while preserving the lives of the Jews. It would not create the usual moral dilemmas associated with deception, because in this variant of the thought experiment the Nazis’ preference isn’t to kill the Jews, it’s to not feel pain from believing Jews exist.
Of course, if you have the option of lying, the problem becomes trivial and uninteresting, regardless of your model of the Nazi psyche. It’s when your choice requires improving the life of one group at the expense of another group’s suffering that you tend to face a repugnant conclusion.
Of course, if you have the option of lying, the problem becomes trivial and uninteresting, regardless of your model of the Nazi psyche.
In the original framing of the thought experiment the reason lying wasn’t an option was because the Nazis didn’t want to believe that all the Jews were dead, they wanted the Jews to really be dead. So if you lied to them you wouldn’t really be improving their lives because they wouldn’t really be getting what they wanted.
By contrast, if the Nazis simply feel intense emotional pain at the knowledge that Jews exist, and killing Jews is an instrumental goal towards preventing that pain, then lying is the best option.
You’re right that that makes the problem trivial. The reason I addressed it at all was that my original thesis was “satisfying malicious preferences is not moral.” I was afraid someone might challenge this by emphasizing the psychological pain and distress the Nazis might feel. However, if that is the case, then the problem changes from “Is it good to kill people to satisfy a malicious preference?” to “Is it good to kill people to prevent psychological pain and distress?”
I still think that “malicious preferences are morally worthless” is a good possible solution to this problem, providing one has a sufficiently rigorous definition of “malicious.”
In the original framing of the thought experiment the reason lying wasn’t an option was because the Nazis didn’t want to believe that all the Jews were dead, they wanted the Jews to really be dead. So if you lied to them you wouldn’t really be improving their lives because they wouldn’t really be getting what they wanted.
Maybe you misunderstand the concept of lying. They would really believe that all Jews are dead if successfully lied to, so their stress would decrease just as much as if they all were indeed dead.
I still think that “malicious preferences are morally worthless” is a good possible solution to this problem, providing one has a sufficiently rigorous definition of “malicious.”
This is more interesting. Here are the definitions:
Assumption: it is possible to separate a person’s overall happiness level into components (factors), which could be additive, multiplicative, or separable in some other way. This does not seem overly restrictive.
Definition 1: A component of personal happiness resulting from others being unhappy is called “malicious”.
Definition 2: A component of personal happiness resulting from others being happy is called “virtuous”.
Definition 3: A component of personal happiness that is neither malicious nor virtuous is called “neutral”.
Now your suggestion is that malicious components do not count toward global decision making at all. (Virtuous components possibly count more than neutral ones, though this could already be accounted for.) Thus we ignore any suffering inflicted on Nazis due to Jews existing/prospering.
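Under the additive case of the separability assumption, the suggestion can be sketched numerically. This is a purely illustrative sketch: the component tags, weights, and utility numbers are my own assumptions, not anything from the thought experiment.

```python
# Each person's satisfaction is decomposed into tagged components
# (additive case). Per the suggestion above, malicious components
# are simply excluded from the social aggregate.

MALICIOUS, VIRTUOUS, NEUTRAL = "malicious", "virtuous", "neutral"

def social_utility(population, virtuous_weight=1.0):
    """Sum everyone's satisfaction components, dropping malicious ones.

    Setting virtuous_weight > 1 would implement the optional idea
    that virtuous components count more than neutral ones.
    """
    total = 0.0
    for components in population:
        for kind, value in components:
            if kind == MALICIOUS:
                continue  # morally worthless: contributes nothing
            elif kind == VIRTUOUS:
                total += virtuous_weight * value
            else:
                total += value
    return total

# Illustrative numbers: a Nazi who would gain 10 utils of malicious
# satisfaction from the Jews' deaths contributes nothing on that account.
nazi = [(NEUTRAL, 5.0), (MALICIOUS, 10.0)]
bystander = [(NEUTRAL, 5.0), (VIRTUOUS, 2.0)]
print(social_utility([nazi, bystander]))  # 12.0, not 22.0
```

Note that the decision rule only zeroes out the malicious component itself; the Nazi’s neutral components still count normally, so this is not a rule for discounting people, only certain preferences.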
They would really believe that all Jews are dead if successfully lied to, so their stress would decrease just as much as if they all were indeed dead.
If this is the case, then the Nazis do not really want to kill the Jews. What they really want to do is decrease their stress; killing Jews is just an instrumental goal to achieve that end. My understanding of the original thought experiment was that killing Jews was a terminal value for the Nazis, something they valued for its own sake regardless of whether it helped them achieve any other goals. In other words, even if you were able to modify the Nazi brains so they didn’t feel stress at the knowledge that Jews existed, they would still desire to kill them.
Does this sound right?
Yes, that’s exactly the point I was trying to make, although I prefer the term “personal satisfaction” rather than “personal happiness” to reflect the possibility that there are other values than happiness.