When I read the meta-ethics sequence I mostly wondered why he made it so complicated and convoluted. My own take just seems a lot simpler—which might mean it’s wrong for a simple reason, too. I’m hoping someone can help.
I see ethics as about adopting some set of axioms that define which universes are morally preferable to others, and then reasoning from those axioms to decide whether an action, given the information available, has positive expected utility.
So which axioms should I adopt? Well, one simple, coherent answer is “none”: be entirely nihilist. I would still prefer some universes over others, as I’d still have all my normal non-moral preferences, such as my appetites. But it’d be all about me, and other people’s interests would only count insofar as they were instrumental to my own.
The problem is that the typical human mind has needs that are incompatible with nihilism. Nihilism thus becomes anti-strategic: it’s an unlikely path to happiness. I feel the need to care about other people, and it doesn’t help me to pretend I don’t.[1]
So, nihilism is an anti-strategic ethical system for me to adopt, because it goes against my adapted and culturally learned intuitions about morality—what I’ll call my Emotional Moral Compass. My emotional moral compass defines my knee-jerk reactions to what’s right and what’s not. Unfortunately, these knee-jerk reactions are hopelessly contradictory. The strength of my emotional reaction to an injustice is heavily influenced by my mood, and can be primed easily. It doesn’t scale properly. It’s dominated by the connection I feel to the people involved, not by what’s happening. And I know that if I took my emotional moral compass back in time, I’d almost certainly get the wrong result on questions that now seem obvious, such as slavery.
I can’t in full reflection agree to define “rightness” with the results of my emotional moral compass, because I also have an emotional need for my beliefs to be internally consistent. I know that my emotional moral compass does not produce consistent judgments. It also does not reliably produce judgments that I would want other people to make. This is problematic because I have a need to believe that I’m the sort of person I would approve of if I were not me.
I really did try on nihilism and discard it, before trying to just follow my emotional moral compass, and discarded that too. Now I’m roughly a preference utilitarian. I’m working on trying to codify my ideas into axioms, but it’s difficult. Should I prefer universes that maximise mean weighted preferences? But then what about population differences? How do I include the future? Is there a discounting rate? The details are surprisingly tricky, which may suggest I’m on the wrong track.
Adopting different ethical axioms hasn’t been an entirely hand-waving sort of gesture. When I was in my “emotional moral compass” stage, I became convinced that a great many animals suffered a great deal in the meat industry. My answer to this was that eating meat still felt costless—I have no real empathy with chickens, cows or pigs, and the magnitude of the problem left me cold (since my EMC can’t do multiplication). I didn’t feel guilty, so my EMC didn’t compel me to do anything differently.
This dissonance got uncomfortable enough that I adopted Peter Singer’s version of preference utilitarianism as an ethical system, and began to act more ethically. I set myself a deadline of six months to become vegetarian, and resolved to tithe to the charity I determined to have maximum utility once I got a post-degree job.
If ethics are based on reasoning from axioms, how do I deal with people who have different axioms from me? Well, one convenient thing is that few people adopt terrible axioms that have them preferring universes paved in paperclips or something. Usually people’s ethics are just inconsistent.
A less convenient universe would present me with someone who had entirely consistent ethics based on completely different axioms that led to different judgments from mine, and maximising the resulting utility function would make the person feel happy and fulfilled. Ethical debate with this person would be fruitless, and I would have to regard them as On the Wrong Team. We want irreconcilably different things. But I couldn’t say I was more “right” than they, except with special reference to my definition of “right” in preference to theirs.
[1] Would I change my psychology so that I could be satisfied with nihilism, instead of preference utilitarianism? No, but I’m making that decision based on my current values. Switching utilitarian-me for nihilist-me would just put another person On the Wrong Team, which is a negative utility move based on my present utility function. I can’t want to not care while currently caring, because my current caring ensures that I care about caring.
There’s also no reason to believe that it would be easier to be satisfied with this alternate psychology. Sure, satisfying my ethics requires me to eat somewhat against my taste preferences, and my material standard of living takes an inconsequential hit. But I gain a whole other dimension of satisfaction. In other words, I get an itch that costs a lot to scratch, but having scratched it I’m better off. A similar question would be: would I choose to have zero sexual or romantic interest, if I could? I emphatically answer no.
I think your take is pretty much completely correct. You don’t fall into the trap of arguing whether “moral facts are out there” or the trap of quibbling over definitions of “right”, and you very clearly delineate the things you understand from the things you don’t.
Your argument against nihilism is fundamentally “I feel the need to care about other people, and it doesn’t help me to pretend I don’t”.
(I’ll accept for the purpose of this conversation that the empty ethical system deserves to be called “nihilism”. I would have guessed the word had a different meaning, but let’s not quibble over definitions.)
That’s not an argument against nihilism. If I want to eat three meals a day, and I want other people not to starve, and I want my wife and kids to have a good life, that’s all stuff I want. Caring for other people is entirely consistent with nihilism, it’s just another thing you want.
Utilitarianism doesn’t solve the problem of having a bunch of contradictory desires. It just leaves you trying to satisfy other people’s contradictory desires instead of your own. However, I am unfamiliar with Peter Singer’s version. Does it solve this problem?
I think the term nihilism is getting in the way here. Let’s instead talk about “the zero axiom system”. This is where you don’t say that any universes are morally preferable to any others. They may be appetite-preferable, love-for-people-close-to-you preferable, etc.
If no universes are morally preferable, one strategy is to be as ruthlessly self-serving as possible. I predict this would fail to make most people happy, however, because most people have a desire to help others as well as themselves.
So a second strategy is to just “go with the flow” and let yourself give as much as your knee-jerk guilt or sympathy-driven reactions tell you to. You don’t research charities and you still eat meat, but maybe you give to a disaster relief appeal when the people suffering are rich enough or similar enough to you to make you sympathetic.
All I’m really saying is that this second approach is also anti-strategic once you get to a certain level of self-consistency, and desire for further self-consistency becomes strong enough to over-rule desire for some other comforts.
I find myself in a bind where I can’t care about nothing, and I can’t just follow my emotional moral compass. I must instead adopt making the world a better place as a top-level goal, and work strategically to make that happen. That requires me to adopt some definition of what constitutes a better universe that isn’t rooted in my self-interest. In other words, my self-interest depends on having goals that don’t themselves refer to my self-interest. And I have to hold those goals entirely in good faith. I can’t fake this, because that contradicts my need for self-consistency.
In other words, I’m saying that someone becomes vegetarian when their need for a consistent self-image about whether they behave morally starts to over-rule the sensory, health and social benefits of eating meat. Someone starts to tithe to charity when their need for moral consistency starts to over-rule their need for an extra 10% of their income.
So you can always do the calculations about why someone did something, and take it back to their self-interest, and what strategies they’re using to achieve that self-interest. Utilitarianism is just the strategy of adopting self-external goals as a way to meet your need for some self-image or guilt-reassurance. But it’s powerful because it’s difficult to fake: if you adopt this goal of making the world a better place, you can then start calculating.
There are some people who see the fact that this is all derivable from self-interest, and think that it means it isn’t moral. They say “well okay, you just have these needs that make you do x, y or z, and those things just happen to help other people. You’re still being selfish!”.
This is just arguing about the meaning of “moral”, and defining it in a way that I believe is actually impossible. What matters is that the people are helped. What matters is the actual outcomes of your actions. If someone doesn’t care what happens to other people at all, they are amoral. If someone cares only enough to give $2 to a backpacker in a koala suit once every six months, they are a very little bit moral. Someone who cares enough to sincerely try to solve problems and gets things done is very moral. What matters is what’s likely to happen.
I can’t interpret your post as a reply to my post. Did you perhaps mean to post it somewhere else?
My fundamental question was, how is a desire to help others fundamentally different from a desire to eat pizza?
You seem to be defining a broken version of the zero ethical system that arbitrarily disregards the former. That’s a strawman.
If you want to say that the zero ethical system is broken, you have to say that something breaks when people try to enact their desires, including the desires to help others.
What matters is that the people are helped.
Sorry, that’s incoherent. Someone is helped if they get things they desire. If your entire set of desires is to help others, then the solution is that your desires (such as eating pizza) don’t matter and theirs do. I don’t think you can really do that. If you can do that, then I hope that few people do that, since somebody has to actually want something for themselves in order for this concept of helping others to make any sense.
(I do believe that this morality-is-selfless statement probably lets you get positive regard from some in-group you desire. Apparently I don’t desire to have that in-group.)
I can’t interpret your post as a reply to my post. Did you perhaps mean to post it somewhere else?
I did intend to reply to you, but I can see I was ineffective. I’ll try harder.
My fundamental question was, how is a desire to help others fundamentally different from a desire to eat pizza?
Fundamentally, it’s not.
You seem to be defining a broken version of the zero ethical system that arbitrarily disregards the former. That’s a strawman.
I’m saying that there are three versions here:

1) The strawman, where there’s no desire to help others. This doesn’t describe people’s actual desires, but it is a self-consistent and coherent approach. It’s just that it wouldn’t work for most people.

2) Has a desire to help others, but this manifests in behaviour more compatible with guilt-aversion than with actually helping people. This is not self-consistent. If the aim is actually guilt-aversion, this collapses back to position 1), because the person must admit to themselves that other people’s desires are only a correlate of what they want (which is to not feel guilty).

3) Has a desire to help others, and pursues it in good faith, using some definition of which universes are preferable that does not weight their own desires over the desires of others. There’s self-reference here, because the person’s desires do refer to other people’s desires. But you can still maximise the measure even with the self-reference.
If your entire set of desires is to help others, then the solution is that your desires (such as eating pizza) don’t matter and theirs do.
But you do have other desires. You’ve got a desire for pizza, but you’ve also got a desire to help others. So if a 10% income sacrifice meant you get 10% less pizza, but someone else gets 300% more pizza, maybe that works out. But you don’t give up 100% of your income and live in a sack-cloth.
Thanks, I think I understand better. We have some progress here:
We agree that the naive model of a selfish person who doesn’t have any interest in helping others hardly ever describes real people.
We seem to agree that guilt-aversion as a desire doesn’t make sense, but maybe for different reasons.
I think it doesn’t make sense because when I say someone desires X, I mean that they prefer worlds with property X over worlds lacking that property, and I’m only interested in X’s that describe the part of the world outside of their own thought process. For the purposes of figuring out what someone desires, I don’t care if they want it because of guilt aversion or because they’re hungry or some other motive; all I care is that I expect them to make some effort to make it happen, given the opportunity, and taking into account their (perhaps false) model of how the world works.
Maybe I do agree with you enough on this that the difference is unimportant. You said:
If the aim is actually guilt-aversion, this collapses back to position 1), because the person must admit to themselves that other people’s desires are only a correlate of what they want (which is to not feel guilty).
I think you’re assuming here that people who claim a desire to help people and are really motivated by guilt-aversion are ineffective. I’m not sure that’s always true. Certainly, if they’re ineffective at helping people due to their own internal process, in practice they don’t really want to help people.
Has a desire to help others, and pursues it in good faith, using some definition of which universes are preferable that does not weight their own desires over the desires of others.
I don’t know what it means to “weight their own desires over the desires of others”. If I’m willing to donate a kidney but not donate my only liver, and the potential liver recipient desires to have a better liver, have I weighted my own desires over the desires of others? Maybe you meant “weight their own desires to the exclusion of the desires of others”.
We might disagree about what it means to help others. Personally, I don’t care much about what people want. For example, I have a friend who is alcoholic. He desires alcohol. I care about him and have provided him with room and board in the past when he needed it, but I don’t want him to get alcohol. So my compassion for him is about me wanting to move the world (including him) to the place I want it to go, not some compromise between my desires and his desires.
So I want what I want, and my actions are based on what I want. Some of the things I want give other people some of the things they want. Should it be some other way?
Now a Friendly AI is different. When we’re setting up its utility function, it has no built-in desires of its own, so the only reasonable thing for it to desire is some average of the desires of whoever it’s being Friendly toward. But you and I are human, so we’re not like that—we come into this with our own desires. Let’s not confuse the two and try to act like a machine.
Yes, I think we’re converging onto the interesting disagreements.
I think you’re assuming here that people who claim a desire to help people and are really motivated by guilt-aversion are ineffective. I’m not sure that’s always true. Certainly, if they’re ineffective at helping people due to their own internal process, in practice they don’t really want to help people.
This is largely an empirical point, but I think we differ on it substantially.
I think if people don’t think analytically, and even a little ruthlessly, they’re very ineffective at helping people. The list of failure modes is long. People prefer to help people they can see at the expense of those out of sight who could be helped more cheaply. They’re irrationally intolerant of uncertainty of outcome. They’re not properly sensitive to scale. I haven’t cited these points, but hopefully you agree. If not we can dig a little deeper into them.
I don’t know what it means to “weight their own desires over the desires of others”. If I’m willing to donate a kidney but not donate my only liver, and the potential liver recipient desires to have a better liver, have I weighted my own desires over the desires of others? Maybe you meant “weight their own desires to the exclusion of the desires of others”.
I just meant that self-utility doesn’t get a huge multiplier when compared against others-utility. In the transplant donation example, you get just as much out of your liver as whoever you might give it to. So you’d be going down N utilons and they’d be going up N utilons, and there would be a substantial transaction cost of M utilons. So liver donation wouldn’t be a useful thing to do.
In another example, imagine your organs could save, say, 10 lives. I wouldn’t do that. There are two angles here.
The first is about strategy. You don’t improve the world by being a sucker who can be taken advantage of. You do have to fight your corner, too, otherwise you just promote free-riding. If all the do-gooders get organ harvested, the world is probably not better off.
But even if extremes of altruism were not anti-strategic, I can’t say I’d do them either. There are lots of actions which I would have to admit result in extreme loss of self-utility and extreme gain in net utility that I don’t carry out. These actions are still moral, it’s just that they’re more than I’m willing to do. Some people are excessively uncomfortable about this, and so give up on the idea of trying to be more moral altogether. This is to make the perfect the enemy of the good. Others are uncomfortable about it and try to twist their definition of morality into knots to conform to what they’re willing to do.
The moral ideal is to have a self-utility weight of 1.0: i.e., you’re completely impartial to whether the utility is going to you as opposed to someone else. I don’t achieve this, and I don’t expect many other people do either.
But being able to set this selfishness constant isn’t a get-out-of-jail-free card. I have to think about the equation, and how selfish the action would imply I really am. For instance, as an empirical point, I believe that eating meat given the current practices of animal husbandry demands a very high selfishness constant. I can’t reconcile being that selfish with my self-image, and my self-image is more important to me than eating meat. So, vegetarianism, with an attempt to minimise dairy consumption, but not strict veganism, even though veganism is more moral.
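The utilon arithmetic above can be made concrete with a toy sketch. This is my own framing of the “selfishness constant”, not a formula from the thread or from Singer: an action is worth taking when the weighted sum of utility changes is positive, with self-utility multiplied by a constant s ≥ 1 (s = 1.0 being the impartial ideal).

```python
# Toy model of the "selfishness constant" idea: weight your own utility
# change by s, add everyone else's, and act when the total is positive.
# The function name and numbers are invented for illustration.

def net_weighted_utility(delta_self, delta_others, s):
    """Weighted utility change of an action.

    delta_self   -- utility change for the actor
    delta_others -- total utility change for everyone else
    s            -- selfishness constant (1.0 = perfectly impartial)
    """
    return s * delta_self + delta_others

# Liver-donation example: the actor loses N utilons, the recipient gains
# the same N, and a transaction cost M is lost along the way. The net is
# negative even for a perfectly impartial actor.
N, M = 10.0, 2.0
print(net_weighted_utility(-N, N - M, s=1.0))  # -M overall: not worth doing

# A 10% income sacrifice that buys others much more than it costs you can
# come out positive even for a moderately selfish actor.
print(net_weighted_utility(-1.0, 5.0, s=3.0))
```

On this toy model, arguments like the vegetarianism one above amount to asking: what value of s would I need to plug in to justify this action, and can I live with being that selfish?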
We might disagree about what it means to help others. Personally, I don’t care much about what people want. For example, I have a friend who is alcoholic. He desires alcohol. I care about him and have provided him with room and board in the past when he needed it, but I don’t want him to get alcohol. So my compassion for him is about me wanting to move the world (including him) to the place I want it to go, not some compromise between my desires and his desires.
Yes, there are problems with preference utilitarianism. I think some people try to get around the alcoholic example by saying something like “if their desires weren’t being modified by their alcoholism they would want x, and would want you to act as though they wanted x, so those are the true preferences.” As I write this it seems that has to be some kind of strawman, as the idea of some Platonic “true preferences” is quite visibly flawed. There’s no way to distinguish the class of preference-modifiers that includes things like alcoholism from the other preference-modifiers that together constitute a person.
I use preferences because it works well enough most of the time, and I don’t have a good alternate formulation. I don’t actually think the specifics of the metric being maximised are usually that important. I think it would be better to agree on desiderata for the measure—properties that it ought to exhibit.
Anyway. What I’m trying to say is a little clearer to me now. I don’t think the key idea is really about meta-ethics at all. The idea is just that almost everyone follows a biased, heuristic-based strategy for satisfying their moral desires, and that this strategy isn’t actually very productive. It satisfies heuristics like “I am feeling guilty, which means I need to help someone now”, but it doesn’t do very well at scratching the deeper itch to believe you genuinely make a difference.
So the idea is just that morality is another area where many people would benefit from deploying rationality. But this one’s counter-intuitive, because it takes a rather cold and ruthless mindset to carry it through.
Okay, I agree that what you want to do works most of the time, and we seem to agree that you don’t have a good solution to the alcoholism problem, and we also seem to agree that acting from a mishmash of heuristics, without any reflection or attempts to make a rational whole, will very likely flounder around uselessly.
Not to imply that our conversation was muddled by the following, but: we can reformulate the alcoholism problem to eliminate the addiction. Suppose my friend heard about that reality show guy who was killed by a stingray and wanted to spend his free time killing stingrays to get revenge. (I heard there are such people, but I have never met one.) I wouldn’t want to help him with that, either.
There’s a strip of an incredibly over-the-top vulgar comic called Space Moose that gets at the same idea. These acts of kindness aren’t positive utility, even if the utility metric is based on desires, because they conflict with the desires of the stingrays or other victims. Preferences also need to be weighted somehow in preference utilitarianism, I suppose by importance to the person. But then hmm, anyone gets to be a utility monster by just really really really really wanting to kill the stingrays. So yeah, there’s a problem there.
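The utility-monster failure mode can be shown numerically. A minimal sketch, with invented names and numbers: if each agent’s preference is weighted by self-reported intensity, one agent can dominate the aggregate just by reporting an enormous intensity.

```python
# Intensity-weighted preference aggregation, the naive scheme that the
# utility-monster objection targets. All names/numbers are illustrative.

def aggregate(preferences):
    """Pick the outcome with the highest total self-reported weight.

    preferences -- list of (outcome, intensity) pairs, one per agent.
    """
    totals = {}
    for outcome, intensity in preferences:
        totals[outcome] = totals.get(outcome, 0.0) + intensity
    return max(totals, key=totals.get)

# 1000 stingrays each mildly prefer living; one avenger "really really"
# wants them dead. With unbounded intensities, his report wins the sum.
prefs = [("stingrays_live", 1.0)] * 1000 + [("stingrays_die", 10_000.0)]
print(aggregate(prefs))  # stingrays_die -- the monster dominates
```

Capping or normalising each agent’s reported intensity is the obvious patch, which is roughly the shape of the proposed solutions mentioned in the next reply.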
I think I need to update, and abandon preference utilitarianism even as a useful correlate of whatever the right measure would be.
While it’s gratifying to win an argument, I’d rather not do it under false pretenses:
But then hmm, anyone gets to be a utility monster by just really really really really wanting to kill the stingrays.
We need a solution to the utility monster problem if we’re going to have a Friendly AI that cares about people’s desires, so it’s better to solve the utility monster problem than to give up on preference utilitarianism in part because you don’t know how to solve the utility monster problem. I’ve sketched proposed solutions to two types of utility monsters, one that has one entity with large utility and one that has a large number of entities with modest utility. If these putative solutions seem wrong to you, please post bugs, fixes, or alternatives as replies to those comments.
I agree that preference utilitarianism has the problem that it doesn’t free you from choosing how to weight the preferences. It also has the problem that you have to separate yourself into two parts, the part that gets to have its preference included in the weighted sum, and the part that has a preference that is the weighted sum. In reality there’s only one of you, so that distinction is artificial.
Why distinguish between moral and non-moral preferences? Why are moral preferences more mutable than non-moral ones?
The basic drive to adopt some sort of ethical system is essentially the same as other preferences, and is non-mutable. It’s a preference to believe that you are making the world a better place, rather than a worse place. This introduces a definitional question of what constitutes a good world and what constitutes a bad world, which is something I think people can change their minds about.
Having written that, one question that occurs to me now is, is the basic preference to believe that you’re making the world a better place, or is it to simply believe you’re a good person? I prefer people who make the world a better place, so the two produce the same outcomes for me. But other people might not. If you instead had a preference for people who followed good principles or exhibited certain virtues, you wouldn’t feel it necessary to make the world a better place. I shouldn’t assume that such people don’t exist.
So maybe instead of talking about adopting a definition of which universes are good and bad, I should talk about adopting a definition of good and bad people. If you define a good person by the consequences of their actions, then you’d go on to define which universes are good and bad. But otherwise you might instead define which principles are good, or which virtues.
So which axioms should I adopt?
Isn’t it a bit late for that question for any human, by the time a human can formulate the question?
You don’t really have the option of adopting it, just espousing it (including to yourself). No?
You really could, all else equal, because all the (other) humans have, as you said, very similar axioms rather than terrible ones.
Your argument against nihilism is fundamentally “I feel the need to care about other people, and it doesn’t help me to pretend I don’t”.
(I’ll accept for the purpose of this conversation that the empty ethical system deserves to be called “nihilism”. I would have guessed the word had a different meaning, but let’s not quibble over definitions.)
That’s not an argument against nihilism. If I want to eat three meals a day, and I want other people not to starve, and I want my wife and kids to have a good life, that’s all stuff I want. Caring for other people is entirely consistent with nihilism, it’s just another thing you want.
Utilitarianism doesn’t solve the problem of having a bunch of contradictory desires. It just leaves you trying to satisfy other people’s contradictory desires instead of your own. However, I am unfamiliar with Peter Singer’s version. Does it solve this problem?
I think the term nihilism is getting in the way here. Let’s instead talk about “the zero axiom system”. This is where you don’t say that any universes are morally preferable to any others. They may be appetite-preferable, love-for-people-close-to-you preferable, etc.
If no universes are morally preferable, one strategy is to be as ruthlessly self-serving as possible. I predict this would fail to make most people happy, however, because most people have a desire to help others as well as themselves.
So a second strategy is to just “go with the flow” and let yourself give as much as your knee-jerk guilt or sympathy-driven reactions tell you to. You don’t research charities and you still eat meat, but maybe you give to a disaster relief appeal when the people suffering are rich enough or similar enough to you to make you sympathetic.
All I’m really saying is that this second approach is also anti-strategic once you get to a certain level of self-consistency, and the desire for further self-consistency becomes strong enough to overrule the desire for some other comforts.
I find myself in a bind where I can’t care about nothing, and I can’t just follow my emotional moral compass. I must instead adopt making the world a better place as a top-level goal, and work strategically to make that happen. That requires me to adopt some definition of what constitutes a better universe that isn’t rooted in my self-interest. In other words, my self-interest depends on having goals that don’t themselves refer to my self-interest. And I have to hold those goals entirely in good faith: I can’t fake this, because faking it would contradict my need for self-consistency.
In other words, I’m saying that someone becomes vegetarian when their need for a consistent self-image about whether they behave morally starts to overrule the sensory, health and social benefits of eating meat. Someone starts to tithe to charity when their need for moral consistency starts to overrule their need for an extra 10% of their income.
So you can always do the calculations about why someone did something, and take it back to their self-interest, and what strategies they’re using to achieve that self-interest. Utilitarianism is just the strategy of adopting self-external goals as a way to meet your need for some self-image or guilt-reassurance. But it’s powerful because it’s difficult to fake: if you adopt this goal of making the world a better place, you can then start calculating.
There are some people who see the fact that this is all derivable from self-interest, and think that it means it isn’t moral. They say “well okay, you just have these needs that make you do x, y or z, and those things just happen to help other people. You’re still being selfish!”.
This is just arguing about the meaning of “moral”, and defining it in a way that I believe is actually impossible. What matters is that the people are helped. What matters is the actual outcomes of your actions. If someone doesn’t care what happens to other people at all, they are amoral. If someone cares only enough to give $2 to a backpacker in a koala suit once every six months, they are a very little bit moral. Someone who cares enough to sincerely try to solve problems and gets things done is very moral. What matters is what’s likely to happen.
I can’t interpret your post as a reply to my post. Did you perhaps mean to post it somewhere else?
My fundamental question was, how is a desire to help others fundamentally different from a desire to eat pizza?
You seem to be defining a broken version of the zero ethical system that arbitrarily disregards the former. That’s a strawman.
If you want to say that the zero ethical system is broken, you have to say that something breaks when people try to enact their desires, including the desires to help others.
Sorry, that’s incoherent. Someone is helped if they get things they desire. If your entire set of desires is to help others, then the solution is that your desires (such as eating pizza) don’t matter and theirs do. I don’t think you can really do that. If you can do that, then I hope that few people do that, since somebody has to actually want something for themselves in order for this concept of helping others to make any sense.
(I do believe that this morality-is-selfless statement probably lets you get positive regard from some in-group you desire. Apparently I don’t desire to have that in-group.)
I did intend to reply to you, but I can see I was ineffective. I’ll try harder.
Fundamentally, it’s not.
I’m saying that there are three versions here:
1) The strawman where there’s no desire to help others. This doesn’t describe people’s actual desires, but it is a self-consistent and coherent approach. It’s just that it wouldn’t work for most people.
2) Has a desire to help others, but this manifests in behaviour more compatible with guilt-aversion than with actually helping people. This is not self-consistent. If the aim is actually guilt-aversion, this collapses back to position 1), because the person must admit to themselves that other people’s desires are only a correlate of what they want (which is to not feel guilty).
3) Has a desire to help others, and pursues it in good faith, using some definition of which universes are preferable that does not weight their own desires over the desires of others. There’s self-reference here, because the person’s desires refer to other people’s desires. But you can still maximise the measure even with the self-reference.
But you do have other desires. You’ve got a desire for pizza, but you’ve also got a desire to help others. So if a 10% income sacrifice meant you get 10% less pizza, but someone else gets 300% more pizza, maybe that works out. But you don’t give up 100% of your income and live in a sack-cloth.
Thanks, I think I understand better. We have some progress here:
We agree that the naive model of a selfish person who doesn’t have any interest in helping others hardly ever describes real people.
We seem to agree that guilt-aversion as a desire doesn’t make sense, but maybe for different reasons.
I think it doesn’t make sense because when I say someone desires X, I mean that they prefer worlds with property X over worlds lacking that property, and I’m only interested in X’s that describe the part of the world outside of their own thought process. For the purposes of figuring out what someone desires, I don’t care if they want it because of guilt aversion or because they’re hungry or some other motive; all I care about is that I expect them to make some effort to make it happen, given the opportunity, and taking into account their (perhaps false) model of how the world works.
Maybe I do agree with you enough on this that the difference is unimportant. You said:
I think you’re assuming here that people who claim a desire to help people and are really motivated by guilt-aversion are ineffective. I’m not sure that’s always true. Certainly, if they’re ineffective at helping people due to their own internal process, in practice they don’t really want to help people.
I don’t know what it means to “weight their own desires over the desires of others”. If I’m willing to donate a kidney but not donate my only liver, and the potential liver recipient desires to have a better liver, have I weighted my own desires over the desires of others? Maybe you meant “weight their own desires to the exclusion of the desires of others”.
We might disagree about what it means to help others. Personally, I don’t care much about what people want. For example, I have a friend who is alcoholic. He desires alcohol. I care about him and have provided him with room and board in the past when he needed it, but I don’t want him to get alcohol. So my compassion for him is about me wanting to move the world (including him) to the place I want it to go, not some compromise between my desires and his desires.
So I want what I want, and my actions are based on what I want. Some of the things I want give other people some of the things they want. Should it be some other way?
Now a Friendly AI is different. When we’re setting up its utility function, it has no built-in desires of its own, so the only reasonable thing for it to desire is some average of the desires of whoever it’s being Friendly toward. But you and I are human, so we’re not like that—we come into this with our own desires. Let’s not confuse the two and try to act like a machine.
Yes, I think we’re converging onto the interesting disagreements.
This is largely an empirical point, but I think we differ on it substantially.
I think if people don’t think analytically, and even a little ruthlessly, they’re very ineffective at helping people. The list of failure modes is long. People prefer to help people they can see at the expense of those out of sight who could be helped more cheaply. They’re irrationally intolerant of uncertainty of outcome. They’re not properly sensitive to scale. I haven’t cited these points, but hopefully you agree. If not we can dig a little deeper into them.
I just meant that self-utility doesn’t get a huge multiplier when compared against others-utility. In the transplant donation example, you get just as much out of your liver as whoever you might give it to. So you’d be going down N utilons and they’d be going up N utilons, and there would be a substantial transaction cost of M utilons. So liver donation wouldn’t be a useful thing to do.
In another example, imagine your organs could save, say, 10 lives. I wouldn’t do that. There are two angles here.
The first is about strategy. You don’t improve the world by being a sucker who can be taken advantage of. You do have to fight your corner, too, otherwise you just promote free-riding. If all the do-gooders get organ harvested, the world is probably not better off.
But even if extremes of altruism were not anti-strategic, I can’t say I’d do them either. There are lots of actions which I would have to admit result in extreme loss of self-utility and extreme gain in net utility that I don’t carry out. These actions are still moral, it’s just that they’re more than I’m willing to do. Some people are excessively uncomfortable about this, and so give up on the idea of trying to be more moral altogether. This is to make the perfect the enemy of the good. Others are uncomfortable about it and try to twist their definition of morality into knots to conform to what they’re willing to do.
The moral ideal is to have a self-utility weight of 1.0: i.e., you’re completely impartial to whether the utility is going to you as opposed to someone else. I don’t achieve this, and I don’t expect many other people do either.
But being able to set this selfishness constant isn’t a get-out-of-jail-free card. I have to think about the equation, and how selfish the action would imply I really am. For instance, as an empirical point, I believe that eating meat given the current practices of animal husbandry demands a very high selfishness constant. I can’t reconcile being that selfish with my self-image, and my self-image is more important to me than eating meat. So, vegetarianism, with an attempt to minimise dairy consumption, but not strict veganism, even though veganism is more moral.
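The arithmetic I’m gesturing at can be made concrete. Here’s a toy sketch (the numbers and the `weighted_value` helper are my own illustration, not anything from the discussion) of how a selfishness constant interacts with the examples above:

```python
# Toy sketch: an action's value when your own utility change is scaled by a
# "selfishness constant" w. w = 1.0 means your utility counts no more than
# anyone else's; higher w means more selfish.

def weighted_value(self_delta, others_delta, w):
    """Weighted net value of an action for an agent with selfishness w."""
    return w * self_delta + others_delta

# Liver-donation example from earlier: you lose N utilons, the recipient
# gains N, and the transaction costs M. Even a perfectly impartial agent
# (w = 1.0) comes out behind by M, so the donation isn't worth doing.
N, M = 10.0, 3.0
print(weighted_value(-N, N - M, 1.0))   # -3.0

# The pizza example: give up 10% of your income (-1 unit to you) so someone
# else gets 300% more pizza (+3 units to them). That's positive even for an
# agent twice as selfish as the ideal.
print(weighted_value(-1.0, 3.0, 2.0))   # 1.0
```

The point of the sketch is just that the selfishness constant is a single dial: the liver transfer fails even at the moral ideal because of the transaction cost, while the income sacrifice survives a fair amount of selfishness.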
Yes, there are problems with preference utilitarianism. I think some people try to get around the alcoholic example by saying something like “if their desires weren’t being modified by their alcoholism they would want x, and would want you to act as though they wanted x, so those are the true preferences.” As I write this it seems that has to be some kind of strawman, as the idea of some Platonic “true preferences” is quite visibly flawed. There’s no way to distinguish the class of preference-modifiers that includes things like alcoholism from the other preference-modifiers that together constitute a person.
I use preferences because it works well enough most of the time, and I don’t have a good alternate formulation. I don’t actually think the specifics of the metric being maximised are usually that important. I think it would be better to agree on desiderata for the measure—properties that it ought to exhibit.
Anyway. What I’m trying to say is a little clearer to me now. I don’t think the key idea is really about meta-ethics at all. The idea is just that almost everyone follows a biased, heuristic-based strategy for satisfying their moral desires, and that this strategy isn’t actually very productive. It satisfies heuristics like “I am feeling guilty, which means I need to help someone now”, but it doesn’t do a very good job of scratching the deeper itch to believe you genuinely make a difference.
So the idea is just that morality is another area where many people would benefit from deploying rationality. But this one’s counter-intuitive, because it takes a rather cold and ruthless mindset to carry it through.
Okay, I agree that what you want to do works most of the time, and we seem to agree that you don’t have a good solution to the alcoholism problem, and we also seem to agree that acting from a mishmash of heuristics, without any reflection or attempt to make a rational whole, will very likely flounder around uselessly.
Not to imply that our conversation was muddled by the following, but: we can reformulate the alcoholism problem to eliminate the addiction. Suppose my friend heard about that reality show guy who was killed by a stingray and wanted to spend his free time killing stingrays to get revenge. (I heard there are such people, but I have never met one.) I wouldn’t want to help him with that, either.
There’s a strip of an incredibly over-the-top vulgar comic called Space Moose that gets at the same idea. These acts of kindness aren’t positive utility, even if the utility metric is based on desires, because they conflict with the desires of the stingrays or other victims. Preferences also need to be weighted somehow in preference utilitarianism, I suppose by importance to the person. But then, hmm, anyone gets to be a utility monster by just really, really wanting to kill the stingrays. So yeah, there’s a problem there.
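To make the utility-monster failure mode concrete, here’s a minimal sketch (the `aggregate` function and all the numbers are my own illustration) of what goes wrong when preferences are weighted by self-reported intensity:

```python
# Sketch of the utility-monster worry: sum intensity-weighted preferences
# over an outcome, where stance is +1 (for) or -1 (against). If intensity
# is self-reported, one agent can dominate arbitrarily many others.

def aggregate(preferences):
    """Intensity-weighted sum of (intensity, stance) preferences."""
    return sum(intensity * stance for intensity, stance in preferences)

# 1000 stingrays, each mildly against being killed...
stingrays = [(1.0, -1) for _ in range(1000)]
# ...versus one avenger whose declared intensity is cranked high enough.
avenger = [(1500.0, +1)]

print(aggregate(stingrays + avenger))   # 500.0: the monster wins
```

Nothing in the aggregation rule itself caps the avenger’s declared intensity, which is exactly the hole: “really really really wanting” something is a free multiplier.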
I think I need to update, and abandon preference utilitarianism even as a useful correlate of whatever the right measure would be.
While it’s gratifying to win an argument, I’d rather not do it under false pretenses:
We need a solution to the utility monster problem if we’re going to have a Friendly AI that cares about people’s desires, so it’s better to solve the problem than to give up on preference utilitarianism just because it’s unsolved. I’ve sketched proposed solutions to two types of utility monster: one where a single entity has enormous utility, and one where a large number of entities each have modest utility. If these putative solutions seem wrong to you, please post bugs, fixes, or alternatives as replies to those comments.
I agree that preference utilitarianism has the problem that it doesn’t free you from choosing how to weight the preferences. It also has the problem that you have to separate yourself into two parts, the part that gets to have its preference included in the weighted sum, and the part that has a preference that is the weighted sum. In reality there’s only one of you, so that distinction is artificial.
Why distinguish between moral and non-moral preferences? Why are moral preferences more mutable than non-moral ones?
Also, a lot of this applies to your specific situation, so it is more morality than metaethics.
The basic drive to adopt some sort of ethical system is essentially the same as other preferences, and is non-mutable. It’s a preference to believe that you are making the world a better place, rather than a worse place. This introduces a definitional question of what constitutes a good world and what constitutes a bad world, which is something I think people can change their minds about.
Having written that, one question that occurs to me now is, is the basic preference to believe that you’re making the world a better place, or is it to simply believe you’re a good person? I prefer people who make the world a better place, so the two produce the same outcomes for me. But other people might not. If you instead had a preference for people who followed good principles or exhibited certain virtues, you wouldn’t feel it necessary to make the world a better place. I shouldn’t assume that such people don’t exist.
So maybe instead of talking about adopting a definition of which universes are good and bad, I should talk about adopting a definition of good and bad people. If you define a good person by the consequences of their actions, then you’d go on to define which universes are good and bad. But otherwise you might instead define which principles are good, or which virtues.