I’ve read the metaethics sequence twice and am still unclear on what the basic points it’s trying to get across are. (I read it and get to the end and wonder where the “there” is there. What I got from it is “our morality is what we evolved, and humans are all we have therefore it is fundamentally good and therefore it deserves to control the entire future”, which sounds silly when I put it like that.) Would anyone dare summarise it?
Morality is good because goals like joy and beauty are good. (For qualifications, see Appendices A through OmegaOne.) This seems like a tautology, meaning that if we figure out the definition of morality it will contain a list of “good” goals like those. We evolved to care about goodness because of events that could easily have turned out differently, in which case “we” would care about some other list. But, and here it gets tricky, our Good function says we shouldn’t care about that other list. The function does not recognize evolutionary causes as reason to care. In fact, it does not contain any representation of itself. This is a feature. We want the future to contain joy, beauty, etc, not just ‘whatever humans want at the time,’ because an AI or similar genie could and probably would change what we want if we told it to produce the latter.
Okay, now this definitely sounds like standard moral relativism to me. It’s just got the caveat that obviously we endorse our own version of morality, and that’s the ground on which we make our moral judgements. Which is known as appraiser relativism.
I must confess I do not understand what you just said at all. Specifically:
the second sentence: could you please expand on that?
I think I get that the function does not evaluate itself at all, and if you ask it just says “it’s just good ’cos it is, all right?”
Why is this a feature? (I suspect the password is “Löb’s theorem”, and only almost understand why.)
The last bit appears to be what I meant by “therefore it deserves to control the entire future.” It strikes me as insufficient reason to conclude that this can in no way be improved, ever.
Does the sequence show a map of how to build metamorality from the ground up, much as writing the friendly AI will need to work from the ground up?
the second sentence: could you please expand on that?
I’ll try: any claim that a fundamental/terminal moral goal ‘is good’ reduces to a tautology on this view, because “good” doesn’t have anything to it besides these goals. The speaker’s definition of goodness makes every true claim of this kind true by definition. (Though the more practical statements involve inference. I started to say it must be all logical inference, realized EY could not possibly have said that, and confirmed that in fact he did not.)
I get that the function does not evaluate itself at all,
Though technically it may see the act of caring about goodness as good. So I have to qualify what I said before that way.
Why is this a feature?
Because if the function could look at the mechanical, causal steps it takes, and declare them perfectly reliable, it would lead to a flat self-contradiction by Löb’s theorem. The other way looks like a contradiction but isn’t. (We think.)
Though technically it may see the act of caring about goodness as good. So I have to qualify what I said before that way.
Ooh yeah, didn’t spot that one. (As someone who spent a lot of time when younger thinking about this and trying to be a good person, I certainly should have spotted this.)
This comment by Richard Chappell clearly and concisely explained Eliezer’s metaethical views. It was very highly upvoted, so apparently the collective wisdom of the community considered it accurate. It didn’t receive an explicit endorsement by Eliezer, though.
(namely, whatever terminal values the speaker happens to hold, on some appropriate [if somewhat mysterious] idealization).
(i) ‘Right’ means, roughly, ‘promotes external goods X, Y and Z’
(ii) claim i above is true because I desire X, Y, and Z.
People really think EY is saying this? It looks to me like a basic Egoist stance, where “your values” also include your moral preferences. That is my position, but I don’t think EY is on board.
“Shut up and multiply” implies a symmetry in value between different people that isn’t implied by the above. Similarly, the diversion into mathematical idealization seemed like a maneuver toward Objective Morality—One Algorithm to Bind Them, One Algorithm to Rule them All. Everyone gets their own algorithm as the standard of right and wrong? Fantastic, if it were true, but that’s not how I read EY.
It’s strange, because Richard seems to say that EY agrees with me, while I think EY agrees with him.
I think you are mixing up object-level ethics and metaethics here. You seem to be contrasting an Egoist position (“everyone should do what they want”) with an impersonal utilitarian one (“everyone should do what is good for everyone, shutting up and multiplying”). But the dispute is about what “should”, “right” and related words mean, not about what should be done.
Eliezer (in Richard’s interpretation) says that when someone says “Action A is right” (or “should be done”), the meaning of this is roughly “A promotes ultimate goals XYZ”. Here XYZ is in fact the outcome of a complicated computation based on the speaker’s state of mind, which can be translated roughly as “the speaker’s terminal values” (for example, for a sincere philanthropist XYZ might be “everyone gets joy, happiness, freedom, etc”). But the fact that XYZ are the speaker’s terminal values is not part of the meaning of “right”, so it is not inconsistent for someone to say “Everyone should promote XYZ, even if they don’t want it” (e.g. “Babyeaters should not eat babies”). And needless to say, XYZ might include generalized utilitarian values like “everyone gets their preferences satisfied”, in which case impersonal, shut-up-and-multiply utilitarianism is what is needed to make actual decisions for concrete cases.
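To make the distinction concrete, here is a rough Python sketch of the reading described above. Everything in it is an illustrative assumption rather than anything from the sequence: the names `make_right` and `wanted_by`, the idea of modelling an action as the set of goals it promotes, and the toy value sets. The point is only that the goals are fixed once from the speaker, and after that "right" never consults anyone's wants, including the speaker's.

```python
def make_right(speaker_values):
    """Fix the goals XYZ once, from the speaker; the result never looks them up again."""
    xyz = frozenset(speaker_values)

    def right(action_promotes):
        # True if the action promotes at least one of the fixed goals.
        return bool(xyz & set(action_promotes))

    return right


def wanted_by(agent_values, action_promotes):
    """Contrast case: 'right' read as 'whatever this agent happens to want'."""
    return bool(set(agent_values) & set(action_promotes))


# Toy value sets, invented for illustration only.
human_values = {"joy", "freedom"}
babyeater_values = {"eating_babies"}

right = make_right(human_values)
eat_babies = {"eating_babies"}

print(right(eat_babies))                        # judged against the fixed goals
print(wanted_by(babyeater_values, eat_babies))  # judged against the agent's own wants
```

On this toy model, "Babyeaters should not eat babies" is just `right(eat_babies)` coming out false, whatever `wanted_by` says about the Babyeaters themselves.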
But the dispute is about what “should”, “right” and related words mean, not about what should be done.
Of course it’s about both. You can define labels in any way you like. In the end, your definition better be useful for communicating concepts with other people, or it’s not a good definition.
Let’s define “yummy”. I put food in my mouth. Taste buds fire, neural impulses propagate from neuron to neuron, and eventually my mind evaluates how yummy it is. Similar events happen for you. Your taste buds fire, your neural impulses propagate, and your mind evaluates how yummy it is. Your taste buds are not mine, and your neural networks are not mine, so your response and my response are not identical. If I make a definition of “yummy” that entails that what you find yummy is not in fact yummy, I’ve created a definition that is useless for dealing with the reality of what you find yummy.
From my inside view of yummy, of course you’re just wrong if you think root beer isn’t yummy—I taste root beer, and it is yummy. But being a conceptual creature, I have more than the inside view, I have an outside view as well, of you, and him, and her, and ultimately of me too. So when I talk about yummy with other people, I recognize that their inside view is not identical to mine, and so use a definition based on the outside view, so that we can actually be talking about the same thing, instead of throwing our differing inside views at each other.
Discussion with the inside view: “Let’s get root beer.” “What? Root beer sucks!” “Root beer is yummy!” “Is not!” “Is too!”
Discussion with the outside view: “Let’s get root beer.” “What? Root beer sucks!” “You don’t find root beer yummy?” “No. Blech.” “OK, I’m getting a root beer.” “And I pick pepsi.”
If you’ve tied yourself up in conceptual knots, and concluded that root beer really isn’t yummy for me, even though my yummy detector fires whenever I have root beer, you’re just confused and not talking about reality.
But the fact that XYZ are the speaker’s terminal values is not part of the meaning of “right”
This is the problem. You’ve divorced your definition from the relevant part of reality, namely the speaker’s terminal values, and somehow twisted it around to where what he *should* do is at odds with his terminal values. This definition is not useful for discussing moral issues with the given speaker. He’s a machine that maximizes his terminal values. If his algorithms are functioning properly, he’ll disregard your definition as irrelevant to achieving his ends. Whether from the inside view of morality for that speaker, or his outside view, you’re just wrong. And you’re also wrong from any outside view that accurately models what terminal values people actually have.
Rational discussions of morality start with the observation that people have differing terminal values. Our terminal values are our ultimate biases. Recognizing that my biases are mine, and not identical to yours, is the first step away from the usual useless babble in moral philosophy.
Shifting the lump under the rug but not getting rid of it is how it looks to me too. But I don’t understand the rest of that comment and will need to think harder about it (when I’m less sleep-deprived).
I note that that’s the comment Lukeprog flagged as his favourite answer, but of course I can’t tell if it got the upvotes before or after he did so.
Something is green if it emits or scatters much more light between 520 and 570 nm than between 400 and 520 nm or between 570 and 700 nm. That’s what green means, and it also applies to places where there are no humans: it still makes sense to ask whether the skin of tyrannosaurs was green even though there were no humans back then. On the other hand, the reason why we find the concept of ‘something which emits or scatters much more light between 520 and 570 nm than between 400 and 520 nm or between 570 and 700 nm’ important enough to have a word (green) for it is that for evolutionary reasons we have cone cells which work in those ranges; if we saw in the ultraviolet, we might have a word, say breen, for ‘something which emits or scatters much more light between 260 and 285 nm than between 200 and 260 nm or between 285 and 350 nm’. This doesn’t mean that greenness is relative, though.
Likewise, something is good if it leads to sentient beings living, to people being happy, to individuals having the freedom to control their own lives, to minds exploring new territory instead of falling into infinite loops, to the universe having a richness and complexity to it that goes beyond pebble heaps, etc. That’s what good means, and it also applies to places where there are no humans: it still makes sense to ask whether it’s good for Babyeaters to eat their children even though there are no humans on that planet. On the other hand, the reason why we find the concept of ‘something which leads to sentient beings living, to people being happy, to individuals having the freedom to control their own lives, to minds exploring new territory instead of falling into infinite loops, to the universe having a richness and complexity to it that goes beyond pebble heaps, etc.’ important enough to have a word (good) for it is that for evolutionary reasons we value such things; if we valued heaps composed of prime numbers of pebbles, we might have a word, say pood, for ‘something which leads to lots of heaps with a prime number of pebbles in each’. This doesn’t mean that goodness is relative, though.
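The green half of the analogy can be written down as a predicate, which makes it easy to see that no observer appears anywhere in the definition. This is only a sketch under assumptions of my own: a spectrum is modelled as a dict from wavelength in nm to relative intensity, "much more" is given a made-up threshold of 2x (the comment gives no number), and the sample spectra are invented.

```python
# Hypothetical threshold for "much more"; the comment does not give a number.
MUCH_MORE = 2.0


def band_power(spectrum, lo_nm, hi_nm):
    """Sum the intensity of samples with lo_nm <= wavelength < hi_nm."""
    return sum(p for wl, p in spectrum.items() if lo_nm <= wl < hi_nm)


def is_green(spectrum):
    """True if the 520-570 nm band dominates both neighbouring visible bands."""
    mid = band_power(spectrum, 520, 570)
    return (mid > MUCH_MORE * band_power(spectrum, 400, 520)
            and mid > MUCH_MORE * band_power(spectrum, 570, 700))


# Made-up spectra (wavelength in nm -> relative intensity).
leaf = {450: 1.0, 540: 9.0, 650: 2.0}
brick = {450: 1.0, 540: 2.0, 650: 9.0}

print(is_green(leaf), is_green(brick))
```

The band edges are there because of human cone cells, but `is_green` takes only a spectrum as input, so it can be asked about tyrannosaur skin just as well. A `is_breen` for ultraviolet seers would be a different function, not a different answer from the same one.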
One part of it that did turn out well, in my opinion, is Probability is Objectively Subjective and related posts. Eliezer’s metaethical theory is, unless I’m mistaken, an effort to do for naive moral intuitions what Bayesianism should do for naive probabilistic intuitions.
I’m pretty sure Eliezer is actually wrong about whether he’s a meta-ethical relativist, mainly because he’s using words in a slightly different way from the way they use them. Or rather, he thinks that MER is using one specific word in a way that isn’t really kosher. (A statement which I think he’s basically correct about, but it’s a purely semantic quibble and so a stupid thing to argue about.)
Basically, Eliezer is arguing that when he says something is “good” that’s a factual claim with factual content. And he’s right; he means something specific-although-hard-to-compute by that sentence. And similarly, when I say something is “good” that’s another factual claim with factual content, whose truth is at least in theory computable.
But importantly, when Eliezer says something is “good” he doesn’t mean quite the same thing I mean when I say something is “good.” We actually speak slightly different languages in which the word “good” has slightly different meanings. Meta-Ethical Relativism, at least as summarized by Wikipedia, describes this fact with the sentence “terms such as “good,” “bad,” “right” and “wrong” do not stand subject to universal truth conditions at all.” Eliezer doesn’t like that because in each speaker’s language, terms like “good” stand subject to universal truth conditions. But each speaker speaks a slightly different language, in which the word represented by the string “good” stands subject to a slightly different set of universal truth conditions.
For an analogy: I apparently consistently define “blonde” differently from almost everyone I know. But it has an actual definition. When I call someone “blonde” I know what I mean, and people who know me well know what I mean. But it’s a different thing from what almost everyone else means when they say “blonde.” (I don’t know why I can’t fix this; I think my color perception is kinda screwed up). An MER guy would say that whether someone is “blonde” isn’t objectively true or false because what it means varies from speaker to speaker. Eliezer would say that “blonde” has a meaning in my language and a different meaning in my friends’ language, but in either language whether a person is “blonde” is in fact an objective fact.
And, you know, he’s right. But we’re not very good at discussing phenomena where two different people speak the same language except one or two words have different meanings; it’s actually a thing that’s hard to talk about. So in practice, “‘good’ doesn’t have an objective definition” conveys my meaning more accurately to the average listener than “‘good’ has one objective meaning in my language and a different objective meaning in your language.”
But importantly, when Eliezer says something is “good” he doesn’t mean quite the same thing I mean when I say something is “good.” We actually speak slightly different languages in which the word “good” has slightly different meaning
In http://lesswrong.com/lw/t0/abstracted_idealized_dynamics/mgr, user steven wrote “When X (an agent) judges that Y (another agent) should Z (take some action, make some decision), X is judging that Z is the solution to the problem W (perhaps increasing a world’s measure under some optimization criterion), where W is a rigid designator for the problem structure implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments. (Or at least X is asserting that it’s shared.) Due to the nature of W, becoming informed will cause X and Y to get closer to the solution of W, but wanting-it-when-informed is not what makes that solution moral.” with which Eliezer agreed.
This means that, even though people might presently have different things in mind when they say something is “good”, Eliezer does not regard their/our/his present ideas as either the meaning of their-form-of-good or his-form-of-good. The meaning of good is not “the things someone/anyone personally, presently finds morally compelling”, but something like “the fixed facts that are found but not defined by clarifying the result of applying the shared human evaluative cognitive machinery to a wide variety of situations under reflectively ideal conditions of information.” That is to say, Eliezer thinks, not only that moral questions are well defined, “objective”, in a realist or cognitivist way, but that our present explicit-moralities all have a single, fixed, external referent which is constructively revealed via the moral computations that weigh our many criteria.
I haven’t finished reading CEV, but here’s a quote from Levels of Organization that seems relevant: “The target matter of Artificial Intelligence is not the surface variation that makes one human slightly smarter than another human, but rather the vast store of complexity that separates a human from an amoeba”. Similarly, the target matter of inferences that figure out the content of morality is not the surface variation of moral intuitions and beliefs under partial information which result in moral disagreements, but the vast store of neural complexity that allows humans to disagree at all, rather than merely be asking different questions.
So the meaning of presently-acted-upon-and-explicitly-stated-rightness in your language, and the meaning of it in my language might be different, but one of the many points of the meta-ethics sequence is that the expanded-enlightened-mature-unfolding of those present usages gives us a single, shared, expanded-meaning in both our languages.
If you still think that moral relativism is a good way to convey that in daily language, fine. It seems the most charitable way in which he could be interpreted as a relativist is if “good” is always in quotes, to denote the present meaning a person attaches to the word. He is a “moral” relativist, and a moral realist/cognitivist/constructivist.
Hm, that sounds plausible, especially your last paragraph. I think my problem is that I don’t see any reason to suspect that the expanded-enlightened-mature-unfolding of our present usages will converge in the way Eliezer wants to use as a definition. See for instance the “repugnant conclusion” debate; people like Peter Singer and Robin Hanson think the repugnant conclusion actually sounds pretty awesome, while Derek Parfit thinks it’s basically a reductio on aggregate utilitarianism as a philosophy and I’m pretty sure Eliezer agrees with him, and has more or less explicitly identified it as a failure mode of AI development. I doubt these are beliefs that really converge with more information and reflection.
Or in steven’s formulation, I suspect that relatively few agents actually have Ws in common; his definition presupposes that there’s a problem structure “implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments”. I’m arguing that many agents have sufficiently different implicit problem structures that, for instance, by that definition Eliezer and Robin Hanson can’t really make “should” statements to each other.
Just getting citations out of the way, Eliezer talked about the repugnant conclusion here and here. He argues for shared W in Psychological Unity and Moral Disagreement. Kaj Sotala wrote a notable reply to Psychological Unity, Psychological Diversity. Finally Coherent Extrapolated Volition is all about finding a way to unfold present-explicit-moralities into that shared-should that he believes in, so I’d expect to see some arguments there.
Now, doesn’t the state of the world today suggest that human explicit-moralities are close enough that we can live together in a Hubble volume without too many wars, without a thousand broken coalitions of support over sides of irreconcilable differences, without blowing ourselves up because the universe would be better with no life than with the evil monsters in that tribe on the other side of the river?
Human concepts are similar enough that we can talk to each other. Human aesthetics are similar enough that there’s a billion dollar video game industry. Human emotions are similar enough that Macbeth is still being produced four hundred years later on the other side of the globe. We have the same anatomical and functional regions in our brains. Parents everywhere use baby talk. On all six populated continents there are countries in which more than half of the population identifies with the Christian religions.
For all those similarities, is humanity really going to be split over the Repugnant Conclusion? Even if the Repugnant Conclusion is more of a challenge than muscling past a few inductive biases (scope insensitivity and the attribute substitution heuristic are also universal), I think we have some decent prospect for a future in which you don’t have to kill me. Whatever will help us to get to that future, that’s what I’m looking for when I say “right”. No matter how small our shared values are once we’ve felt the weight of relevant moral arguments, that’s what we need to find.
This comment may be a little scattered; I apologize. (In particular, much of this discussion is beside the point of my original claim that Eliezer really is a meta-ethical relativist, about which see my last paragraph).
I certainly don’t think we have to escalate to violence. But I do think there are subjects on which we might never come to agreement even given arbitrary time and self-improvement and processing power. Some of these are minor judgments; some are more important. But they’re very real.
In a number of places Eliezer commented that he’s not too worried about, say, two systems morality_1 and morality_2 that differ in the third decimal place. I think it’s actually really interesting when they differ in the third decimal place; it’s probably not important to the project of designing an AI but I don’t find that project terribly interesting so that doesn’t bother me.
But I’m also more willing to say to someone, “We have nothing to argue about [on this subject], we are only different optimization processes.” With most of my friends I really do have to say this, as far as I can tell, on at least one subject.
However, I really truly don’t think this is as all-or-nothing as you or Eliezer seem to paint it. First, because while morality may be a compact algorithm relative to its output, it can still be pretty big, and disagreeing seriously about one component doesn’t mean you don’t agree about the other several hundred. (A big sticking point between me and my friends is that I think getting angry is in general deeply morally blameworthy, whereas many of them believe that failing to get angry at outrageous things is morally blameworthy; and as far as I can tell this is more or less irreducible in the specification for all of us). But I can still talk to these people and have rewarding conversations on other subjects.
Second, because I realize there are other means of persuasion than argument. You can’t argue someone into changing their terminal values, but you can often persuade them to do so through literature and emotional appeal, largely due to psychological unity. I claim that this is one of the important roles that story-telling plays: it focuses and unifies our moralities through more-or-less arational means. But this isn’t an argument per se and has no particular reason one would expect it to converge to a particular outcome—among other things, the result is highly contingent on what talented artists happen to believe. (See Rorty’s Contingency, Irony, and Solidarity for discussion of this).
Humans have a lot of psychological similarity. They also have some very interesting and deep psychological variation (see e.g. Haidt’s work on the five moral systems). And it’s actually useful to a lot of societies to have variation in moral systems—it’s really useful to have some altruistic punishers, but not really for everyone to be an altruistic punisher.
But really, this is beside the point of the original question, whether Eliezer is really a meta-ethical relativist, because the limit of this sequence which he claims converges isn’t what anyone else is talking about when they say “morality”. Because generally, “morality” is defined more or less to be a consideration that would/should be compelling to all sufficiently complex optimization processes. Eliezer clearly doesn’t believe any such thing exists. And he’s right.
A big sticking point between me and my friends is that I think getting angry is in general deeply morally blameworthy, whereas many of them believe that failing to get angry at outrageous things is morally blameworthy
Your friends can understand why humans have positive personality descriptors for people who don’t get angry in various situations: descriptors like reflective, charming, polite, solemn, respecting, humble, tranquil, agreeable, open-minded, approachable, cooperative, curious, hospitable, sensitive, sympathetic, trusting, merciful, gracious.
You can understand why we have positive personality descriptors for people who get angry in various situations: descriptors like impartial, loyal, decent, passionate, courageous, boldness, leadership, strength, resilience, candor, vigilance, independence, reputation, and dignity.
Both you and your friends can see how either group could pattern match their behavioral bias as being friendly, supportive, mature, disciplined, or prudent.
These are not deep variations, they are relative strengths of reliance on the exact same intuitions.
You can’t argue someone into changing their terminal values, but you can often persuade them to do so through literature and emotional appeal, largely due to psychological unity. I claim that this is one of the important roles that story-telling plays: it focuses and unifies our moralities through more-or-less arational means. But this isn’t an argument per se and has no particular reason one would expect it to converge to a particular outcome—among other things, the result is highly contingent on what talented artists happen to believe.
Stories strengthen our associations of different emotions in response to analogous situations, which doesn’t have much of a converging effect (Edit: unless, you know, it’s something like the bible that a billion people read. That certainly pushes humanity in some direction), but they can also create associations to moral evaluative machinery that previously wasn’t doing its job. There’s nothing arational about this: neurons firing in the inferior frontal gyrus are evidence relevant to a certain useful categorizing inference, “things which are sentient”.
Because generally, “morality” is defined more or less to be a consideration that would/should be compelling to all sufficiently complex optimization processes
I’m not in a mood to argue definitions, but “optimization process” is a very new concept, so I’d lean toward “less”.
You’re...very certain of what I understand. And of the implications of that understanding.
More generally, you’re correct that people don’t have a lot of direct access to their moral intuitions. But I don’t actually see any evidence for the proposition they should converge sufficiently other than a lot of handwaving about the fundamental psychological similarity of humankind, which is more-or-less true but probably not true enough. In contrast, I’ve seen lots of people with deeply, radically separated moral beliefs, enough so that it seems implausible that these all are attributable to computational error.
I’m not disputing that we share a lot of mental circuitry, or that we can basically understand each other. But we can understand without agreeing, and be similar without being the same.
As for the last bit—I don’t want to argue definitions either. It’s a stupid pastime. But to the extent Eliezer claims not to be a meta-ethical relativist he’s doing it purely through a definitional argument.
He does intend to convey something real and nontrivial (well, some people might find it trivial, but enough people don’t that it is important to be explicit) by saying that he is not a meta-ethical relativist. The basic idea is that, while his brain is the causal reason for him wanting to do certain things, it is not referenced in the abstract computation that defines what is right. To use a metaphor from the meta-ethics sequence, it is a fact about a calculator that it is computing 1234 * 5678, but the fact that 1234 * 5678 = 7 006 652 is not a fact about that calculator.
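The calculator metaphor can be spelled out in a few lines of Python. This is a toy illustration of my own, not code from the sequence; the class `Calculator` and the function `product` are hypothetical names. Which product a device is set up to compute is a fact about that device, while the value of the product is defined without mentioning any device.

```python
class Calculator:
    """A physical device; which product it is set to compute is a fact about it."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def run(self):
        return self.a * self.b


def product(a, b):
    """The abstract computation: it takes no calculator as input."""
    return a * b


calc = Calculator(1234, 5678)
other = Calculator(2, 2)  # a device pointed at a different question

print(calc.run() == product(1234, 5678))   # this device tracks the abstract fact
print(other.run() == product(1234, 5678))  # this one is answering something else
```

Swapping `calc` for `other` changes which question gets asked, not the answer to 1234 * 5678. On the reading above, a brain stands to "what is right" roughly as `calc` stands to `product`.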
This distinguishes him from some types of relativism, which I would guess to be the most common types. I am unsure whether people understand that he is trying to draw this distinction and still think that it is misleading to say that he is not a moral relativist or whether people are confused/have a different explanation for why he does not identify as a relativist.
In contrast, I’ve seen lots of people with deeply, radically separated moral beliefs, enough so that it seems implausible that these all are attributable to computational error.
The claim wasn’t that it happens too often to attribute to computational error, but that the types of differences seem unlikely to stem from computational errors.
The problem is, EY may just be contradicting himself, or he may be being ambiguous, and even deliberately so.
“what is right is a huge computational property—an abstract computation—not tied to the state of anyone’s brain, including your own brain.”
I think his views could be clarified in a moment if he stated clearly whether this abstract computation is identical for everyone. Is it AC_219387209 for all of us, or AC_42398732 for you, and AC_23479843 for me, with the proviso that it might be the case that AC_42398732 = AC_23479843?
Your quote makes it appear the former. Other quotes in this thread about a “shared W” point to that as well.
Then again, quotes in the same article make it appear the latter, as in:
If you hoped that morality would be universalizable—sorry, that one I really can’t give back. Well, unless we’re just talking about humans. Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence;
We’re all busy playing EY Exegesis. Doesn’t that strike anyone else as peculiar? He’s not dead. He’s on the list. And he knows enough about communication and conceptualization to have been clear in the first place. And yet on such a basic point, what he writes seems to go round and round and we’re not clear what the answer is. And this, after years of opportunity for clarification.
It brings to mind Quirrell:
“But if your question is why I told them that, Mr. Potter, the answer is that you will find ambiguity a great ally on your road to power. Give a sign of Slytherin on one day, and contradict it with a sign of Gryffindor the next; and the Slytherins will be enabled to believe what they wish, while the Gryffindors argue themselves into supporting you as well. So long as there is uncertainty, people can believe whatever seems to be to their own advantage. And so long as you appear strong, so long as you appear to be winning, their instincts will tell them that their advantage lies with you. Walk always in the shadow, and light and darkness both will follow.”
If you’re trying to convince people of your morality, and they have already picked teams, there is an advantage in letting it appear to each that they haven’t really changed sides.
Ah, neat, you found exactly what it is. Although the LW version is a bit stronger, since it involves thoughts like “the cause of me thinking some things are moral does not come from interacting with some mysterious substance of moralness.”
I mean, I can accept “the answer is there is no answer” (just as there is no point to existence of itself, we’re just here and have to work out what to do for ourselves). It just seems rather a lot of text to get that across.
Well, just because there is no moral argument that will convince any possible intelligence doesn’t mean there’s nothing left to explore. For example, you might apply the “what words mean” posts to explore what people mean when they say “do the right thing,” and how to program that into an AI :P
All questions about the morality of actions can be restated as questions about the moral value of the states of the world that those actions give rise to.
All questions about the moral value of the states of the world can in principle be answered by evaluating those world-states in terms of the various things we’ve evolved to value, although actually performing that evaluation is difficult.
Questions about whether the moral value of states of the world should be evaluated in terms of the things we’ve evolved to value, as opposed to evaluated in terms of something else, can be answered by pointing out that the set of things we’ve evolved to value is what right means and is therefore definitionally the right set of things to use.
I consider that third point kind of silly, incidentally.
Well, it works OK if you give up on the idea that “right” has some other meaning, which he spent rather a long time in that sequence trying to convince people to give up on. So perhaps that’s the piece that failed to work.
I mean, once you get rid of that idea, then saying that “right” means the values we all happen to have (positing that there actually is some set of values X such that we all have X) is rather a lot like saying a meter is the distance light travels in 1/299,792,458 of a second… it’s arbitrary, sure, but it’s not unreasonable.
Personally, I would approach it from the other direction. “Maybe X is right, maybe it isn’t, maybe both, maybe neither. What does it matter? How would you ever tell? What is added to the discussion by talking about it? X is what we value; it would be absurd to optimize for anything else. We evaluate in terms of what we care about because we care about it; to talk about it being “right” or “not right,” insofar as those words don’t mean “what we value” and “what we don’t value”, adds nothing to the discussion.”
But saying that requires me to embrace a certain kind of pragmatism that is, er, socially problematic to be seen embracing.
Morality is a sense, similar to taste or vision. If I eat a food, I can react by going ‘yummy’ or ‘blech’. If I observe an action, I can react by going ‘good’ or ‘evil’.
Just like your other senses, it’s not 100% reliable. Kids eventually learn that while candy is ‘yummy’, eating nothing but candy is ‘blech’ - your first-order sensory data is being corrected by a higher-order understanding (whether this be “eating candy is nutritionally bad” or “I get a stomach ache on days I just eat candy”).
The above paragraph ties in with the idea of “The lens that sees its flaws”. We can’t build a model of “right and wrong” from scratch any more than we could build a sense of yumminess from scratch; you have to work with the actual sensory input you have. To return to the food analogy, a diet consisting of ostensibly ideal food, but which lacks ‘yumminess’, will fail because almost no one can actually keep to it. Equally, our morality has to be based in our actual gut reaction of ‘goodness’ - you can’t just define a mathematical model and expect people to follow it.
Finally, and most important to the idea of “CEV”, is the idea that, just as science leads us to a greater understanding of nutrition and what actually works for us, we can also work towards a scientific understanding of morality. As an example, while ‘revenge’ is a very emotionally-satisfying tactic, it’s not always an effective tactic; just like candy, it’s something that needs to be understood and used in moderation.
Part of growing up as a kid is learning to eat right. Part of growing up as a society is learning to moralize correctly :)
From Bury the Chains, the idea that slavery was wrong hit England as a surprise. Quakers and Evangelicals were opposed to slavery, but the general public went from oblivious to involved very quickly.
It can mean you value short-term reactions instead of long-term consequences. A better analogy would be flavor: candy tastes delicious, but its long-term consequences are undesirable. In this case, a flawed morality leads you to conclude that because something registers as ‘righteous’ (say, slaying all the unbelievers), you should go ahead and do it, without realizing the consequences (“because this made everyone hate us, we have even less ability to slay/convert future infidels”).
On another level, one can also realize that values conflict (“I really like the taste of soda, but it makes my stomach upset!”) → (“I really like killing heretics, but isn’t murder technically a sin?”)
Edit: There are obviously numerous other flaws that can occur (you might not notice that something is “evil” until you’ve done it and are feeling remorse, to try and more tightly parallel your example). This isn’t meant to be comprehensive :)
Does the sequence show a map of how to build metamorality from the ground up, much as writing the friendly AI will need to work from the ground up?
I’ll try: any claim that a fundamental/terminal moral goal ‘is good’ reduces to a tautology on this view, because “good” doesn’t have anything to it besides these goals. The speaker’s definition of goodness makes every true claim of this kind true by definition. (Though the more practical statements involve inference. I started to say it must be all logical inference, realized EY could not possibly have said that, and confirmed that in fact he did not.)
Though technically it may see the act of caring about goodness as good. So I have to qualify what I said before that way.
Because if the function could look at the mechanical, causal steps it takes, and declare them perfectly reliable, it would lead to a flat self-contradiction by Löb’s Theorem. The other way looks like a contradiction but isn’t. (We think.)
Thank you, this helps a lot.
Ooh yeah, didn’t spot that one. (As someone who spent a lot of time when younger thinking about this and trying to be a good person, I certainly should have spotted this.)
This comment by Richard Chappell explained clearly and concisely Eliezer’s metaethical views. It was very highly upvoted, so apparently the collective wisdom of the community considered it accurate. It didn’t receive an explicit endorsement by Eliezer, though.
From the comment by Richard Chappell:
People really think EY is saying this? It looks to me like a basic Egoist stance, where “your values” also include your moral preferences. That is my position, but I don’t think EY is on board.
“Shut up and multiply” implies a symmetry in value between different people that isn’t implied by the above. Similarly, the diversion into mathematical idealization seemed like a maneuver toward Objective Morality—One Algorithm to Bind Them, One Algorithm to Rule them All. Everyone gets their own algorithm as the standard of right and wrong? Fantastic, if it were true, but that’s not how I read EY.
It’s strange, because Richard seems to say that EY agrees with me, while I think EY agrees with him.
I think you are mixing up object-level ethics and metaethics here. You seem to be contrasting an Egoist position (“everyone should do what they want”) with an impersonal utilitarian one (“everyone should do what is good for everyone, shutting up and multiplying”). But the dispute is about what “should”, “right” and related words mean, not about what should be done.
Eliezer (in Richard’s interpretation) says that when someone says “Action A is right” (or “should be done”), the meaning of this is roughly “A promotes ultimate goals XYZ”. Here XYZ is in fact the outcome of a complicated computation based on the speaker’s state of mind, which can be translated roughly as “the speaker’s terminal values” (for example, for a sincere philanthropist XYZ might be “everyone gets joy, happiness, freedom, etc”). But the fact that XYZ are the speaker’s terminal values is not part of the meaning of “right”, so it is not inconsistent for someone to say “Everyone should promote XYZ, even if they don’t want it” (e.g. “Babyeaters should not eat babies”). And needless to say, XYZ might include generalized utilitarian values like “everyone gets their preferences satisfied”, in which case impersonal, shut-up-and-multiply utilitarianism is what is needed to make actual decisions for concrete cases.
Of course it’s about both. You can define labels in any way you like. In the end, your definition better be useful for communicating concepts with other people, or it’s not a good definition.
Let’s define “yummy”. I put food in my mouth. Taste buds fire, neural impulses propagate from neuron to neuron, and eventually my mind evaluates how yummy it is. Similar events happen for you. Your taste buds fire, your neural impulses propagate, and your mind evaluates how yummy it is. Your taste buds are not mine, and your neural networks are not mine, so your response and my response are not identical. If I make a definition of “yummy” that entails that what you find yummy is not in fact yummy, I’ve created a definition that is useless for dealing with the reality of what you find yummy.
From my inside view of yummy, of course you’re just wrong if you think root beer isn’t yummy—I taste root beer, and it is yummy. But being a conceptual creature, I have more than the inside view, I have an outside view as well, of you, and him, and her, and ultimately of me too. So when I talk about yummy with other people, I recognize that their inside view is not identical to mine, and so use a definition based on the outside view, so that we can actually be talking about the same thing, instead of throwing our differing inside views at each other.
Discussion with the inside view: “Let’s get root beer.” “What? Root beer sucks!” “Root beer is yummy!” “Is not!” “Is too!”
Discussion with the outside view: “Let’s get root beer.” “What? Root beer sucks!” “You don’t find root beer yummy?” “No. Blech.” “OK, I’m getting a root beer.” “And I pick pepsi.”
If you’ve tied yourself up in conceptual knots, and concluded that root beer really isn’t yummy for me, even though my yummy detector fires whenever I have root beer, you’re just confused and not talking about reality.
This is the problem. You’ve divorced your definition from the relevant part of reality, the speaker’s terminal values, and somehow twisted it around to where what he “should” do is at odds with his terminal values. This definition is not useful for discussing moral issues with the given speaker. He’s a machine that maximizes his terminal values. If his algorithms are functioning properly, he’ll disregard your definition as irrelevant to achieving his ends. Whether from the inside view of morality for that speaker, or his outside view, you’re just wrong. And you’re also wrong from any outside view that accurately models what terminal values people actually have.
Rational discussions of morality start with the observation that people have differing terminal values. Our terminal values are our ultimate biases. Recognizing that my biases are mine, and not identical to yours, is the first step away from the usual useless babble in moral philosophy.
Shifting the lump under the rug but not getting rid of it is how it looks to me too. But I don’t understand the rest of that comment and will need to think harder about it (when I’m less sleep-deprived).
I note that that’s the comment Lukeprog flagged as his favourite answer, but of course I can’t tell if it got the upvotes before or after he did so.
Let me try...
Something is green if it emits or scatters much more light between 520 and 570 nm than between 400 and 520 nm or between 570 and 700 nm. That’s what green means, and it also applies to places where there are no humans: it still makes sense to ask whether the skin of tyrannosaurs was green even though there were no humans back then. On the other hand, the reason why we find the concept of ‘something which emits or scatters much more light between 520 and 570 nm than between 400 and 520 nm or between 570 and 700 nm’ important enough to have a word (green) for it is that for evolutionary reasons we have cone cells which work in those ranges; if we saw in the ultraviolet, we might have a word, say breen, for ‘something which emits or scatters much more light between 260 and 285 nm than between 200 and 260 nm or between 285 and 350 nm’. This doesn’t mean that greenness is relative, though.
Likewise, something is good if it leads to sentient beings living, to people being happy, to individuals having the freedom to control their own lives, to minds exploring new territory instead of falling into infinite loops, to the universe having a richness and complexity to it that goes beyond pebble heaps, etc. That’s what good means, and it also applies to places where there are no humans: it still makes sense to ask whether it’s good for Babyeaters to eat their children even though there are no humans on that planet. On the other hand, the reason why we find the concept of ‘something which leads to sentient beings living, to people being happy, to individuals having the freedom to control their own lives, to minds exploring new territory instead of falling into infinite loops, to the universe having a richness and complexity to it that goes beyond pebble heaps, etc.’ important enough to have a word (good) for it is that for evolutionary reasons we value such things; if we valued heaps composed of prime numbers of pebbles, we might have a word, say pood, for ‘something which leads to lots of heaps with a prime number of pebbles in each’. This doesn’t mean that goodness is relative, though.
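To make the green half of that analogy concrete, here is a minimal Python sketch of the wavelength definition as a predicate. Everything beyond the three wavelength bands is an assumption made for illustration: the spectrum is represented as a mapping from wavelength in nm to intensity, “much more light” is read as “at least twice the mean intensity” (the `MUCH_MORE` factor is hypothetical), and the function names are invented. The point it tries to show is only that the predicate takes no observer as an argument, even though the choice of bands was motivated by human cone cells.

```python
from typing import Mapping, Tuple

# Bands taken from the definition in the comment above (nm).
GREEN_BAND: Tuple[float, float] = (520.0, 570.0)
LOWER_BAND: Tuple[float, float] = (400.0, 520.0)
UPPER_BAND: Tuple[float, float] = (570.0, 700.0)

# Assumption: "much more" is read as "at least twice as much".
MUCH_MORE = 2.0


def band_mean(spectrum: Mapping[float, float], band: Tuple[float, float]) -> float:
    """Mean intensity of the samples whose wavelength falls in [low, high)."""
    low, high = band
    samples = [level for wavelength, level in spectrum.items() if low <= wavelength < high]
    return sum(samples) / len(samples) if samples else 0.0


def is_green(spectrum: Mapping[float, float]) -> bool:
    """True if the green band dominates both neighbouring bands.

    Note that no observer appears in the signature: the answer depends
    only on the light, not on who is asking.
    """
    green = band_mean(spectrum, GREEN_BAND)
    return (green > MUCH_MORE * band_mean(spectrum, LOWER_BAND)
            and green > MUCH_MORE * band_mean(spectrum, UPPER_BAND))


# Hypothetical sample spectra: wavelength (nm) -> relative intensity.
leaf = {450.0: 0.1, 540.0: 0.9, 650.0: 0.1}
brick = {450.0: 0.1, 540.0: 0.1, 650.0: 0.9}
```

A species that saw in the ultraviolet could write an analogous `is_breen` with different band constants; that would be a different predicate, not a different answer to `is_green`.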
I have recently read this post and thought it describes very well how I always thought about morality, even though it talks about ‘sexiness’.
Would reading the metaethics sequence explain to me that it would be wrong to view morality in a similar fashion as sexiness?
Yes.
One part of it that did turn out well, in my opinion, is Probability is Objectively Subjective and related posts. Eliezer’s metaethical theory is, unless I’m mistaken, an effort to do for naive moral intuitions what Bayesianism should do for naive probabilistic intuitions.
I think it’s just Meta-ethical moral relativism.
“I am not a moral relativist.” http://lesswrong.com/lw/t9/no_license_to_be_human/
“I am not a meta-ethical relativist” http://lesswrong.com/lw/t3/the_bedrock_of_morality_arbitrary/mj4
“what is right is a huge computational property—an abstract computation—not tied to the state of anyone’s brain, including your own brain.” http://lesswrong.com/lw/sm/the_meaning_of_right/
I’m pretty sure Eliezer is actually wrong about whether he’s a meta-ethical relativist, mainly because he’s using words in a slightly different way from the way they use them. Or rather, he thinks that MER is using one specific word in a way that isn’t really kosher. (A statement which I think he’s basically correct about, but it’s a purely semantic quibble and so a stupid thing to argue about.)
Basically, Eliezer is arguing that when he says something is “good” that’s a factual claim with factual content. And he’s right; he means something specific-although-hard-to-compute by that sentence. And similarly, when I say something is “good” that’s another factual claim with factual content, whose truth is at least in theory computable.
But importantly, when Eliezer says something is “good” he doesn’t mean quite the same thing I mean when I say something is “good.” We actually speak slightly different languages in which the word “good” has slightly different meanings. Meta-Ethical Relativism, at least as summarized by Wikipedia, describes this fact with the sentence “terms such as “good,” “bad,” “right” and “wrong” do not stand subject to universal truth conditions at all.” Eliezer doesn’t like that because in each speaker’s language, terms like “good” stand subject to universal truth conditions. But each speaker speaks a slightly different language, in which the word represented by the string “good” stands subject to a slightly different set of universal truth conditions.
For an analogy: I apparently consistently define “blonde” differently from almost everyone I know. But it has an actual definition. When I call someone “blonde” I know what I mean, and people who know me well know what I mean. But it’s a different thing from what almost everyone else means when they say “blonde.” (I don’t know why I can’t fix this; I think my color perception is kinda screwed up). An MER guy would say that whether someone is “blonde” isn’t objectively true or false because what it means varies from speaker to speaker. Eliezer would say that “blonde” has a meaning in my language and a different meaning in my friends’ language, but in either language whether a person is “blonde” is in fact an objective fact.
And, you know, he’s right. But we’re not very good at discussing phenomena where two different people speak the same language except one or two words have different meanings; it’s actually a thing that’s hard to talk about. So in practice, “‘good’ doesn’t have an objective definition” conveys my meaning more accurately to the average listener than “‘good’ has one objective meaning in my language and a different objective meaning in your language.”
In http://lesswrong.com/lw/t0/abstracted_idealized_dynamics/mgr, user steven wrote “When X (an agent) judges that Y (another agent) should Z (take some action, make some decision), X is judging that Z is the solution to the problem W (perhaps increasing a world’s measure under some optimization criterion), where W is a rigid designator for the problem structure implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments. (Or at least X is asserting that it’s shared.) Due to the nature of W, becoming informed will cause X and Y to get closer to the solution of W, but wanting-it-when-informed is not what makes that solution moral.” with which Eliezer agreed.
This means that, even though people might presently have different things in mind when they say something is “good”, Eliezer does not regard their/our/his present ideas as either the meaning of their-form-of-good or his-form-of-good. The meaning of good is not “the things someone/anyone personally, presently finds morally compelling”, but something like “the fixed facts that are found but not defined by clarifying the result of applying the shared human evaluative cognitive machinery to a wide variety of situations under reflectively ideal conditions of information.” That is to say, Eliezer thinks, not only that moral questions are well defined, “objective”, in a realist or cognitivist way, but that our present explicit-moralities all have a single, fixed, external referent which is constructively revealed via the moral computations that weigh our many criteria.
I haven’t finished reading CEV, but here’s a quote from Levels of Organization that seems relevant: “The target matter of Artificial Intelligence is not the surface variation that makes one human slightly smarter than another human, but rather the vast store of complexity that separates a human from an amoeba”. Similarly, the target matter of inferences that figure out the content of morality is not the surface variation of moral intuitions and beliefs under partial information which result in moral disagreements, but the vast store of neural complexity that allows humans to disagree at all, rather than merely be asking different questions.
So the meaning of presently-acted-upon-and-explicitly-stated-rightness in your language, and the meaning of it in my language might be different, but one of the many points of the meta-ethics sequence is that the expanded-enlightened-mature-unfolding of those present usages gives us a single, shared, expanded-meaning in both our languages.
If you still think that moral relativism is a good way to convey that in daily language, fine. It seems the most charitable way in which he could be interpreted as a relativist is if “good” is always in quotes, to denote the present meaning a person attaches to the word. He is a “moral” relativist, and a moral realist/cognitivist/constructivist.
Hm, that sounds plausible, especially your last paragraph. I think my problem is that I don’t see any reason to suspect that the expanded-enlightened-mature-unfolding of our present usages will converge in the way Eliezer wants to use as a definition. See for instance the “repugnant conclusion” debate; people like Peter Singer and Robin Hanson think the repugnant conclusion actually sounds pretty awesome, while Derek Parfit thinks it’s basically a reductio on aggregate utilitarianism as a philosophy and I’m pretty sure Eliezer agrees with him, and has more or less explicitly identified it as a failure mode of AI development. I doubt these are beliefs that really converge with more information and reflection.
Or in steven’s formulation, I suspect that relatively few agents actually have Ws in common; his definition presupposes that there’s a problem structure “implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments”. I’m arguing that many agents have sufficiently different implicit problem structures that, for instance, by that definition Eliezer and Robin Hanson can’t really make “should” statements to each other.
Just getting citations out of the way, Eliezer talked about the repugnant conclusion here and here. He argues for shared W in Psychological Unity and Moral Disagreement. Kaj Sotala wrote a notable reply to Psychological Unity, Psychological Diversity. Finally Coherent Extrapolated Volition is all about finding a way to unfold present-explicit-moralities into that shared-should that he believes in, so I’d expect to see some arguments there.
Now, doesn’t the state of the world today suggest that human explicit-moralities are close enough that we can live together in a Hubble volume without too many wars, without a thousand broken coalitions of support over sides of irreconcilable differences, without blowing ourselves up because the universe would be better with no life than with the evil monsters in that tribe on the other side of the river?
Human concepts are similar enough that we can talk to each other. Human aesthetics are similar enough that there’s a billion-dollar video game industry. Human emotions are similar enough that Macbeth is still being produced four hundred years later on the other side of the globe. We have the same anatomical and functional regions in our brains. Parents everywhere use baby talk. On all six populated continents there are countries in which more than half of the population identifies with the Christian religions.
For all those similarities, is humanity really going to be split over the Repugnant Conclusion? Even if the Repugnant Conclusion is more of a challenge than muscling past a few inductive biases (scope insensitivity and the attribute substitution heuristic are also universal), I think we have some decent prospect for a future in which you don’t have to kill me. Whatever will help us to get to that future, that’s what I’m looking for when I say “right”. No matter how small our shared values are once we’ve felt the weight of relevant moral arguments, that’s what we need to find.
This comment may be a little scattered; I apologize. (In particular, much of this discussion is beside the point of my original claim that Eliezer really is a meta-ethical relativist, about which see my last paragraph).
I certainly don’t think we have to escalate to violence. But I do think there are subjects on which we might never come to agreement even given arbitrary time and self-improvement and processing power. Some of these are minor judgments; some are more important. But they’re very real.
In a number of places Eliezer commented that he’s not too worried about, say, two systems morality_1 and morality_2 that differ in the third decimal place. I think it’s actually really interesting when they differ in the third decimal place; it’s probably not important to the project of designing an AI but I don’t find that project terribly interesting so that doesn’t bother me.
But I’m also more willing to say to someone, “We have nothing to argue about [on this subject], we are only different optimization processes.” With most of my friends I really do have to say this, as far as I can tell, on at least one subject.
However, I really truly don’t think this is as all-or-nothing as you or Eliezer seem to paint it. First, because while morality may be a compact algorithm relative to its output, it can still be pretty big, and disagreeing seriously about one component doesn’t mean you don’t agree about the other several hundred. (A big sticking point between me and my friends is that I think getting angry is in general deeply morally blameworthy, whereas many of them believe that failing to get angry at outrageous things is morally blameworthy; and as far as I can tell this is more or less irreducible in the specification for all of us). But I can still talk to these people and have rewarding conversations on other subjects.
Second, because I realize there are other means of persuasion than argument. You can’t argue someone into changing their terminal values, but you can often persuade them to do so through literature and emotional appeal, largely due to psychological unity. I claim that this is one of the important roles that story-telling plays: it focuses and unifies our moralities through more-or-less arational means. But this isn’t an argument per se and has no particular reason one would expect it to converge to a particular outcome—among other things, the result is highly contingent on what talented artists happen to believe. (See Rorty’s Contingency, Irony, and Solidarity for discussion of this).
Humans have a lot of psychological similarity. They also have some very interesting and deep psychological variation (see e.g. Haidt’s work on the five moral systems). And it’s actually useful to a lot of societies to have variation in moral systems—it’s really useful to have some altruistic punishers, but not really for everyone to be an altruistic punisher.
But really, this is beside the point of the original question, whether Eliezer is really a meta-ethical relativist, because the limit of this sequence which he claims converges isn’t what anyone else is talking about when they say “morality”. Because generally, “morality” is defined more or less to be a consideration that would/should be compelling to all sufficiently complex optimization processes. Eliezer clearly doesn’t believe any such thing exists. And he’s right.
Calling something a terminal value is the default behavior when humans look for a justification and don’t find anything. This happens because we perceive little of our own mental processes and in the absence of that information we form post-hoc rationalizations. In short, we know very little about our own values. But that lack of retrieved / constructed justification doesn’t mean it’s impossible to unpack moral intuitions into algorithms so that we can more fully debate which factors we recognize and find relevant.
Your friends can understand why humans have positive personality descriptors for people who don’t get angry in various situations: descriptors like reflective, charming, polite, solemn, respecting, humble, tranquil, agreeable, open-minded, approachable, cooperative, curious, hospitable, sensitive, sympathetic, trusting, merciful, gracious.
You can understand why we have positive personality descriptors for people who get angry in various situations: descriptors like impartial, loyal, decent, passionate, courageous, boldness, leadership, strength, resilience, candor, vigilance, independence, reputation, and dignity.
Both you and your friends can see how either group could pattern match their behavioral bias as being friendly, supportive, mature, disciplined, or prudent.
These are not deep variations, they are relative strengths of reliance on the exact same intuitions.
Stories strengthen our associations of different emotions in response to analogous situations, which doesn’t have much of a converging effect (Edit: unless, you know, it’s something like the bible that a billion people read. That certainly pushes humanity in some direction), but they can also create associations to moral evaluative machinery that previously wasn’t doing its job. There’s nothing arational about this: neurons firing in the inferior frontal gyrus are evidence relevant to a certain useful categorizing inference, “things which are sentient”.
I’m not in a mood to argue definitions, but “optimization process” is a very new concept, so I’d lean toward “less”.
You’re...very certain of what I understand. And of the implications of that understanding.
More generally, you’re correct that people don’t have a lot of direct access to their moral intuitions. But I don’t actually see any evidence for the proposition they should converge sufficiently other than a lot of handwaving about the fundamental psychological similarity of humankind, which is more-or-less true but probably not true enough. In contrast, I’ve seen lots of people with deeply, radically separated moral beliefs, enough so that it seems implausible that these all are attributable to computational error.
I’m not disputing that we share a lot of mental circuitry, or that we can basically understand each other. But we can understand without agreeing, and be similar without being the same.
As for the last bit—I don’t want to argue definitions either. It’s a stupid pastime. But to the extent Eliezer claims not to be a meta-ethical relativist he’s doing it purely through a definitional argument.
He does intend to convey something real and nontrivial (well, some people might find it trivial, but enough people don’t that it is important to be explicit) by saying that he is not a meta-ethical relativist. The basic idea is that, while his brain is the causal reason for him wanting to do certain things, it is not referenced in the abstract computation that defines what is right. To use a metaphor from the meta-ethics sequence, it is a fact about a calculator that it is computing 1234 * 5678, but the fact that 1234 * 5678 = 7 006 652 is not a fact about that calculator.
This distinguishes him from some types of relativism, which I would guess to be the most common types. I am unsure whether people understand that he is trying to draw this distinction and still think that it is misleading to say that he is not a moral relativist or whether people are confused/have a different explanation for why he does not identify as a relativist.
Do you know anyone who never makes computational errors? If ‘mistakes’ happen at all, we would expect to see them in cases involving tribal loyalties. See von Neumann and those who trusted him on hidden variables.
The claim wasn’t that it happens too often to attribute to computational error, but that the types of differences seem unlikely to stem from computational errors.
The problem is, EY may just be contradicting himself, or he may be being ambiguous, and even deliberately so.
I think his views could be clarified in a moment if he stated clearly whether this abstract computation is identical for everyone. Is it AC_219387209 for all of us, or AC_42398732 for you, and AC_23479843 for me, with the proviso that it might be the case that AC_42398732 = AC_23479843?
Your quote makes it appear to be the former. Other quotes in this thread about a “shared W” point to that as well.
Then again, quotes in the same article make it appear the latter, as in:
We’re all busy playing EY Exegesis. Doesn’t that strike anyone else as peculiar? He’s not dead. He’s on the list. And he knows enough about communication and conceptualization to have been clear in the first place. And yet on such a basic point, what he writes seems to go round and round and we’re not clear what the answer is. And this, after years of opportunity for clarification.
It brings to mind Quirrell:
If you’re trying to convince people of your morality, and they have already picked teams, there is an advantage in letting it appear to each that they haven’t really changed sides.
Ah, neat, you found exactly what it is. Although the LW version is a bit stronger, since it involves thoughts like “the cause of me thinking some things are moral does not come from interacting with some mysterious substance of moralness.”
That’s it? That’s the whole takeaway?
I mean, I can accept “the answer is there is no answer” (just as there is no point to existence in itself; we’re just here and have to work out what to do for ourselves). It just seems rather a lot of text to get that across.
Well, just because there is no moral argument that will convince any possible intelligence doesn’t mean there’s nothing left to explore. For example, you might apply the “what words mean” posts to explore what people mean when they say “do the right thing,” and how to program that into an AI :P
My summary is pretty close to yours.
I would summarize it as:
All questions about the morality of actions can be restated as questions about the moral value of the states of the world that those actions give rise to.
All questions about the moral value of the states of the world can in principle be answered by evaluating those world-states in terms of the various things we’ve evolved to value, although actually performing that evaluation is difficult.
Questions about whether the moral value of states of the world should be evaluated in terms of the things we’ve evolved to value, as opposed to evaluated in terms of something else, can be answered by pointing out that the set of things we’ve evolved to value is what right means and is therefore definitionally the right set of things to use.
I consider that third point kind of silly, incidentally.
Yeah, that’s the bit that looks like begging the question. The sequence seems to me to fail to build its results from atoms.
Well, it works OK if you give up on the idea that “right” has some other meaning, which he spent rather a long time in that sequence trying to convince people to give up on. So perhaps that’s the piece that failed to work.
I mean, once you get rid of that idea, then saying that “right” means the values we all happen to have (positing that there actually is some set of values X such that we all have X) is rather a lot like saying a meter is the distance light travels in 1/299,792,458 of a second… it’s arbitrary, sure, but it’s not unreasonable.
Personally, I would approach it from the other direction. “Maybe X is right, maybe it isn’t, maybe both, maybe neither. What does it matter? How would you ever tell? What is added to the discussion by talking about it? X is what we value; it would be absurd to optimize for anything else. We evaluate in terms of what we care about because we care about it; to talk about it being “right” or “not right,” insofar as those words don’t mean “what we value” and “what we don’t value”, adds nothing to the discussion.”
But saying that requires me to embrace a certain kind of pragmatism that is, er, socially problematic to be seen embracing.
Morality is a sense, similar to taste or vision. If I eat a food, I can react by going ‘yummy’ or ‘blech’. If I observe an action, I can react by going ‘good’ or ‘evil’.
Just like your other senses, it’s not 100% reliable. Kids eventually learn that while candy is ‘yummy’, eating nothing but candy is ‘blech’ - your first-order sensory data is being corrected by a higher-order understanding (whether this be “eating candy is nutritionally bad” or “I get a stomach ache on days I just eat candy”).
The above paragraph ties in with the idea of “The lens that sees its flaws”. We can’t build a model of “right and wrong” from scratch any more than we could build a sense of yumminess from scratch; you have to work with the actual sensory input you have. To return to the food analogy, a diet consisting of ostensibly ideal food, but which lacks ‘yumminess’, will fail because almost no one can actually keep to it. Equally, our morality has to be based in our actual gut reaction of ‘goodness’ - you can’t just define a mathematical model and expect people to follow it.
Finally, and most important to the idea of “CEV”, is the idea that, just as science leads us to a greater understanding of nutrition and what actually works for us, we can also work towards a scientific understanding of morality. As an example, while ‘revenge’ is a very emotionally-satisfying tactic, it’s not always an effective tactic; just like candy, it’s something that needs to be understood and used in moderation.
Part of growing up as a kid is learning to eat right. Part of growing up as a society is learning to moralize correctly :)
Having flawed vision means that you might, for example, fail to see an object. What does having flawed morality cause you to be incorrect about?
From Bury the Chains, the idea that slavery was wrong hit England as a surprise. Quakers and Evangelicals were opposed to slavery, but the general public went from oblivious to involved very quickly.
It can mean you value short-term reactions instead of long-term consequences. A better analogy would be flavor: candy tastes delicious, but its long-term consequences are undesirable. In this case, a flawed morality leads you to conclude that because something registers as ‘righteous’ (say, slaying all the unbelievers), you should go ahead and do it, without realizing the consequences (“because this made everyone hate us, we have even less ability to slay/convert future infidels”).
On another level, one can also realize that values conflict (“I really like the taste of soda, but it makes my stomach upset!”) → (“I really like killing heretics, but isn’t murder technically a sin?”)
Edit: There are obviously numerous other flaws that can occur (to parallel your example more tightly: you might not notice that something is “evil” until you’ve done it and are feeling remorse). This isn’t meant to be comprehensive :)