The Meaning of Right

Con­tinu­a­tion of: Chang­ing Your Me­taethics, Set­ting Up Me­taethic­s
Fol­lowup to: Does Your Mo­ral­ity Care What You Think?, The Mo­ral Void, Prob­a­bil­ity is Sub­jec­tively Ob­jec­tive, Could Any­thing Be Right?, The Gift We Give To To­mor­row, Re­bel­ling Within Na­ture, Where Re­cur­sive Jus­tifi­ca­tion Hits Bot­tom, …

(The cul­mi­na­tion of a long se­ries of Over­com­ing Bias posts; if you start here, I ac­cept no re­spon­si­bil­ity for any re­sult­ing con­fu­sion, mi­s­un­der­stand­ing, or un­nec­es­sary angst.)

What is morality? What does the word “should” mean? The many pieces are in place: This question I shall now dissolve.

The key—as it has always been, in my ex­pe­rience so far—is to un­der­stand how a cer­tain cog­ni­tive al­gorithm feels from in­side. Stan­dard pro­ce­dure for right­ing a wrong ques­tion: If you don’t know what right-ness is, then take a step be­neath and ask how your brain la­bels things “right”.

It is not the same ques­tion—it has no moral as­pects to it, be­ing strictly a mat­ter of fact and cog­ni­tive sci­ence. But it is an illu­mi­nat­ing ques­tion. Once we know how our brain la­bels things “right”, per­haps we shall find it eas­ier, af­ter­ward, to ask what is re­ally and truly right.

But with that said—the eas­iest way to be­gin in­ves­ti­gat­ing that ques­tion, will be to jump back up to the level of moral­ity and ask what seems right. And if that seems like too much re­cur­sion, get used to it—the other 90% of the work lies in han­dling re­cur­sion prop­erly.

(Should you find your grasp on mean­ingful­ness wa­ver­ing, at any time fol­low­ing, check Chang­ing Your Me­taethics for the ap­pro­pri­ate pro­phy­lac­tic.)

So! In or­der to in­ves­ti­gate how the brain la­bels things “right”, we are go­ing to start out by talk­ing about what is right. That is, we’ll start out wear­ing our moral­ity-gog­gles, in which we con­sider moral­ity-as-moral­ity and talk about moral ques­tions di­rectly. As op­posed to wear­ing our re­duc­tion-gog­gles, in which we talk about cog­ni­tive al­gorithms and mere physics. Ri­gor­ously dis­t­in­guish­ing be­tween these two views is the first step to­ward mat­ing them to­gether.

As a first step, I offer this ob­ser­va­tion, on the level of moral­ity-as-moral­ity: Right­ness is con­ta­gious back­ward in time.

Sup­pose there is a switch, cur­rently set to OFF, and it is morally de­sir­able for this switch to be flipped to ON. Per­haps the switch con­trols the emer­gency halt on a train bear­ing down on a child strapped to the railroad tracks, this be­ing my canon­i­cal ex­am­ple. If this is the case, then, ce­teris paribus and pre­sum­ing the ab­sence of ex­cep­tional con­di­tions or fur­ther con­se­quences that were not ex­plic­itly speci­fied, we may con­sider it right that this switch should be flipped.

If it is right to flip the switch, then it is right to pull a string that flips the switch. If it is good to pull a string that flips the switch, it is right and proper to press a but­ton that pulls the string: Push­ing the but­ton seems to have more should-ness than not push­ing it.

It seems that—all else be­ing equal, and as­sum­ing no other con­se­quences or ex­cep­tional con­di­tions which were not speci­fied—value flows back­ward along ar­rows of causal­ity.

Even in de­on­tolog­i­cal moral­ities, if you’re obli­gated to save the child on the tracks, then you’re obli­gated to press the but­ton. Only very prim­i­tive AI sys­tems have mo­tor out­puts con­trol­led by strictly lo­cal rules that don’t model the fu­ture at all. Duty-based or virtue-based ethics are only slightly less con­se­quen­tial­ist than con­se­quen­tial­ism. It’s hard to say whether mov­ing your arm left or right is more vir­tu­ous with­out talk­ing about what hap­pens next.

Among my read­ers, there may be some who presently as­sert—though I hope to per­suade them oth­er­wise—that the life of a child is of no value to them. If so, they may sub­sti­tute any­thing else that they pre­fer, at the end of the switch, and ask if they should press the but­ton.

But I also sus­pect that, among my read­ers, there are some who won­der if the true moral­ity might be some­thing quite differ­ent from what is presently be­lieved among the hu­man kind. They may find it imag­in­able—plau­si­ble?—that hu­man life is of no value, or nega­tive value. They may won­der if the good­ness of hu­man hap­piness, is as much a self-serv­ing delu­sion as the jus­tice of slav­ery.

I my­self was once num­bered among these skep­tics, be­cause I was always very sus­pi­cious of any­thing that looked self-serv­ing.

Now here’s a lit­tle ques­tion I never thought to ask, dur­ing those years when I thought I knew noth­ing about moral­ity:

Could it make sense to have a morality in which, if we should save the child from the train tracks, then we should not flip the switch, should pull the string, and should not push the button, so that, finally, we do not push the button?

Or per­haps some­one says that it is bet­ter to save the child, than to not save them; but doesn’t see why any­one would think this im­plies it is bet­ter to press the but­ton than not press it. (Note the re­sem­blance to the Tor­toise who de­nies modus po­nens.)

It seems imag­in­able, to at least some peo­ple, that en­tirely differ­ent things could be should. It didn’t seem nearly so imag­in­able, at least to me, that should-ness could fail to flow back­ward in time. When I was try­ing to ques­tion ev­ery­thing else, that thought sim­ply did not oc­cur to me.

Can you ques­tion it? Should you?

Every now and then, in the course of hu­man ex­is­tence, we ques­tion what should be done and what is right to do, what is bet­ter or worse; oth­ers come to us with as­ser­tions along these lines, and we ques­tion them, ask­ing “Why is it right?” Even when we be­lieve a thing is right (be­cause some­one told us that it is, or be­cause we word­lessly feel that it is) we may still ques­tion why it is right.

Should-ness, it seems, flows back­ward in time. This gives us one way to ques­tion why or whether a par­tic­u­lar event has the should-ness prop­erty. We can look for some con­se­quence that has the should-ness prop­erty. If so, the should-ness of the origi­nal event seems to have been plau­si­bly proven or ex­plained.

Ah, but what about the con­se­quence—why is it should? Some­one comes to you and says, “You should give me your wallet, be­cause then I’ll have your money, and I should have your money.” If, at this point, you stop ask­ing ques­tions about should-ness, you’re vuln­er­a­ble to a moral mug­ging.

So we keep ask­ing the next ques­tion. Why should we press the but­ton? To pull the string. Why should we pull the string? To flip the switch. Why should we flip the switch? To pull the child from the railroad tracks. Why pull the child from the railroad tracks? So that they live. Why should the child live?

Now there are peo­ple who, caught up in the en­thu­si­asm, go ahead and an­swer that ques­tion in the same style: for ex­am­ple, “Be­cause the child might even­tu­ally grow up and be­come a trade part­ner with you,” or “Be­cause you will gain honor in the eyes of oth­ers,” or “Be­cause the child may be­come a great sci­en­tist and help achieve the Sin­gu­lar­ity,” or some such. But even if we were to an­swer in this style, it would only beg the next ques­tion.

Even if you try to have a chain of should stretch­ing into the in­finite fu­ture—a trick I’ve yet to see any­one try to pull, by the way, though I may be only ig­no­rant of the breadths of hu­man folly—then you would sim­ply ask “Why that chain rather than some other?”

Another way that some­thing can be should, is if there’s a gen­eral rule that makes it should. If your be­lief pool starts out with the gen­eral rule “All chil­dren X: It is bet­ter for X to live than to die”, then it is quite a short step to “It is bet­ter for Stephanie to live than to die”. Ah, but why save all chil­dren? Be­cause they may all be­come trade part­ners or sci­en­tists? But then where did that gen­eral rule come from?

If should-ness only comes from should-ness—from a should-con­se­quence, or from a should-uni­ver­sal—then how does any­thing end up should in the first place?

Now hu­man be­ings have ar­gued these is­sues for thou­sands of years and maybe much longer. We do not hes­i­tate to con­tinue ar­gu­ing when we reach a ter­mi­nal value (some­thing that has a charge of should-ness in­de­pen­dently of its con­se­quences). We just go on ar­gu­ing about the uni­ver­sals.

I usually take, as my archetypal example, the undoing of slavery: Somehow, slaves’ lives went from having no value to having value. Nor do I think that, back at the dawn of time, anyone was even trying to argue that slaves were better off being slaves (as it would be later argued). They’d probably have looked at you like you were crazy if you even tried. Somehow, we got from there, to here...

And some of us would even hold this up as a case of moral progress, and look at our an­ces­tors as hav­ing made a moral er­ror. Which seems easy enough to de­scribe in terms of should-ness: Our an­ces­tors thought that they should en­slave defeated en­e­mies, but they were mis­taken.

But all our philo­soph­i­cal ar­gu­ments ul­ti­mately seem to ground in state­ments that no one has both­ered to jus­tify—ex­cept per­haps to plead that they are self-ev­i­dent, or that any rea­son­able mind must surely agree, or that they are a pri­ori truths, or some such. Per­haps, then, all our moral be­liefs are as er­ro­neous as that old bit about slav­ery? Per­haps we have en­tirely mis­per­ceived the flow­ing streams of should?

So I once be­lieved was plau­si­ble; and one of the ar­gu­ments I wish I could go back and say to my­self, is, “If you know noth­ing at all about should-ness, then how do you know that the pro­ce­dure, ‘Do what­ever Em­peror Ming says’ is not the en­tirety of should-ness? Or even worse, per­haps, the pro­ce­dure, ‘Do what­ever max­i­mizes in­clu­sive ge­netic fit­ness’ or ‘Do what­ever makes you per­son­ally happy’.” The point here would have been to make my past self see that in re­ject­ing these rules, he was as­sert­ing a kind of knowl­edge—that to say, “This is not moral­ity,” he must re­veal that, de­spite him­self, he knows some­thing about moral­ity or meta-moral­ity. Other­wise, the pro­ce­dure “Do what­ever Em­peror Ming says” would seem just as plau­si­ble, as a guid­ing prin­ci­ple, as his cur­rent path of “Re­ject­ing things that seem un­jus­tified.” Un­jus­tified—ac­cord­ing to what crite­rion of jus­tifi­ca­tion? Why trust the prin­ci­ple that says that moral state­ments need to be jus­tified, if you know noth­ing at all about moral­ity?

What in­deed would dis­t­in­guish, at all, the ques­tion “What is right?” from “What is wrong?”

What is “right”, if you can’t say “good” or “de­sir­able” or “bet­ter” or “prefer­able” or “moral” or “should”? What hap­pens if you try to carry out the op­er­a­tion of re­plac­ing the sym­bol with what it stands for?

If you’re guess­ing that I’m try­ing to in­vei­gle you into let­ting me say: “Well, there are just some things that are baked into the ques­tion, when you start ask­ing ques­tions about moral­ity, rather than wakalixes or toaster ovens”, then you would be right. I’ll be mak­ing use of that later, and, yes, will ad­dress “But why should we ask that ques­tion?”

Okay, now: moral­ity-gog­gles off, re­duc­tion-gog­gles on.

Those who re­mem­ber Pos­si­bil­ity and Could-ness, or those fa­mil­iar with sim­ple search tech­niques in AI, will re­al­ize that the “should” la­bel is be­hav­ing like the in­verse of the “could” la­bel, which we pre­vi­ously an­a­lyzed in terms of “reach­a­bil­ity”. Reach­a­bil­ity spreads for­ward in time: if I could reach the state with the but­ton pressed, I could reach the state with the string pul­led; if I could reach the state with the string pul­led, I could reach the state with the switch flipped.

Where the “could” la­bel and the “should” la­bel col­lide, the al­gorithm pro­duces a plan.

Now, as I say this, I sus­pect that at least some read­ers may find them­selves fear­ing that I am about to re­duce should-ness to a mere ar­ti­fact of a way that a plan­ning sys­tem feels from in­side. Once again I urge you to check Chang­ing Your Me­taethics, if this starts to hap­pen. Re­mem­ber above all the Mo­ral Void: Even if there were no moral­ity, you could still choose to help peo­ple rather than hurt them. This, above all, holds in place what you hold pre­cious, while your be­liefs about the na­ture of moral­ity change.

I do not in­tend, with this post, to take away any­thing of value; it will all be given back be­fore the end.

Now this al­gorithm is not very so­phis­ti­cated, as AI al­gorithms go, but to ap­ply it in full gen­er­al­ity—to learned in­for­ma­tion, not just an­ces­trally en­coun­tered, ge­net­i­cally pro­grammed situ­a­tions—is a rare thing among an­i­mals. Put a food re­ward in a trans­par­ent box. Put the match­ing key, which looks unique and uniquely cor­re­sponds to that box, in an­other trans­par­ent box. Put the unique key to that box in an­other box. Do this with five boxes. Mix in an­other se­quence of five boxes that doesn’t lead to a food re­ward. Then offer a choice of two keys, one of which starts the se­quence of five boxes lead­ing to food, one of which starts the se­quence lead­ing nowhere.

Chim­panzees can learn to do this, but so far as I know, no non-pri­mate species can pull that trick.

And as smart as chim­panzees are, they are not quite as good as hu­mans at in­vent­ing plans—plans such as, for ex­am­ple, plant­ing in the spring to har­vest in the fall.

So what else are hu­mans do­ing, in the way of plan­ning?

It is a gen­eral ob­ser­va­tion that nat­u­ral se­lec­tion seems to reuse ex­ist­ing com­plex­ity, rather than cre­at­ing things from scratch, when­ever it pos­si­bly can—though not always in the same way that a hu­man en­g­ineer would. It is a func­tion of the enor­mous time re­quired for evolu­tion to cre­ate ma­chines with many in­ter­de­pen­dent parts, and the vastly shorter time re­quired to cre­ate a mu­tated copy of some­thing already evolved.

What else are hu­mans do­ing? Quite a bit, and some of it I don’t un­der­stand—there are plans hu­mans make, that no mod­ern-day AI can.

But one of the things we are do­ing, is rea­son­ing about “right-ness” the same way we would rea­son about any other ob­serv­able prop­erty.

Are animals with bright colors often poisonous? Does the delicious nid-nut grow only in the spring? Is it usually a good idea to take along a waterskin on long hunts?

It seems that Martha and Fred have an obli­ga­tion to take care of their child, and Jane and Bob are obli­gated to take care of their child, and Su­san and Wil­son have a duty to care for their child. Could it be that par­ents in gen­eral must take care of their chil­dren?

By rep­re­sent­ing right-ness as an at­tribute of ob­jects, you can re­cruit a whole pre­vi­ously evolved sys­tem that rea­sons about the at­tributes of ob­jects. You can save quite a lot of plan­ning time, if you de­cide (based on ex­pe­rience) that in gen­eral it is a good idea to take a wa­ter­skin on hunts, from which it fol­lows that it must be a good idea to take a wa­ter­skin on hunt #342.

Is this damnable for a Mind Pro­jec­tion Fal­lacy—treat­ing prop­er­ties of the mind as if they were out there in the world?

Depends on how you look at it.

This busi­ness of, “It’s been a good idea to take wa­ter­skins on the last three hunts, maybe it’s a good idea in gen­eral, if so it’s a good idea to take a wa­ter­skin on this hunt”, does seem to work.

Let’s say that your mind, faced with any countable set of ob­jects, au­to­mat­i­cally and per­cep­tu­ally tagged them with their re­main­der mod­ulo 5. If you saw a group of 17 ob­jects, for ex­am­ple, they would look re­main­der-2-ish. Though, if you didn’t have any no­tion of what your neu­rons were do­ing, and per­haps no no­tion of mod­ulo ar­ith­metic, you would only see that the group of 17 ob­jects had the same re­main­der-ness as a group of 2 ob­jects. You might not even know how to count—your brain do­ing the whole thing au­to­mat­i­cally, sub­con­sciously and neu­rally—in which case you would just have five differ­ent words for the re­main­der-ness at­tributes that we would call 0, 1, 2, 3, and 4.

If you look out upon the world you see, and guess that re­main­der-ness is a sep­a­rate and ad­di­tional at­tribute of things—like the at­tribute of hav­ing an elec­tric charge—or like a tiny lit­tle XML tag hang­ing off of things—then you will be wrong. But this does not mean it is non­sense to talk about re­main­der-ness, or that you must au­to­mat­i­cally com­mit the Mind Pro­jec­tion Fal­lacy in do­ing so. So long as you’ve got a well-defined way to com­pute a prop­erty, it can have a well-defined out­put and hence an em­piri­cal truth con­di­tion.

If you’re look­ing at 17 ob­jects, then their re­main­der-ness is, in­deed and truly, 2, and not 0, 3, 4, or 1. If I tell you, “Those red things you told me to look at are re­main­der-2-ish”, you have in­deed been told a falsifi­able and em­piri­cal prop­erty of those red things. It is just not a sep­a­rate, ad­di­tional, phys­i­cally ex­is­tent at­tribute.
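As a minimal sketch of the point, here is remainder-ness written as an explicit computation. The function name `remainder_ness` and the example lists are invented for illustration; the only thing taken from the text is the modulus of 5 and the group of 17 objects.

```python
def remainder_ness(objects, modulus=5):
    """Derived property: the size of the group, modulo `modulus`."""
    return len(objects) % modulus

# Hypothetical groups of objects; only their counts matter here.
red_things = ["red thing"] * 17
pair = ["pebble"] * 2

# "Those red things are remainder-2-ish" is a checkable claim about them,
# even though no object carries a physical "remainder" tag.
print(remainder_ness(red_things))                          # 2
print(remainder_ness(red_things) == remainder_ness(pair))  # True
```

The property has a definite value because the computation is well-defined, not because anything extra is attached to the objects.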

And as for rea­son­ing about de­rived prop­er­ties, and which other in­her­ent or de­rived prop­er­ties they cor­re­late to—I don’t see any­thing in­her­ently fal­la­cious about that.

One may no­tice, for ex­am­ple, that things which are 7 mod­ulo 10 are of­ten also 2 mod­ulo 5. Em­piri­cal ob­ser­va­tions of this sort play a large role in math­e­mat­ics, sug­gest­ing the­o­rems to prove. (See Polya’s How To Solve It.)
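In this particular case the observed pattern can be turned into a short derivation, which suggests it holds always rather than merely often. The integer $k$ is a symbol introduced here for illustration:

```latex
n \equiv 7 \pmod{10}
\;\Longrightarrow\; n = 10k + 7 = 5(2k + 1) + 2
\;\Longrightarrow\; n \equiv 2 \pmod{5}
```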

In­deed, vir­tu­ally all the ex­pe­rience we have, is de­rived by com­pli­cated neu­ral com­pu­ta­tions from the raw phys­i­cal events im­p­ing­ing on our sense or­gans. By the time you see any­thing, it has been ex­ten­sively pro­cessed by the retina, lat­eral genicu­late nu­cleus, vi­sual cor­tex, pari­etal cor­tex, and tem­po­ral cor­tex, into a very com­plex sort of de­rived com­pu­ta­tional prop­erty.

If you thought of a prop­erty like red­ness as re­sid­ing strictly in an ap­ple, you would be com­mit­ting the Mind Pro­jec­tion Fal­lacy. The ap­ple’s sur­face has a re­flec­tance which sends out a mix­ture of wave­lengths that im­p­inge on your retina and are pro­cessed with re­spect to am­bi­ent light to ex­tract a sum­mary color of red… But if you tell me that the ap­ple is red, rather than green, and make no claims as to whether this is an on­tolog­i­cally fun­da­men­tal phys­i­cal at­tribute of the ap­ple, then I am quite happy to agree with you.

So as long as there is a sta­ble com­pu­ta­tion in­volved, or a sta­ble pro­cess—even if you can’t con­sciously ver­bal­ize the speci­fi­ca­tion—it of­ten makes a great deal of sense to talk about prop­er­ties that are not fun­da­men­tal. And rea­son about them, and re­mem­ber where they have been found in the past, and guess where they will be found next.

(In ret­ro­spect, that should have been a sep­a­rate post in the Re­duc­tion­ism se­quence. “Derived Prop­er­ties”, or “Com­pu­ta­tional Prop­er­ties” maybe. Oh, well; I promised you moral­ity this day, and this day moral­ity you shall have.)

Now let’s say we want to make a lit­tle ma­chine, one that will save the lives of chil­dren. (This en­ables us to save more chil­dren than we could do with­out a ma­chine, just like you can move more dirt with a shovel than by hand.) The ma­chine will be a plan­ning ma­chine, and it will rea­son about events that may or may not have the prop­erty, leads-to-child-liv­ing.

A sim­ple plan­ning ma­chine would just have a pre-made model of the en­vi­ron­men­tal pro­cess. It would search for­ward from its ac­tions, ap­ply­ing a la­bel that we might call “reach­able-from-ac­tion-ness”, but which might as well say “Xy­bliz” in­ter­nally for all that it mat­ters to the pro­gram. And it would search back­ward from sce­nar­ios, situ­a­tions, in which the child lived, la­bel­ing these “leads-to-child-liv­ing”. If situ­a­tion X leads to situ­a­tion Y, and Y has the la­bel “leads-to-child-liv­ing”—which might just be a lit­tle flag bit, for all the differ­ence it would make—then X will in­herit the flag from Y. When the two la­bels meet in the mid­dle, the leads-to-child-liv­ing flag will quickly trace down the stored path of reach­a­bil­ity, un­til fi­nally some par­tic­u­lar se­quence of ac­tions ends up la­beled “leads-to-child-liv­ing”. Then the ma­chine au­to­mat­i­cally ex­e­cutes those ac­tions—that’s just what the ma­chine does.
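The following is a rough sketch of such a machine, not a claim about how any real planner is built. The world model `WORLD`, the situation names, and the function names are all hypothetical. For simplicity it runs the forward and backward searches to completion instead of stopping when the labels first meet.

```python
from collections import deque

# Hypothetical pre-made world model: situation -> situations it can lead to.
WORLD = {
    "start": ["button_pressed", "jumped_in_air"],
    "button_pressed": ["string_pulled"],
    "string_pulled": ["switch_flipped"],
    "switch_flipped": ["child_lives"],
    "jumped_in_air": [],
    "child_lives": [],
}

def label_reachable(world, start):
    """Search forward from `start`, storing how each situation was reached."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        for nxt in world[here]:
            if nxt not in parent:
                parent[nxt] = here
                queue.append(nxt)
    return parent

def label_leads_to(world, goals):
    """Search backward: X inherits the flag if X leads to some flagged Y."""
    flagged = set(goals)
    changed = True
    while changed:
        changed = False
        for x, successors in world.items():
            if x not in flagged and any(y in flagged for y in successors):
                flagged.add(x)
                changed = True
    return flagged

def plan(world, start, goals):
    """Return a path of situations carrying both labels, or None."""
    reachable = label_reachable(world, start)
    flagged = label_leads_to(world, goals)
    target = next((g for g in goals if g in reachable), None)
    if target is None or start not in flagged:
        return None
    # Trace the stored path of reachability back from the goal.
    path, node = [], target
    while node is not None:
        path.append(node)
        node = reachable[node]
    return path[::-1]

print(plan(WORLD, "start", {"child_lives"}))
# ['start', 'button_pressed', 'string_pulled', 'switch_flipped', 'child_lives']
```

Nothing in the program depends on what the labels are called; the `flagged` set could be named anything, and "jumped_in_air" simply never receives the flag in this toy model.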

Now this ma­chine is not com­pli­cated enough to feel ex­is­ten­tial angst. It is not com­pli­cated enough to com­mit the Mind Pro­jec­tion Fal­lacy. It is not, in fact, com­pli­cated enough to rea­son ab­stractly about the prop­erty “leads-to-child-liv­ing-ness”. The ma­chine—as speci­fied so far—does not no­tice if the ac­tion “jump in the air” turns out to always have this prop­erty, or never have this prop­erty. If “jump in the air” always led to situ­a­tions in which the child lived, this could greatly sim­plify fu­ture plan­ning—but only if the ma­chine were so­phis­ti­cated enough to no­tice this fact and use it.

If it is a fact that “jump in the air” “leads-to-child-living-ness”, this fact is composed of empirical truth and logical truth. It is an empirical truth that the world is such that, if you perform the (ideal abstract) algorithm “trace back from situations where the child lives”, then it will be a logical truth about the output of this (ideal abstract) algorithm that it labels the “jump in the air” action.

(You can­not always define this fact in en­tirely em­piri­cal terms, by look­ing for the phys­i­cal real-world co­in­ci­dence of jump­ing and child sur­vival. It might be that “stomp left” also always saves the child, and the ma­chine in fact stomps left. In which case the fact that jump­ing in the air would have saved the child, is a coun­ter­fac­tual ex­trap­o­la­tion.)

Okay, now we’re ready to bridge the lev­els.

As you must surely have guessed by now, this should-ness stuff is how the hu­man de­ci­sion al­gorithm feels from in­side. It is not an ex­tra, phys­i­cal, on­tolog­i­cally fun­da­men­tal at­tribute hang­ing off of events like a tiny lit­tle XML tag.

But it is a moral ques­tion what we should do about that—how we should re­act to it.

To adopt an at­ti­tude of com­plete nihilism, be­cause we wanted those tiny lit­tle XML tags, and they’re not phys­i­cally there, strikes me as the wrong move. It is like sup­pos­ing that the ab­sence of an XML tag, equates to the XML tag be­ing there, say­ing in its tiny brack­ets what value we should at­tach, and hav­ing value zero. And then this value zero, in turn, equat­ing to a moral im­per­a­tive to wear black, feel awful, write gloomy po­etry, be­tray friends, and com­mit suicide.


So what would I say in­stead?

The force be­hind my an­swer is con­tained in The Mo­ral Void and The Gift We Give To To­mor­row. I would try to save lives “even if there were no moral­ity”, as it were.

And it seems like an awful shame to—af­ter so many mil­lions and hun­dreds of mil­lions of years of evolu­tion—af­ter the moral mir­a­cle of so much cut­throat ge­netic com­pe­ti­tion pro­duc­ing in­tel­li­gent minds that love, and hope, and ap­pre­ci­ate beauty, and cre­ate beauty—af­ter com­ing so far, to throw away the Gift of moral­ity, just be­cause our brain hap­pened to rep­re­sent moral­ity in such fash­ion as to po­ten­tially mis­lead us when we re­flect on the na­ture of moral­ity.

This little accident of the Gift doesn’t seem like a good reason to throw away the Gift; it certainly isn’t an inescapable logical justification for wearing black.

Why not keep the Gift, but ad­just the way we re­flect on it?

So here’s my metaethics:

I ear­lier asked,

What is “right”, if you can’t say “good” or “de­sir­able” or “bet­ter” or “prefer­able” or “moral” or “should”? What hap­pens if you try to carry out the op­er­a­tion of re­plac­ing the sym­bol with what it stands for?

I an­swer that if you try to re­place the sym­bol “should” with what it stands for, you end up with quite a large sen­tence.

For the much sim­pler save-life ma­chine, the “should” la­bel stands for leads-to-child-liv­ing-ness.

For a human this is a much huger blob of a computation that looks like, “Did everyone survive? How many people are happy? Are people in control of their own lives? …” Humans have complex emotions, have many values—the thousand shards of desire, the godshatter of natural selection. I would say, by the way, that the huge blob of a computation is not just my present terminal values (which I don’t really have—I am not a consistent expected utility maximizer); the huge blob of a computation includes the specification of those moral arguments, those justifications, that would sway me if I heard them. So that I can regard my present values, as an approximation to the ideal morality that I would have if I heard all the arguments, to whatever extent such an extrapolation is coherent.

No one can write down their big com­pu­ta­tion; it is not just too large, it is also un­known to its user. No more could you print out a list­ing of the neu­rons in your brain. You never men­tion your big com­pu­ta­tion—you only use it, ev­ery hour of ev­ery day.

Now why might one iden­tify this enor­mous ab­stract com­pu­ta­tion, with what-is-right?

If you iden­tify right­ness with this huge com­pu­ta­tional prop­erty, then moral judg­ments are sub­junc­tively ob­jec­tive (like math), sub­jec­tively ob­jec­tive (like prob­a­bil­ity), and ca­pa­ble of be­ing true (like coun­ter­fac­tu­als).

You will find your­self say­ing, “If I wanted to kill some­one—even if I thought it was right to kill some­one—that wouldn’t make it right.” Why? Be­cause what is right is a huge com­pu­ta­tional prop­erty—an ab­stract com­pu­ta­tion—not tied to the state of any­one’s brain, in­clud­ing your own brain.

This dis­tinc­tion was in­tro­duced ear­lier in 2-Place and 1-Place Words. We can treat the word “sexy” as a 2-place func­tion that goes out and hoovers up some­one’s sense of sex­i­ness, and then eats an ob­ject of ad­mira­tion. Or we can treat the word “sexy” as mean­ing a 1-place func­tion, a par­tic­u­lar sense of sex­i­ness, like Sex­i­ness_20934, that only ac­cepts one ar­gu­ment, an ob­ject of ad­mira­tion.
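The difference can be sketched in code. Everything here is a hypothetical stand-in: the admirer names, the table `SENSE_OF`, and the bodies of the two sexiness functions are invented, and only the name Sexiness_20934 echoes the text.

```python
from typing import Callable

# Hypothetical stand-ins for two different particular senses of sexiness.
def sexiness_20934(obj: str) -> bool:
    return obj == "object_a"

def sexiness_00001(obj: str) -> bool:
    return obj == "object_b"

# Which sense each (hypothetical) admirer's brain currently computes.
SENSE_OF: dict[str, Callable[[str], bool]] = {
    "admirer_1": sexiness_20934,
    "admirer_2": sexiness_00001,
}

def sexy_2place(admirer: str, obj: str) -> bool:
    """2-place: hoover up the admirer's sense, then apply it to the object."""
    return SENSE_OF[admirer](obj)

# 1-place: one particular sense, accepting only the object of admiration.
sexy_1place = sexiness_20934

# Altering what admirer_1's brain computes changes the 2-place answer...
before = sexy_2place("admirer_1", "object_a")
SENSE_OF["admirer_1"] = sexiness_00001
after = sexy_2place("admirer_1", "object_a")

# ...but the 1-place function still names the same computation as before.
print(before, after, sexy_1place("object_a"))  # True False True
```

The 1-place function never consults `SENSE_OF`, so rewriting an admirer's entry cannot change its output.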

Here we are treat­ing moral­ity as a 1-place func­tion. It does not ac­cept a per­son as an ar­gu­ment, spit out what­ever cog­ni­tive al­gorithm they use to choose be­tween ac­tions, and then ap­ply that al­gorithm to the situ­a­tion at hand. When I say right, I mean a cer­tain par­tic­u­lar 1-place func­tion that just asks, “Did the child live? Did any­one else get kil­led? Are peo­ple happy? Are they in con­trol of their own lives? Has jus­tice been served?” … and so on through many, many other el­e­ments of right­ness. (And per­haps those ar­gu­ments that might per­suade me oth­er­wise, which I have not heard.)

Hence the no­tion, “Re­place the sym­bol with what it stands for.”

Since what’s right is a 1-place func­tion, if I sub­junc­tively imag­ine a world in which some­one has slipped me a pill that makes me want to kill peo­ple, then, in this sub­junc­tive world, it is not right to kill peo­ple. That’s not merely be­cause I’m judg­ing with my cur­rent brain. It’s be­cause when I say right, I am refer­ring to a 1-place func­tion. Right­ness doesn’t go out and hoover up the cur­rent state of my brain, in this sub­junc­tive world, be­fore pro­duc­ing the judg­ment “Oh, wait, it’s now okay to kill peo­ple.” When I say right, I don’t mean “that which my fu­ture self wants”, I mean the func­tion that looks at a situ­a­tion and asks, “Did any­one get kil­led? Are peo­ple happy? Are they in con­trol of their own lives? …”

And once you’ve defined a par­tic­u­lar ab­stract com­pu­ta­tion that says what is right—or even if you haven’t defined it, and it’s com­puted in some part of your brain you can’t perfectly print out, but the com­pu­ta­tion is sta­ble—more or less—then as with any other de­rived prop­erty, it makes sense to speak of a moral judg­ment be­ing true. If I say that to­day was a good day, you’ve learned some­thing em­piri­cal and falsifi­able about my day—if it turns out that ac­tu­ally my grand­mother died, you will sus­pect that I was origi­nally ly­ing.

The ap­par­ent ob­jec­tivity of moral­ity has just been ex­plained—and not ex­plained away. For in­deed, if some­one slipped me a pill that made me want to kill peo­ple, nonethe­less, it would not be right to kill peo­ple. Per­haps I would ac­tu­ally kill peo­ple, in that situ­a­tion—but that is be­cause some­thing other than moral­ity would be con­trol­ling my ac­tions.

Mo­ral­ity is not just sub­junc­tively ob­jec­tive, but sub­jec­tively ob­jec­tive. I ex­pe­rience it as some­thing I can­not change. Even af­ter I know that it’s my­self who com­putes this 1-place func­tion, and not a rock some­where—even af­ter I know that I will not find any star or moun­tain that com­putes this func­tion, that only upon me is it writ­ten—even so, I find that I wish to save lives, and that even if I could change this by an act of will, I would not choose to do so. I do not wish to re­ject joy, or beauty, or free­dom. What else would I do in­stead? I do not wish to re­ject the Gift that nat­u­ral se­lec­tion ac­ci­den­tally bar­fed into me. This is the prin­ci­ple of The Mo­ral Void and The Gift We Give To To­mor­row.

Our ori­gins may seem unattrac­tive, our brains un­trust­wor­thy.

But love has to en­ter the uni­verse some­how, start­ing from non-love, or love can­not en­ter time.

And if our brains are un­trust­wor­thy, it is only our own brains that say so. Do you some­times think that hu­man be­ings are not very nice? Then it is you, a hu­man be­ing, who says so. It is you, a hu­man be­ing, who judges that hu­man be­ings could do bet­ter. You will not find such writ­ten upon the stars or the moun­tains: they are not minds, they can­not think.

In this, of course, we find a jus­tifi­ca­tional strange loop through the meta-level. Which is un­avoid­able so far as I can see—you can’t ar­gue moral­ity, or any kind of goal op­ti­miza­tion, into a rock. But note the ex­act struc­ture of this strange loop: there is no gen­eral moral prin­ci­ple which says that you should do what evolu­tion pro­grammed you to do. There is, in­deed, no gen­eral prin­ci­ple to trust your moral in­tu­itions! You can find a moral in­tu­ition within your­self, de­scribe it—quote it—con­sider it de­liber­ately and in the full light of your en­tire moral­ity, and re­ject it, on grounds of other ar­gu­ments. What counts as an ar­gu­ment is also built into the right­ness-func­tion.

Just as, in the strange loop of ra­tio­nal­ity, there is no gen­eral prin­ci­ple in ra­tio­nal­ity to trust your brain, or to be­lieve what evolu­tion pro­grammed you to be­lieve—but in­deed, when you ask which parts of your brain you need to rebel against, you do so us­ing your cur­rent brain. When you ask whether the uni­verse is sim­ple, you can con­sider the sim­ple hy­poth­e­sis that the uni­verse’s ap­par­ent sim­plic­ity is ex­plained by its ac­tual sim­plic­ity.

Rather than try­ing to un­wind our­selves into rocks, I pro­posed that we should use the full strength of our cur­rent ra­tio­nal­ity, in re­flect­ing upon our­selves—that no part of our­selves be im­mune from ex­am­i­na­tion, and that we use all of our­selves that we cur­rently be­lieve in to ex­am­ine it.

You would do the same thing with moral­ity; if you con­sider that a part of your­self might be con­sid­ered harm­ful, then use your best cur­rent guess at what is right, your full moral strength, to do the con­sid­er­ing. Why should we want to un­wind our­selves to a rock? Why should we do less than our best, when re­flect­ing? You can’t un­wind past Oc­cam’s Ra­zor, modus po­nens, or moral­ity and it’s not clear why you should try.

For any part of right­ness, you can always imag­ine an­other part that over­rides it—it would not be right to drag the child from the train tracks, if this re­sulted in ev­ery­one on Earth be­com­ing un­able to love—or so I would judge. For ev­ery part of right­ness you ex­am­ine, you will find that it can­not be the sole and perfect and only crite­rion of right­ness. This may lead to the in­cor­rect in­fer­ence that there is some­thing be­yond, some perfect and only crite­rion from which all the oth­ers are de­rived—but that does not fol­low. The whole is the sum of the parts. We ran into an analo­gous situ­a­tion with free will, where no part of our­selves seems perfectly de­ci­sive.

The clas­sic dilemma for those who would trust their moral in­tu­itions, I be­lieve, is the one who says: “In­ter­ra­cial mar­riage is re­pug­nant—it dis­gusts me—and that is my moral in­tu­ition!” I re­ply, “There is no gen­eral rule to obey your in­tu­itions. You just men­tioned in­tu­itions, rather than us­ing them. Very few peo­ple have le­gi­t­i­mate cause to men­tion in­tu­itions—Friendly AI pro­gram­mers, for ex­am­ple, delv­ing into the cog­ni­tive sci­ence of things, have a le­gi­t­i­mate rea­son to men­tion them. Every­one else just has or­di­nary moral ar­gu­ments, in which they use their in­tu­itions, for ex­am­ple, by say­ing, ‘An in­ter­ra­cial mar­riage doesn’t hurt any­one, if both par­ties con­sent’. I do not say, ‘And I have an in­tu­ition that any­thing con­sent­ing adults do is right, and all in­tu­itions must be obeyed, there­fore I win.’ I just offer up that ar­gu­ment, and any oth­ers I can think of, to weigh in the bal­ance.”

Indeed, the evolution that made us cannot be trusted—so there is no general principle to trust it! Rightness is not defined in terms of automatic correspondence to any possible decision we actually make—so there’s no general principle that says you’re infallible! Just do what is, ahem, right—to the best of your ability to weigh the arguments you have heard, and ponder the arguments you may not have heard.

If you were hop­ing to have a perfectly trust­wor­thy sys­tem, or to have been cre­ated in cor­re­spon­dence with a perfectly trust­wor­thy moral­ity—well, I can’t give that back to you; but even most re­li­gions don’t try that one. Even most re­li­gions have the hu­man psy­chol­ogy con­tain­ing el­e­ments of sin, and even most re­li­gions don’t ac­tu­ally give you an effec­tively ex­e­cutable and perfect pro­ce­dure, though they may tell you “Con­sult the Bible! It always works!”

If you hoped to find a source of morality outside humanity—well, I can’t give that back, but I can ask once again: Why would you even want that? And what good would it do? Even if there were some great light in the sky—something that could tell us, “Sorry, happiness is bad for you, pain is better, now get out there and kill some babies!”—it would still be your own decision to follow it. You cannot evade responsibility.

There isn’t enough mystery left to justify reasonable doubt as to whether the causal origin of morality is something outside humanity. We have evolutionary psychology. We know where morality came from. We pretty much know how it works, in broad outline at least. We know there are no little XML value tags on electrons (and indeed, even if you found them, why should you pay attention to what is written there?).

If you hoped that morality would be universalizable—sorry, that one I really can’t give back. Well, unless we’re just talking about humans. Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about “values”. The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments. (In retrospect, I should have done a post on Interpersonal Morality before this...)

If I tell you that three peo­ple have found a pie and are ar­gu­ing about how to di­vide it up, the thought “Give one-third of the pie to each” is bound to oc­cur to you—and if the three peo­ple are hu­mans, it’s bound to oc­cur to them, too. If one of them is a psy­chopath and in­sists on get­ting the whole pie, though, there may be noth­ing for it but to say: “Sorry, fair­ness is not ‘what ev­ery­one thinks is fair’, fair­ness is ev­ery­one get­ting a third of the pie”. You might be able to re­solve the re­main­ing dis­agree­ment by poli­tics and game the­ory, short of vi­o­lence—but that is not the same as com­ing to agree­ment on val­ues. (Maybe you could per­suade the psy­chopath that tak­ing a pill to be more hu­man, if one were available, would make them hap­pier? Would you be jus­tified in forc­ing them to swal­low the pill? Th­ese get us into stranger wa­ters that de­serve a sep­a­rate post.)

If I define right­ness to in­clude the space of ar­gu­ments that move me, then when you and I ar­gue about what is right, we are ar­gu­ing our ap­prox­i­ma­tions to what we would come to be­lieve if we knew all em­piri­cal facts and had a mil­lion years to think about it—and that might be a lot closer than the pre­sent and heated ar­gu­ment. Or it might not. This gets into the no­tion of ‘con­stru­ing an ex­trap­o­lated vo­li­tion’ which would be, again, a sep­a­rate post.

But if you were step­ping out­side the hu­man and hop­ing for moral ar­gu­ments that would per­suade any pos­si­ble mind, even a mind that just wanted to max­i­mize the num­ber of pa­per­clips in the uni­verse, then sorry—the space of pos­si­ble mind de­signs is too large to per­mit uni­ver­sally com­pel­ling ar­gu­ments. You are bet­ter off treat­ing your in­tu­ition that your moral ar­gu­ments ought to per­suade oth­ers, as ap­ply­ing only to other hu­mans who are more or less neu­rolog­i­cally in­tact. Try­ing it on hu­man psy­chopaths would be dan­ger­ous, yet per­haps pos­si­ble. But a pa­per­clip max­i­mizer is just not the sort of mind that would be moved by a moral ar­gu­ment. (This will definitely be a sep­a­rate post.)

Once, in my wild and reck­less youth, I tried du­tifully—I thought it was my duty—to be ready and will­ing to fol­low the dic­tates of a great light in the sky, an ex­ter­nal ob­jec­tive moral­ity, when I dis­cov­ered it. I ques­tioned ev­ery­thing, even al­tru­ism to­ward hu­man lives, even the value of hap­piness. Fi­nally I re­al­ized that there was no foun­da­tion but hu­man­ity—no ev­i­dence point­ing to even a rea­son­able doubt that there was any­thing else—and in­deed I shouldn’t even want to hope for any­thing else—and in­deed would have no moral cause to fol­low the dic­tates of a light in the sky, even if I found one.

I didn’t get back im­me­di­ately all the pieces of my­self that I had tried to de­p­re­cate—it took time for the re­al­iza­tion “There is noth­ing else” to sink in. The no­tion that hu­man­ity could just… you know… live and have fun… seemed much too good to be true, so I mis­trusted it. But even­tu­ally, it sank in that there re­ally was noth­ing else to take the place of beauty. And then I got it back.

So you see, it all re­ally does add up to moral nor­mal­ity, very ex­actly in fact. You go on with the same morals as be­fore, and the same moral ar­gu­ments as be­fore. There is no sud­den Grand Over­lord Pro­ce­dure to which you can ap­peal to get a perfectly trust­wor­thy an­swer. You don’t know, can­not print out, the great right­ness-func­tion; and even if you could, you would not have enough com­pu­ta­tional power to search the en­tire speci­fied space of ar­gu­ments that might move you. You will just have to ar­gue it out.

I sus­pect that a fair num­ber of those who pro­pound metaethics do so in or­der to have it add up to some new and un­usual moral—else why would they bother? In my case, I bother be­cause I am a Friendly AI pro­gram­mer and I have to make a phys­i­cal sys­tem out­side my­self do what’s right; for which pur­pose metaethics be­comes very im­por­tant in­deed. But for the most part, the effect of my proffered metaethic is three­fold:

  • Any­one wor­ried that re­duc­tion­ism drains the mean­ing from ex­is­tence can stop wor­ry­ing;

  • Anyone who was rejecting parts of their human existence based on strange metaethics—i.e., “Why should I care about others, if that doesn’t help me maximize my inclusive genetic fitness?”—can welcome back all the parts of themselves that they once exiled.

  • You can stop ar­gu­ing about metaethics, and go back to what­ever or­di­nary moral ar­gu­ment you were hav­ing be­fore then. This knowl­edge will help you avoid metaeth­i­cal mis­takes that mess up moral ar­gu­ments, but you can’t ac­tu­ally use it to set­tle de­bates un­less you can build a Friendly AI.

And, oh yes—why is it right to save a child’s life?

Well… you could ask “Is this event that just hap­pened, right?” and find that the child had sur­vived, in which case you would have dis­cov­ered the nonob­vi­ous em­piri­cal fact about the world, that it had come out right.

Or you could start out already know­ing a com­pli­cated state of the world, but still have to ap­ply the right­ness-func­tion to it in a non­triv­ial way—one in­volv­ing a com­pli­cated moral ar­gu­ment, or ex­trap­o­lat­ing con­se­quences into the fu­ture—in which case you would learn the nonob­vi­ous log­i­cal /​ com­pu­ta­tional fact that right­ness, ap­plied to this situ­a­tion, yielded thumbs-up.

In both these cases, there are nonob­vi­ous facts to learn, which seem to ex­plain why what just hap­pened is right.

But if you ask “Why is it good to be happy?” and then re­place the sym­bol ‘good’ with what it stands for, you’ll end up with a ques­tion like “Why does hap­piness match {hap­piness + sur­vival + jus­tice + in­di­vi­d­u­al­ity + …}?” This gets com­puted so fast, that it scarcely seems like there’s any­thing there to be ex­plained. It’s like ask­ing “Why does 4 = 4?” in­stead of “Why does 2 + 2 = 4?”

Now, I bet that feels quite a bit like what hap­pens when I ask you: “Why is hap­piness good?”


And that’s also my an­swer to Moore’s Open Ques­tion. Why is this big func­tion I’m talk­ing about, right? Be­cause when I say “that big func­tion”, and you say “right”, we are derefer­enc­ing two differ­ent poin­t­ers to the same un­ver­bal­iz­able ab­stract com­pu­ta­tion. I mean, that big func­tion I’m talk­ing about, hap­pens to be the same thing that la­bels things right in your own brain. You might re­flect on the pieces of the quo­ta­tion of the big func­tion, but you would start out by us­ing your sense of right-ness to do it. If you had the perfect em­piri­cal knowl­edge to taboo both “that big func­tion” and “right”, sub­sti­tute what the poin­t­ers stood for, and write out the full enor­mity of the re­sult­ing sen­tence, it would come out as… sorry, I can’t re­sist this one… A=A.
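For those who like their metaphors executable, the pointer picture can be sketched in a few lines of Python. This is only a toy under loud assumptions: the names `VALUED`, `_rightness`, `that_big_function`, and `right` are invented here for illustration, and the four-item set is a hypothetical stand-in for a function that, as said above, nobody can actually print out.

```python
# Hypothetical, radically simplified stand-in for the "big function".
# The real one cannot be written out; this set is an assumption for illustration.
VALUED = frozenset({"happiness", "survival", "justice", "individuality"})


def _rightness(thing):
    """Toy criterion: True if `thing` is one of the listed values."""
    return thing in VALUED


# Two different names ("pointers") bound to the same underlying object.
that_big_function = _rightness  # what I point at when I describe the function
right = _rightness              # what you point at when you say "right"

# "Why is that big function right?" Once both pointers are dereferenced,
# the question compares one object with itself: A = A.
same_referent = that_big_function is right

# "Why is happiness good?" After substitution this is a near-instant lookup,
# which is why it scarcely feels like there is anything to explain.
happiness_is_good = right("happiness")

print(same_referent, happiness_is_good)
```

Both checks come out true almost before you can ask, which is the feeling the "Why does 4 = 4?" comparison is meant to capture. Nothing here settles what belongs in the set; it only shows why the question feels empty once the symbols are replaced by what they stand for.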

Part of The Me­taethics Sequence

Next post: “Interpersonal Morality”

Previous post: “Setting Up Metaethics”