The Mean­ing of Right

Continu­ation of: Changing Your Metaethics, Set­ting Up Metaethic­s
Fol­lowup to: Does Your Mor­al­ity Care What You Think?, The Moral Void, Prob­ab­il­ity is Sub­ject­ively Ob­ject­ive, Could Anything Be Right?, The Gift We Give To To­mor­row, Re­belling Within Nature, Where Re­curs­ive Jus­ti­fic­a­tion Hits Bot­tom, …

(The cul­min­a­tion of a long series of Over­com­ing Bias posts; if you start here, I ac­cept no re­spons­ib­il­ity for any res­ult­ing con­fu­sion, mis­un­der­stand­ing, or un­ne­ces­sary angst.)

What is morality? What does the word “should” mean? The many pieces are in place: This question I shall now dissolve.

The key—as it has al­ways been, in my ex­per­i­ence so far—is to un­der­stand how a cer­tain cog­nit­ive al­gorithm feels from in­side. Stand­ard pro­ced­ure for right­ing a wrong ques­tion: If you don’t know what right-ness is, then take a step be­neath and ask how your brain la­bels things “right”.

It is not the same ques­tion—it has no moral as­pects to it, be­ing strictly a mat­ter of fact and cog­nit­ive sci­ence. But it is an il­lu­min­at­ing ques­tion. Once we know how our brain la­bels things “right”, per­haps we shall find it easier, af­ter­ward, to ask what is really and truly right.

But with that said—the easi­est way to be­gin in­vest­ig­at­ing that ques­tion, will be to jump back up to the level of mor­al­ity and ask what seems right. And if that seems like too much re­cur­sion, get used to it—the other 90% of the work lies in hand­ling re­cur­sion prop­erly.

(Should you find your grasp on mean­ing­ful­ness waver­ing, at any time fol­low­ing, check Changing Your Metaethics for the ap­pro­pri­ate pro­phy­lactic.)

So! In or­der to in­vest­ig­ate how the brain la­bels things “right”, we are go­ing to start out by talk­ing about what is right. That is, we’ll start out wear­ing our mor­al­ity-goggles, in which we con­sider mor­al­ity-as-mor­al­ity and talk about moral ques­tions dir­ectly. As op­posed to wear­ing our re­duc­tion-goggles, in which we talk about cog­nit­ive al­gorithms and mere phys­ics. Rig­or­ously dis­tin­guish­ing between these two views is the first step to­ward mat­ing them to­gether.

As a first step, I of­fer this ob­ser­va­tion, on the level of mor­al­ity-as-mor­al­ity: Right­ness is con­ta­gious back­ward in time.

Sup­pose there is a switch, cur­rently set to OFF, and it is mor­ally de­sir­able for this switch to be flipped to ON. Per­haps the switch con­trols the emer­gency halt on a train bear­ing down on a child strapped to the rail­road tracks, this be­ing my ca­non­ical ex­ample. If this is the case, then, ceteris paribus and pre­sum­ing the ab­sence of ex­cep­tional con­di­tions or fur­ther con­sequences that were not ex­pli­citly spe­cified, we may con­sider it right that this switch should be flipped.

If it is right to flip the switch, then it is right to pull a string that flips the switch. If it is good to pull a string that flips the switch, it is right and proper to press a but­ton that pulls the string: Push­ing the but­ton seems to have more should-ness than not push­ing it.

It seems that—all else be­ing equal, and as­sum­ing no other con­sequences or ex­cep­tional con­di­tions which were not spe­cified—value flows back­ward along ar­rows of caus­al­ity.

Even in de­ont­o­lo­gical mor­al­it­ies, if you’re ob­lig­ated to save the child on the tracks, then you’re ob­lig­ated to press the but­ton. Only very prim­it­ive AI sys­tems have mo­tor out­puts con­trolled by strictly local rules that don’t model the fu­ture at all. Duty-based or vir­tue-based eth­ics are only slightly less con­sequen­tial­ist than con­sequen­tial­ism. It’s hard to say whether mov­ing your arm left or right is more vir­tu­ous without talk­ing about what hap­pens next.

Among my read­ers, there may be some who presently as­sert—though I hope to per­suade them oth­er­wise—that the life of a child is of no value to them. If so, they may sub­sti­tute any­thing else that they prefer, at the end of the switch, and ask if they should press the but­ton.

But I also sus­pect that, among my read­ers, there are some who won­der if the true mor­al­ity might be some­thing quite dif­fer­ent from what is presently be­lieved among the hu­man kind. They may find it ima­gin­able—plaus­ible?—that hu­man life is of no value, or neg­at­ive value. They may won­der if the good­ness of hu­man hap­pi­ness, is as much a self-serving de­lu­sion as the justice of slavery.

I my­self was once numbered among these skep­tics, be­cause I was al­ways very sus­pi­cious of any­thing that looked self-serving.

Now here’s a little ques­tion I never thought to ask, dur­ing those years when I thought I knew noth­ing about mor­al­ity:

Could it make sense to have a morality in which, if we should save the child from the train tracks, then we should not flip the switch, should pull the string, and should not push the button, so that, finally, we do not push the button?

Or per­haps someone says that it is bet­ter to save the child, than to not save them; but doesn’t see why any­one would think this im­plies it is bet­ter to press the but­ton than not press it. (Note the re­semb­lance to the Tor­toise who denies modus pon­ens.)

It seems ima­gin­able, to at least some people, that en­tirely dif­fer­ent things could be should. It didn’t seem nearly so ima­gin­able, at least to me, that should-ness could fail to flow back­ward in time. When I was try­ing to ques­tion everything else, that thought simply did not oc­cur to me.

Can you ques­tion it? Should you?

Every now and then, in the course of hu­man ex­ist­ence, we ques­tion what should be done and what is right to do, what is bet­ter or worse; oth­ers come to us with as­ser­tions along these lines, and we ques­tion them, ask­ing “Why is it right?” Even when we be­lieve a thing is right (be­cause someone told us that it is, or be­cause we word­lessly feel that it is) we may still ques­tion why it is right.

Should-ness, it seems, flows backward in time. This gives us one way to question why or whether a particular event has the should-ness property. We can look for some consequence that has the should-ness property; if we find one, the should-ness of the original event seems to have been plausibly proven or explained.

Ah, but what about the con­sequence—why is it should? Someone comes to you and says, “You should give me your wal­let, be­cause then I’ll have your money, and I should have your money.” If, at this point, you stop ask­ing ques­tions about should-ness, you’re vul­ner­able to a moral mug­ging.

So we keep ask­ing the next ques­tion. Why should we press the but­ton? To pull the string. Why should we pull the string? To flip the switch. Why should we flip the switch? To pull the child from the rail­road tracks. Why pull the child from the rail­road tracks? So that they live. Why should the child live?

Now there are people who, caught up in the en­thu­si­asm, go ahead and an­swer that ques­tion in the same style: for ex­ample, “Be­cause the child might even­tu­ally grow up and be­come a trade part­ner with you,” or “Be­cause you will gain honor in the eyes of oth­ers,” or “Be­cause the child may be­come a great sci­ent­ist and help achieve the Sin­gu­lar­ity,” or some such. But even if we were to an­swer in this style, it would only beg the next ques­tion.

Even if you try to have a chain of should stretch­ing into the in­fin­ite fu­ture—a trick I’ve yet to see any­one try to pull, by the way, though I may be only ig­nor­ant of the breadths of hu­man folly—then you would simply ask “Why that chain rather than some other?”

Another way that some­thing can be should, is if there’s a gen­eral rule that makes it should. If your be­lief pool starts out with the gen­eral rule “All chil­dren X: It is bet­ter for X to live than to die”, then it is quite a short step to “It is bet­ter for Stephanie to live than to die”. Ah, but why save all chil­dren? Be­cause they may all be­come trade part­ners or sci­ent­ists? But then where did that gen­eral rule come from?

If should-ness only comes from should-ness—from a should-con­sequence, or from a should-uni­ver­sal—then how does any­thing end up should in the first place?

Now hu­man be­ings have ar­gued these is­sues for thou­sands of years and maybe much longer. We do not hes­it­ate to con­tinue ar­guing when we reach a ter­minal value (some­thing that has a charge of should-ness in­de­pend­ently of its con­sequences). We just go on ar­guing about the uni­ver­sals.

I usually take, as my archetypal example, the undoing of slavery: Somehow, slaves’ lives went from having no value to having value. Nor do I think that, back at the dawn of time, anyone was even trying to argue that slaves were better off being slaves (as it would be later argued). They’d probably have looked at you like you were crazy if you even tried. Somehow, we got from there, to here...

And some of us would even hold this up as a case of moral pro­gress, and look at our an­cest­ors as hav­ing made a moral er­ror. Which seems easy enough to de­scribe in terms of should-ness: Our an­cest­ors thought that they should en­slave de­feated en­emies, but they were mis­taken.

But all our philo­soph­ical ar­gu­ments ul­ti­mately seem to ground in state­ments that no one has bothered to jus­tify—ex­cept per­haps to plead that they are self-evid­ent, or that any reas­on­able mind must surely agree, or that they are a pri­ori truths, or some such. Per­haps, then, all our moral be­liefs are as er­ro­neous as that old bit about slavery? Per­haps we have en­tirely mis­per­ceived the flow­ing streams of should?

This I once believed to be plausible; and one of the arguments I wish I could go back and say to my past self is: “If you know nothing at all about should-ness, then how do you know that the procedure ‘Do whatever Emperor Ming says’ is not the entirety of should-ness? Or even worse, perhaps, the procedure ‘Do whatever maximizes inclusive genetic fitness’, or ‘Do whatever makes you personally happy’?” The point here would have been to make my past self see that in rejecting these rules, he was asserting a kind of knowledge—that to say, “This is not morality,” he must reveal that, despite himself, he knows something about morality or meta-morality. Otherwise, the procedure “Do whatever Emperor Ming says” would seem just as plausible, as a guiding principle, as his current path of “rejecting things that seem unjustified”. Unjustified—according to what criterion of justification? Why trust the principle that says that moral statements need to be justified, if you know nothing at all about morality?

What in­deed would dis­tin­guish, at all, the ques­tion “What is right?” from “What is wrong?”

What is “right”, if you can’t say “good” or “de­sir­able” or “bet­ter” or “prefer­able” or “moral” or “should”? What hap­pens if you try to carry out the op­er­a­tion of re­pla­cing the sym­bol with what it stands for?

If you’re guess­ing that I’m try­ing to in­veigle you into let­ting me say: “Well, there are just some things that are baked into the ques­tion, when you start ask­ing ques­tions about mor­al­ity, rather than wakalixes or toaster ovens”, then you would be right. I’ll be mak­ing use of that later, and, yes, will ad­dress “But why should we ask that ques­tion?”

Okay, now: mor­al­ity-goggles off, re­duc­tion-goggles on.

Those who re­mem­ber Poss­ib­il­ity and Could-ness, or those fa­mil­iar with simple search tech­niques in AI, will real­ize that the “should” la­bel is be­hav­ing like the in­verse of the “could” la­bel, which we pre­vi­ously ana­lyzed in terms of “reach­ab­il­ity”. Reach­ab­il­ity spreads for­ward in time: if I could reach the state with the but­ton pressed, I could reach the state with the string pulled; if I could reach the state with the string pulled, I could reach the state with the switch flipped.

Where the “could” la­bel and the “should” la­bel col­lide, the al­gorithm pro­duces a plan.

Now, as I say this, I sus­pect that at least some read­ers may find them­selves fear­ing that I am about to re­duce should-ness to a mere ar­ti­fact of a way that a plan­ning sys­tem feels from in­side. Once again I urge you to check Changing Your Metaethics, if this starts to hap­pen. Re­mem­ber above all the Moral Void: Even if there were no mor­al­ity, you could still choose to help people rather than hurt them. This, above all, holds in place what you hold pre­cious, while your be­liefs about the nature of mor­al­ity change.

I do not in­tend, with this post, to take away any­thing of value; it will all be given back be­fore the end.

Now this al­gorithm is not very soph­ist­ic­ated, as AI al­gorithms go, but to ap­ply it in full gen­er­al­ity—to learned in­form­a­tion, not just an­ces­trally en­countered, ge­net­ic­ally pro­grammed situ­ations—is a rare thing among an­im­als. Put a food re­ward in a trans­par­ent box. Put the match­ing key, which looks unique and uniquely cor­res­ponds to that box, in an­other trans­par­ent box. Put the unique key to that box in an­other box. Do this with five boxes. Mix in an­other se­quence of five boxes that doesn’t lead to a food re­ward. Then of­fer a choice of two keys, one of which starts the se­quence of five boxes lead­ing to food, one of which starts the se­quence lead­ing nowhere.

Chim­pan­zees can learn to do this, but so far as I know, no non-prim­ate spe­cies can pull that trick.

And as smart as chim­pan­zees are, they are not quite as good as hu­mans at in­vent­ing plans—plans such as, for ex­ample, plant­ing in the spring to har­vest in the fall.

So what else are hu­mans do­ing, in the way of plan­ning?

It is a gen­eral ob­ser­va­tion that nat­ural se­lec­tion seems to re­use ex­ist­ing com­plex­ity, rather than cre­at­ing things from scratch, whenever it pos­sibly can—though not al­ways in the same way that a hu­man en­gin­eer would. It is a func­tion of the enorm­ous time re­quired for evol­u­tion to cre­ate ma­chines with many in­ter­de­pend­ent parts, and the vastly shorter time re­quired to cre­ate a mutated copy of some­thing already evolved.

What else are hu­mans do­ing? Quite a bit, and some of it I don’t un­der­stand—there are plans hu­mans make, that no mod­ern-day AI can.

But one of the things we are do­ing, is reas­on­ing about “right-ness” the same way we would reason about any other ob­serv­able prop­erty.

Are animals with bright colors often poisonous? Does the delicious nid-nut grow only in the spring? Is it usually a good idea to take along a waterskin on long hunts?

It seems that Martha and Fred have an ob­lig­a­tion to take care of their child, and Jane and Bob are ob­lig­ated to take care of their child, and Susan and Wilson have a duty to care for their child. Could it be that par­ents in gen­eral must take care of their chil­dren?

By rep­res­ent­ing right-ness as an at­trib­ute of ob­jects, you can re­cruit a whole pre­vi­ously evolved sys­tem that reas­ons about the at­trib­utes of ob­jects. You can save quite a lot of plan­ning time, if you de­cide (based on ex­per­i­ence) that in gen­eral it is a good idea to take a wa­ter­skin on hunts, from which it fol­lows that it must be a good idea to take a wa­ter­skin on hunt #342.

Is this dam­nable for a Mind Pro­jec­tion Fal­lacy—treat­ing prop­er­ties of the mind as if they were out there in the world?

Depends on how you look at it.

This busi­ness of, “It’s been a good idea to take wa­ter­skins on the last three hunts, maybe it’s a good idea in gen­eral, if so it’s a good idea to take a wa­ter­skin on this hunt”, does seem to work.

Let’s say that your mind, faced with any count­able set of ob­jects, auto­mat­ic­ally and per­cep­tu­ally tagged them with their re­mainder mod­ulo 5. If you saw a group of 17 ob­jects, for ex­ample, they would look re­mainder-2-ish. Though, if you didn’t have any no­tion of what your neur­ons were do­ing, and per­haps no no­tion of mod­ulo arith­metic, you would only see that the group of 17 ob­jects had the same re­mainder-ness as a group of 2 ob­jects. You might not even know how to count—your brain do­ing the whole thing auto­mat­ic­ally, sub­con­sciously and neur­ally—in which case you would just have five dif­fer­ent words for the re­mainder-ness at­trib­utes that we would call 0, 1, 2, 3, and 4.

If you look out upon the world you see, and guess that re­mainder-ness is a sep­ar­ate and ad­di­tional at­trib­ute of things—like the at­trib­ute of hav­ing an elec­tric charge—or like a tiny little XML tag hanging off of things—then you will be wrong. But this does not mean it is non­sense to talk about re­mainder-ness, or that you must auto­mat­ic­ally com­mit the Mind Pro­jec­tion Fal­lacy in do­ing so. So long as you’ve got a well-defined way to com­pute a prop­erty, it can have a well-defined out­put and hence an em­pir­ical truth con­di­tion.

If you’re look­ing at 17 ob­jects, then their re­mainder-ness is, in­deed and truly, 2, and not 0, 3, 4, or 1. If I tell you, “Those red things you told me to look at are re­mainder-2-ish”, you have in­deed been told a falsifi­able and em­pir­ical prop­erty of those red things. It is just not a sep­ar­ate, ad­di­tional, phys­ic­ally ex­ist­ent at­trib­ute.
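
A minimal sketch of this point in Python (the collection and the function name are my own illustration, nothing from the post): remainder-ness is nothing over and above the output of a fixed computation, yet a claim about it is still empirical and falsifiable.

```python
def remainder_ness(objects, modulus=5):
    """A derived property: the size of a collection, modulo 5. It is not a
    separate physical attribute of the objects, just the well-defined output
    of a fixed computation, so claims about it have truth conditions."""
    return len(objects) % modulus

red_things = ["red thing"] * 17

# "Those red things are remainder-2-ish" is an empirical, falsifiable claim:
# it is true exactly when the computation outputs 2.
print(remainder_ness(red_things) == 2)   # True for 17 objects; False for 18
```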

And as for reas­on­ing about de­rived prop­er­ties, and which other in­her­ent or de­rived prop­er­ties they cor­rel­ate to—I don’t see any­thing in­her­ently fal­la­cious about that.

One may no­tice, for ex­ample, that things which are 7 mod­ulo 10 are of­ten also 2 mod­ulo 5. Em­pir­ical ob­ser­va­tions of this sort play a large role in math­em­at­ics, sug­gest­ing the­or­ems to prove. (See Polya’s How To Solve It.)

Indeed, vir­tu­ally all the ex­per­i­ence we have, is de­rived by com­plic­ated neural com­pu­ta­tions from the raw phys­ical events impinging on our sense or­gans. By the time you see any­thing, it has been ex­tens­ively pro­cessed by the ret­ina, lat­eral gen­icu­late nuc­leus, visual cor­tex, pari­etal cor­tex, and tem­poral cor­tex, into a very com­plex sort of de­rived com­pu­ta­tional prop­erty.

If you thought of a prop­erty like red­ness as resid­ing strictly in an apple, you would be com­mit­ting the Mind Pro­jec­tion Fal­lacy. The apple’s sur­face has a re­flect­ance which sends out a mix­ture of wavelengths that im­pinge on your ret­ina and are pro­cessed with re­spect to am­bi­ent light to ex­tract a sum­mary color of red… But if you tell me that the apple is red, rather than green, and make no claims as to whether this is an on­to­lo­gic­ally fun­da­mental phys­ical at­trib­ute of the apple, then I am quite happy to agree with you.

So as long as there is a stable com­pu­ta­tion in­volved, or a stable pro­cess—even if you can’t con­sciously verb­al­ize the spe­cific­a­tion—it of­ten makes a great deal of sense to talk about prop­er­ties that are not fun­da­mental. And reason about them, and re­mem­ber where they have been found in the past, and guess where they will be found next.

(In ret­ro­spect, that should have been a sep­ar­ate post in the Re­duc­tion­ism se­quence. “De­rived Prop­er­ties”, or “Com­pu­ta­tional Prop­er­ties” maybe. Oh, well; I prom­ised you mor­al­ity this day, and this day mor­al­ity you shall have.)

Now let’s say we want to make a little ma­chine, one that will save the lives of chil­dren. (This en­ables us to save more chil­dren than we could do without a ma­chine, just like you can move more dirt with a shovel than by hand.) The ma­chine will be a plan­ning ma­chine, and it will reason about events that may or may not have the prop­erty, leads-to-child-liv­ing.

A simple plan­ning ma­chine would just have a pre-made model of the en­vir­on­mental pro­cess. It would search for­ward from its ac­tions, ap­ply­ing a la­bel that we might call “reach­able-from-ac­tion-ness”, but which might as well say “Xybliz” in­tern­ally for all that it mat­ters to the pro­gram. And it would search back­ward from scen­arios, situ­ations, in which the child lived, la­beling these “leads-to-child-liv­ing”. If situ­ation X leads to situ­ation Y, and Y has the la­bel “leads-to-child-liv­ing”—which might just be a little flag bit, for all the dif­fer­ence it would make—then X will in­herit the flag from Y. When the two la­bels meet in the middle, the leads-to-child-liv­ing flag will quickly trace down the stored path of reach­ab­il­ity, un­til fi­nally some par­tic­u­lar se­quence of ac­tions ends up labeled “leads-to-child-liv­ing”. Then the ma­chine auto­mat­ic­ally ex­ecutes those ac­tions—that’s just what the ma­chine does.
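
Here is a toy sketch, in Python, of a planner along the lines just described (the state graph, the state names, and the transitions are hypothetical illustrations of mine): a reachability label spreads forward from the machine's possible actions, a leads-to-child-living flag spreads backward from the goal situation, and where the two labels meet, the stored path is read off as the sequence of actions to execute. This is also the sense in which, earlier, the "could" label and the "should" label collide to produce a plan.

```python
from collections import deque

# Hypothetical toy environment: each state maps the actions available there
# to the state they lead to.
TRANSITIONS = {
    "start":          {"press button": "button pressed", "jump in air": "in the air"},
    "button pressed": {"wait": "string pulled"},
    "string pulled":  {"wait": "switch flipped"},
    "switch flipped": {"wait": "child lives"},
    "in the air":     {"land": "start"},
}

def forward_reachability(start):
    """Spread the reachable-from-action label ("Xybliz") forward from the
    start, remembering how each state was first reached."""
    parent = {start: None}                    # state -> (previous state, action)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action, nxt in TRANSITIONS.get(state, {}).items():
            if nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return parent

def backward_should_labels(goal):
    """Spread the leads-to-child-living flag backward along the causal arrows."""
    flagged = {goal}
    changed = True
    while changed:
        changed = False
        for state, moves in TRANSITIONS.items():
            if state not in flagged and any(n in flagged for n in moves.values()):
                flagged.add(state)
                changed = True
    return flagged

def plan(start, goal):
    """Where the two labels meet, trace the stored reachability path and
    return the action sequence; the machine would then just execute it."""
    parent = forward_reachability(start)
    flagged = backward_should_labels(goal)
    if goal not in parent or start not in flagged:
        return None                           # the labels never meet: no plan
    actions, state = [], goal
    while parent[state] is not None:
        prev, action = parent[state]
        actions.append(action)
        state = prev
    return list(reversed(actions))

print(plan("start", "child lives"))
# ['press button', 'wait', 'wait', 'wait']
```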

Now this ma­chine is not com­plic­ated enough to feel ex­ist­en­tial angst. It is not com­plic­ated enough to com­mit the Mind Pro­jec­tion Fal­lacy. It is not, in fact, com­plic­ated enough to reason ab­stractly about the prop­erty “leads-to-child-liv­ing-ness”. The ma­chine—as spe­cified so far—does not no­tice if the ac­tion “jump in the air” turns out to al­ways have this prop­erty, or never have this prop­erty. If “jump in the air” al­ways led to situ­ations in which the child lived, this could greatly sim­plify fu­ture plan­ning—but only if the ma­chine were soph­ist­ic­ated enough to no­tice this fact and use it.

If it is a fact that “jump in the air” has “leads-to-child-living-ness”, this fact is composed of empirical truth and logical truth. It is an empirical truth that the world is such that, if you perform the (ideal abstract) algorithm “trace back from situations where the child lives”, then it will be a logical truth about the output of this (ideal abstract) algorithm that it labels the “jump in the air” action.

(You can­not al­ways define this fact in en­tirely em­pir­ical terms, by look­ing for the phys­ical real-world co­in­cid­ence of jump­ing and child sur­vival. It might be that “stomp left” also al­ways saves the child, and the ma­chine in fact stomps left. In which case the fact that jump­ing in the air would have saved the child, is a coun­ter­fac­tual ex­tra­pol­a­tion.)

Okay, now we’re ready to bridge the levels.

As you must surely have guessed by now, this should-ness stuff is how the hu­man de­cision al­gorithm feels from in­side. It is not an ex­tra, phys­ical, on­to­lo­gic­ally fun­da­mental at­trib­ute hanging off of events like a tiny little XML tag.

But it is a moral ques­tion what we should do about that—how we should re­act to it.

To ad­opt an at­ti­tude of com­plete ni­hil­ism, be­cause we wanted those tiny little XML tags, and they’re not phys­ic­ally there, strikes me as the wrong move. It is like sup­pos­ing that the ab­sence of an XML tag, equates to the XML tag be­ing there, say­ing in its tiny brack­ets what value we should at­tach, and hav­ing value zero. And then this value zero, in turn, equat­ing to a moral im­per­at­ive to wear black, feel aw­ful, write gloomy po­etry, be­tray friends, and com­mit sui­cide.

No.

So what would I say in­stead?

The force be­hind my an­swer is con­tained in The Moral Void and The Gift We Give To To­mor­row. I would try to save lives “even if there were no mor­al­ity”, as it were.

And it seems like an aw­ful shame to—after so many mil­lions and hun­dreds of mil­lions of years of evol­u­tion—after the moral mir­acle of so much cut­throat ge­netic com­pet­i­tion pro­du­cing in­tel­li­gent minds that love, and hope, and ap­pre­ci­ate beauty, and cre­ate beauty—after com­ing so far, to throw away the Gift of mor­al­ity, just be­cause our brain happened to rep­res­ent mor­al­ity in such fash­ion as to po­ten­tially mis­lead us when we re­flect on the nature of mor­al­ity.

This little accident of the Gift doesn’t seem like a good reason to throw away the Gift; it certainly isn’t an inescapable logical justification for wearing black.

Why not keep the Gift, but ad­just the way we re­flect on it?

So here’s my metaethics:

I earlier asked,

What is “right”, if you can’t say “good” or “de­sir­able” or “bet­ter” or “prefer­able” or “moral” or “should”? What hap­pens if you try to carry out the op­er­a­tion of re­pla­cing the sym­bol with what it stands for?

I an­swer that if you try to re­place the sym­bol “should” with what it stands for, you end up with quite a large sen­tence.

For the much sim­pler save-life ma­chine, the “should” la­bel stands for leads-to-child-liv­ing-ness.

For a human this is a much huger blob of a computation that looks like, “Did everyone survive? How many people are happy? Are people in control of their own lives? …” Humans have complex emotions, have many values—the thousand shards of desire, the godshatter of natural selection. I would say, by the way, that the huge blob of a computation is not just my present terminal values (which I don’t really have—I am not a consistent expected utility maximizer); the huge blob of a computation includes the specification of those moral arguments, those justifications, that would sway me if I heard them. So that I can regard my present values, as an approximation to the ideal morality that I would have if I heard all the arguments, to whatever extent such an extrapolation is coherent.

No one can write down their big com­pu­ta­tion; it is not just too large, it is also un­known to its user. No more could you print out a list­ing of the neur­ons in your brain. You never men­tion your big com­pu­ta­tion—you only use it, every hour of every day.

Now why might one identify this enorm­ous ab­stract com­pu­ta­tion, with what-is-right?

If you identify right­ness with this huge com­pu­ta­tional prop­erty, then moral judg­ments are sub­junct­ively ob­ject­ive (like math), sub­ject­ively ob­ject­ive (like prob­ab­il­ity), and cap­able of be­ing true (like coun­ter­fac­tu­als).

You will find your­self say­ing, “If I wanted to kill someone—even if I thought it was right to kill someone—that wouldn’t make it right.” Why? Be­cause what is right is a huge com­pu­ta­tional prop­erty—an ab­stract com­pu­ta­tion—not tied to the state of any­one’s brain, in­clud­ing your own brain.

This dis­tinc­tion was in­tro­duced earlier in 2-Place and 1-Place Words. We can treat the word “sexy” as a 2-place func­tion that goes out and hoovers up someone’s sense of sex­i­ness, and then eats an ob­ject of ad­mir­a­tion. Or we can treat the word “sexy” as mean­ing a 1-place func­tion, a par­tic­u­lar sense of sex­i­ness, like Sex­i­ness_20934, that only ac­cepts one ar­gu­ment, an ob­ject of ad­mir­a­tion.

Here we are treat­ing mor­al­ity as a 1-place func­tion. It does not ac­cept a per­son as an ar­gu­ment, spit out whatever cog­nit­ive al­gorithm they use to choose between ac­tions, and then ap­ply that al­gorithm to the situ­ation at hand. When I say right, I mean a cer­tain par­tic­u­lar 1-place func­tion that just asks, “Did the child live? Did any­one else get killed? Are people happy? Are they in con­trol of their own lives? Has justice been served?” … and so on through many, many other ele­ments of right­ness. (And per­haps those ar­gu­ments that might per­suade me oth­er­wise, which I have not heard.)

Hence the no­tion, “Re­place the sym­bol with what it stands for.”

Since what’s right is a 1-place func­tion, if I sub­junct­ively ima­gine a world in which someone has slipped me a pill that makes me want to kill people, then, in this sub­junct­ive world, it is not right to kill people. That’s not merely be­cause I’m judging with my cur­rent brain. It’s be­cause when I say right, I am re­fer­ring to a 1-place func­tion. Right­ness doesn’t go out and hoover up the cur­rent state of my brain, in this sub­junct­ive world, be­fore pro­du­cing the judg­ment “Oh, wait, it’s now okay to kill people.” When I say right, I don’t mean “that which my fu­ture self wants”, I mean the func­tion that looks at a situ­ation and asks, “Did any­one get killed? Are people happy? Are they in con­trol of their own lives? …”
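
A tiny sketch of that distinction in Python (the names and the toy criterion are hypothetical stand-ins of mine, in the spirit of Sexiness_20934): the 2-place version consults whatever brain is doing the judging, so its output changes if the judge swallows the pill; the 1-place version is one fixed function of the situation and does not consult any brain at all.

```python
# 2-place: goes out and "hoovers up" the judge's current decision algorithm,
# then applies it to the situation at hand.
def right_2(judge, situation):
    return judge(situation)

# 1-place: one fixed criterion of the situation alone (a tiny hypothetical
# stand-in for the huge, unprintable rightness-function described above).
def right_1(situation):
    return situation["child lives"] and not situation["anyone killed"]

situation = {"child lives": True, "anyone killed": False}

my_brain_now = lambda s: s["child lives"] and not s["anyone killed"]
my_brain_on_the_pill = lambda s: s["anyone killed"]   # brain altered by the pill

# The 2-place output tracks the state of the judge's brain:
print(right_2(my_brain_now, situation))          # True
print(right_2(my_brain_on_the_pill, situation))  # False
# The 1-place output consults no brain at all, pill or no pill:
print(right_1(situation))                        # True
```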

And once you’ve defined a par­tic­u­lar ab­stract com­pu­ta­tion that says what is right—or even if you haven’t defined it, and it’s com­puted in some part of your brain you can’t per­fectly print out, but the com­pu­ta­tion is stable—more or less—then as with any other de­rived prop­erty, it makes sense to speak of a moral judg­ment be­ing true. If I say that today was a good day, you’ve learned some­thing em­pir­ical and falsifi­able about my day—if it turns out that ac­tu­ally my grand­mother died, you will sus­pect that I was ori­gin­ally ly­ing.

The ap­par­ent ob­jectiv­ity of mor­al­ity has just been ex­plained—and not ex­plained away. For in­deed, if someone slipped me a pill that made me want to kill people, non­ethe­less, it would not be right to kill people. Per­haps I would ac­tu­ally kill people, in that situ­ation—but that is be­cause some­thing other than mor­al­ity would be con­trolling my ac­tions.

Mor­al­ity is not just sub­junct­ively ob­ject­ive, but sub­ject­ively ob­ject­ive. I ex­per­i­ence it as some­thing I can­not change. Even after I know that it’s my­self who com­putes this 1-place func­tion, and not a rock some­where—even after I know that I will not find any star or moun­tain that com­putes this func­tion, that only upon me is it writ­ten—even so, I find that I wish to save lives, and that even if I could change this by an act of will, I would not choose to do so. I do not wish to re­ject joy, or beauty, or free­dom. What else would I do in­stead? I do not wish to re­ject the Gift that nat­ural se­lec­tion ac­ci­dent­ally barfed into me. This is the prin­ciple of The Moral Void and The Gift We Give To To­mor­row.

Our ori­gins may seem un­at­tract­ive, our brains un­trust­worthy.

But love has to enter the uni­verse some­how, start­ing from non-love, or love can­not enter time.

And if our brains are un­trust­worthy, it is only our own brains that say so. Do you some­times think that hu­man be­ings are not very nice? Then it is you, a hu­man be­ing, who says so. It is you, a hu­man be­ing, who judges that hu­man be­ings could do bet­ter. You will not find such writ­ten upon the stars or the moun­tains: they are not minds, they can­not think.

In this, of course, we find a jus­ti­fic­a­tional strange loop through the meta-level. Which is un­avoid­able so far as I can see—you can’t ar­gue mor­al­ity, or any kind of goal op­tim­iz­a­tion, into a rock. But note the ex­act struc­ture of this strange loop: there is no gen­eral moral prin­ciple which says that you should do what evol­u­tion pro­grammed you to do. There is, in­deed, no gen­eral prin­ciple to trust your moral in­tu­itions! You can find a moral in­tu­ition within your­self, de­scribe it—quote it—con­sider it de­lib­er­ately and in the full light of your en­tire mor­al­ity, and re­ject it, on grounds of other ar­gu­ments. What counts as an ar­gu­ment is also built into the right­ness-func­tion.

Just as, in the strange loop of ra­tion­al­ity, there is no gen­eral prin­ciple in ra­tion­al­ity to trust your brain, or to be­lieve what evol­u­tion pro­grammed you to be­lieve—but in­deed, when you ask which parts of your brain you need to rebel against, you do so us­ing your cur­rent brain. When you ask whether the uni­verse is simple, you can con­sider the simple hy­po­thesis that the uni­verse’s ap­par­ent sim­pli­city is ex­plained by its ac­tual sim­pli­city.

Rather than try­ing to un­wind ourselves into rocks, I pro­posed that we should use the full strength of our cur­rent ra­tion­al­ity, in re­flect­ing upon ourselves—that no part of ourselves be im­mune from ex­am­in­a­tion, and that we use all of ourselves that we cur­rently be­lieve in to ex­am­ine it.

You would do the same thing with morality; if you suspect that a part of yourself might be harmful, then use your best current guess at what is right, your full moral strength, to do the considering. Why should we want to unwind ourselves to a rock? Why should we do less than our best, when reflecting? You can’t unwind past Occam’s Razor, modus ponens, or morality, and it’s not clear why you should try.

For any part of right­ness, you can al­ways ima­gine an­other part that over­rides it—it would not be right to drag the child from the train tracks, if this res­ul­ted in every­one on Earth be­com­ing un­able to love—or so I would judge. For every part of right­ness you ex­am­ine, you will find that it can­not be the sole and per­fect and only cri­terion of right­ness. This may lead to the in­cor­rect in­fer­ence that there is some­thing bey­ond, some per­fect and only cri­terion from which all the oth­ers are de­rived—but that does not fol­low. The whole is the sum of the parts. We ran into an ana­log­ous situ­ation with free will, where no part of ourselves seems per­fectly de­cis­ive.

The clas­sic di­lemma for those who would trust their moral in­tu­itions, I be­lieve, is the one who says: “In­ter­ra­cial mar­riage is re­pug­nant—it dis­gusts me—and that is my moral in­tu­ition!” I reply, “There is no gen­eral rule to obey your in­tu­itions. You just men­tioned in­tu­itions, rather than us­ing them. Very few people have le­git­im­ate cause to men­tion in­tu­itions—Friendly AI pro­gram­mers, for ex­ample, delving into the cog­nit­ive sci­ence of things, have a le­git­im­ate reason to men­tion them. Every­one else just has or­din­ary moral ar­gu­ments, in which they use their in­tu­itions, for ex­ample, by say­ing, ‘An in­ter­ra­cial mar­riage doesn’t hurt any­one, if both parties con­sent’. I do not say, ‘And I have an in­tu­ition that any­thing con­sent­ing adults do is right, and all in­tu­itions must be obeyed, there­fore I win.’ I just of­fer up that ar­gu­ment, and any oth­ers I can think of, to weigh in the bal­ance.”

Indeed, evol­u­tion that made us can­not be trus­ted—so there is no gen­eral prin­ciple to trust it! Right­ness is not defined in terms of auto­matic cor­res­pond­ence to any pos­sible de­cision we ac­tu­ally make—so there’s no gen­eral prin­ciple that says you’re in­fal­lible! Just do what is, ahem, right—to the best of your abil­ity to weigh the ar­gu­ments you have heard, and pon­der the ar­gu­ments you may not have heard.

If you were hop­ing to have a per­fectly trust­worthy sys­tem, or to have been cre­ated in cor­res­pond­ence with a per­fectly trust­worthy mor­al­ity—well, I can’t give that back to you; but even most re­li­gions don’t try that one. Even most re­li­gions have the hu­man psy­cho­logy con­tain­ing ele­ments of sin, and even most re­li­gions don’t ac­tu­ally give you an ef­fect­ively ex­ecut­able and per­fect pro­ced­ure, though they may tell you “Con­sult the Bible! It al­ways works!”

If you hoped to find a source of mor­al­ity out­side hu­man­ity—well, I can’t give that back, but I can ask once again: Why would you even want that? And what good would it do? Even if there were some great light in the sky—some­thing that could tell us, “Sorry, hap­pi­ness is bad for you, pain is bet­ter, now get out there and kill some ba­bies!“—it would still be your own de­cision to fol­low it. You can­not evade re­spons­ib­il­ity.

There isn’t enough mys­tery left to jus­tify reas­on­able doubt as to whether the causal ori­gin of mor­al­ity is some­thing out­side hu­man­ity. We have evol­u­tion­ary psy­cho­logy. We know where mor­al­ity came from. We pretty much know how it works, in broad out­line at least. We know there are no little XML value tags on elec­trons (and in­deed, even if you found them, why should you pay at­ten­tion to what is writ­ten there?)

If you hoped that morality would be universalizable—sorry, that one I really can’t give back. Well, unless we’re just talking about humans. Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about “values”. The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments. (In retrospect, I should have done a post on Interpersonal Morality before this...)

If I tell you that three people have found a pie and are ar­guing about how to di­vide it up, the thought “Give one-third of the pie to each” is bound to oc­cur to you—and if the three people are hu­mans, it’s bound to oc­cur to them, too. If one of them is a psy­cho­path and in­sists on get­ting the whole pie, though, there may be noth­ing for it but to say: “Sorry, fair­ness is not ‘what every­one thinks is fair’, fair­ness is every­one get­ting a third of the pie”. You might be able to re­solve the re­main­ing dis­agree­ment by polit­ics and game the­ory, short of vi­ol­ence—but that is not the same as com­ing to agree­ment on val­ues. (Maybe you could per­suade the psy­cho­path that tak­ing a pill to be more hu­man, if one were avail­able, would make them hap­pier? Would you be jus­ti­fied in for­cing them to swal­low the pill? These get us into stranger wa­ters that de­serve a sep­ar­ate post.)

If I define right­ness to in­clude the space of ar­gu­ments that move me, then when you and I ar­gue about what is right, we are ar­guing our ap­prox­im­a­tions to what we would come to be­lieve if we knew all em­pir­ical facts and had a mil­lion years to think about it—and that might be a lot closer than the present and heated ar­gu­ment. Or it might not. This gets into the no­tion of ‘con­stru­ing an ex­tra­pol­ated vo­li­tion’ which would be, again, a sep­ar­ate post.

But if you were step­ping out­side the hu­man and hop­ing for moral ar­gu­ments that would per­suade any pos­sible mind, even a mind that just wanted to max­im­ize the num­ber of pa­per­clips in the uni­verse, then sorry—the space of pos­sible mind designs is too large to per­mit uni­ver­sally com­pel­ling ar­gu­ments. You are bet­ter off treat­ing your in­tu­ition that your moral ar­gu­ments ought to per­suade oth­ers, as ap­ply­ing only to other hu­mans who are more or less neur­o­lo­gic­ally in­tact. Try­ing it on hu­man psy­cho­paths would be dan­ger­ous, yet per­haps pos­sible. But a pa­per­clip max­im­izer is just not the sort of mind that would be moved by a moral ar­gu­ment. (This will def­in­itely be a sep­ar­ate post.)

Once, in my wild and reck­less youth, I tried du­ti­fully—I thought it was my duty—to be ready and will­ing to fol­low the dic­tates of a great light in the sky, an ex­ternal ob­ject­ive mor­al­ity, when I dis­covered it. I ques­tioned everything, even al­tru­ism to­ward hu­man lives, even the value of hap­pi­ness. Fin­ally I real­ized that there was no found­a­tion but hu­man­ity—no evid­ence point­ing to even a reas­on­able doubt that there was any­thing else—and in­deed I shouldn’t even want to hope for any­thing else—and in­deed would have no moral cause to fol­low the dic­tates of a light in the sky, even if I found one.

I didn’t get back im­me­di­ately all the pieces of my­self that I had tried to de­prec­ate—it took time for the real­iz­a­tion “There is noth­ing else” to sink in. The no­tion that hu­man­ity could just… you know… live and have fun… seemed much too good to be true, so I mis­trus­ted it. But even­tu­ally, it sank in that there really was noth­ing else to take the place of beauty. And then I got it back.

So you see, it all really does add up to moral nor­mal­ity, very ex­actly in fact. You go on with the same mor­als as be­fore, and the same moral ar­gu­ments as be­fore. There is no sud­den Grand Over­lord Pro­ced­ure to which you can ap­peal to get a per­fectly trust­worthy an­swer. You don’t know, can­not print out, the great right­ness-func­tion; and even if you could, you would not have enough com­pu­ta­tional power to search the en­tire spe­cified space of ar­gu­ments that might move you. You will just have to ar­gue it out.

I sus­pect that a fair num­ber of those who pro­pound metaethics do so in or­der to have it add up to some new and un­usual moral—else why would they bother? In my case, I bother be­cause I am a Friendly AI pro­gram­mer and I have to make a phys­ical sys­tem out­side my­self do what’s right; for which pur­pose metaethics be­comes very im­port­ant in­deed. But for the most part, the ef­fect of my proffered metaethic is threefold:

  • Anyone wor­ried that re­duc­tion­ism drains the mean­ing from ex­ist­ence can stop wor­ry­ing;

  • Anyone who was re­ject­ing parts of their hu­man ex­ist­ence based on strange metaethics—i.e., “Why should I care about oth­ers, if that doesn’t help me max­im­ize my in­clus­ive ge­netic fit­ness?“—can wel­come back all the parts of them­selves that they once ex­iled.

  • You can stop ar­guing about metaethics, and go back to whatever or­din­ary moral ar­gu­ment you were hav­ing be­fore then. This know­ledge will help you avoid metaethical mis­takes that mess up moral ar­gu­ments, but you can’t ac­tu­ally use it to settle de­bates un­less you can build a Friendly AI.

And, oh yes—why is it right to save a child’s life?

Well… you could ask “Is this event that just happened, right?” and find that the child had sur­vived, in which case you would have dis­covered the nonob­vi­ous em­pir­ical fact about the world, that it had come out right.

Or you could start out already know­ing a com­plic­ated state of the world, but still have to ap­ply the right­ness-func­tion to it in a non­trivial way—one in­volving a com­plic­ated moral ar­gu­ment, or ex­tra­pol­at­ing con­sequences into the fu­ture—in which case you would learn the nonob­vi­ous lo­gical /​ com­pu­ta­tional fact that right­ness, ap­plied to this situ­ation, yiel­ded thumbs-up.

In both these cases, there are nonob­vi­ous facts to learn, which seem to ex­plain why what just happened is right.

But if you ask “Why is it good to be happy?” and then re­place the sym­bol ‘good’ with what it stands for, you’ll end up with a ques­tion like “Why does hap­pi­ness match {hap­pi­ness + sur­vival + justice + in­di­vidu­al­ity + …}?” This gets com­puted so fast, that it scarcely seems like there’s any­thing there to be ex­plained. It’s like ask­ing “Why does 4 = 4?” in­stead of “Why does 2 + 2 = 4?”

Now, I bet that feels quite a bit like what hap­pens when I ask you: “Why is hap­pi­ness good?”

Right?

And that’s also my an­swer to Moore’s Open Ques­tion. Why is this big func­tion I’m talk­ing about, right? Be­cause when I say “that big func­tion”, and you say “right”, we are derefer­en­cing two dif­fer­ent point­ers to the same un­verb­al­iz­able ab­stract com­pu­ta­tion. I mean, that big func­tion I’m talk­ing about, hap­pens to be the same thing that la­bels things right in your own brain. You might re­flect on the pieces of the quo­ta­tion of the big func­tion, but you would start out by us­ing your sense of right-ness to do it. If you had the per­fect em­pir­ical know­ledge to ta­boo both “that big func­tion” and “right”, sub­sti­tute what the point­ers stood for, and write out the full enorm­ity of the res­ult­ing sen­tence, it would come out as… sorry, I can’t res­ist this one… A=A.

Part of The Metaethics Sequence

Next post: “Interpersonal Morality”

Previous post: “Setting Up Metaethics”