By Which It May Be Judged

Fol­lowup to: Mixed Refer­ence: The Great Re­duc­tion­ist Project

Hu­mans need fan­tasy to be hu­man.

“Tooth fairies? Hog­fathers? Lit­tle—”

Yes. As prac­tice. You have to start out learn­ing to be­lieve the lit­tle lies.

“So we can be­lieve the big ones?”

Yes. Jus­tice. Mercy. Duty. That sort of thing.

“They’re not the same at all!”

You think so? Then take the uni­verse and grind it down to the finest pow­der and sieve it through the finest sieve and then show me one atom of jus­tice, one molecule of mercy.

- Su­san and Death, in Hog­father by Terry Pratchett

Suppose three people find a pie—that is, three people exactly simultaneously spot a pie which has been exogenously generated in unclaimed territory. Zaire wants the entire pie; Yancy thinks that 1/3 each is fair; and Xannon thinks that fair would be taking into equal account everyone’s ideas about what is “fair”.

I myself would say unhesitatingly that a third of the pie each, is fair. “Fairness”, as an ethical concept, can get a lot more complicated in more elaborate contexts. But in this simple context, a lot of other things that “fairness” could depend on, like work inputs, have been eliminated or made constant. Assuming no relevant conditions other than those already stated, “fairness” simplifies to the mathematical procedure of splitting the pie into equal parts; and when this logical function is run over physical reality, it outputs “1/3 for Zaire, 1/3 for Yancy, 1/3 for Xannon”.

Or to put it another way—just like we get “If Oswald hadn’t shot Kennedy, nobody else would’ve” by running a logical function over a true causal model—similarly, we can get the hypothetical ‘fair’ situation, whether or not it actually happens, by running the physical starting scenario through a logical function that describes what a ‘fair’ outcome would look like.
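A minimal sketch of that picture, in Python; the function name and the way the scenario is encoded are my own illustrative choices, not anything from the post:

```python
# A toy "fairness" function: given a physical scenario (a pie of some size
# and a list of claimants with no other morally relevant differences),
# it returns what an equal split would be, whether or not that split
# ever actually happens in the physical world.
def fair_split(pie_size, claimants):
    share = pie_size / len(claimants)
    return {person: share for person in claimants}

print(fair_split(1.0, ["Zaire", "Yancy", "Xannon"]))
# {'Zaire': 0.333..., 'Yancy': 0.333..., 'Xannon': 0.333...}
```

The physical facts (one pie, three people) go in; the logical function, not anyone’s say-so, determines what comes out.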

So am I (as Zaire would claim) just as­sum­ing-by-au­thor­ity that I get to have ev­ery­thing my way, since I’m not defin­ing ‘fair­ness’ the way Zaire wants to define it?

No more than math­e­mat­i­ci­ans are flatly or­der­ing ev­ery­one to as­sume-with­out-proof that two differ­ent num­bers can’t have the same suc­ces­sor. For fair­ness to be what ev­ery­one thinks is “fair” would be en­tirely cir­cu­lar, struc­turally iso­mor­phic to “Fzeem is what ev­ery­one thinks is fzeem”… or like try­ing to define the count­ing num­bers as “what­ever any­one thinks is a num­ber”. It only even looks co­her­ent be­cause ev­ery­one se­cretly already has a men­tal pic­ture of “num­bers”—be­cause their brain already nav­i­gated to the refer­ent. But some­thing akin to ax­ioms is needed to talk about “num­bers, as op­posed to some­thing else” in the first place. Even an in­choate men­tal image of “0, 1, 2, …” im­plies the ax­ioms no less than a for­mal state­ment—we can ex­tract the ax­ioms back out by ask­ing ques­tions about this rough men­tal image.
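For concreteness, the axioms that the mental image of “0, 1, 2, …” commits you to can be written out explicitly; a standard first-order rendering of the successor axioms plus induction (my addition, not the post’s) looks like:

$$\forall n.\ S(n) \neq 0 \qquad\qquad \forall m\,\forall n.\ \big(S(m) = S(n) \rightarrow m = n\big)$$

$$\big(\varphi(0) \wedge \forall n.\,(\varphi(n) \rightarrow \varphi(S(n)))\big) \rightarrow \forall n.\ \varphi(n) \quad \text{for each formula } \varphi$$

The second line is the induction schema, one instance per formula; together these pin down “numbers, as opposed to something else” in just the sense described above.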

Similarly, the in­tu­ition that fair­ness has some­thing to do with di­vid­ing up the pie equally, plays a role akin to se­cretly already hav­ing “0, 1, 2, …” in mind as the sub­ject of math­e­mat­i­cal con­ver­sa­tion. You need ax­ioms, not as as­sump­tions that aren’t jus­tified, but as poin­t­ers to what the heck the con­ver­sa­tion is sup­posed to be about.

Mul­ti­ple philoso­phers have sug­gested that this stance seems similar to “rigid des­ig­na­tion”, i.e., when I say ‘fair’ it in­trin­si­cally, rigidly refers to some­thing-to-do-with-equal-di­vi­sion. I con­fess I don’t see it that way my­self—if some­body thinks of Eu­clidean ge­om­e­try when you ut­ter the sound “num-berz” they’re not do­ing any­thing false, they’re as­so­ci­at­ing the sound to a differ­ent log­i­cal thingy. It’s not about words with in­trin­si­cally rigid refer­en­tial power, it’s that the words are win­dow dress­ing on the un­der­ly­ing en­tities. I want to talk about a par­tic­u­lar log­i­cal en­tity, as it might be defined by ei­ther ax­ioms or in­choate images, re­gard­less of which word-sounds may be as­so­ci­ated to it. If you want to call that “rigid des­ig­na­tion”, that seems to me like adding a level of in­di­rec­tion; I don’t care about the word ‘fair’ in the first place, I care about the log­i­cal en­tity of fair­ness. (Or to put it even more sharply: since my on­tol­ogy does not have room for physics, logic, plus des­ig­na­tion, I’m not very in­ter­ested in dis­cussing this ‘rigid des­ig­na­tion’ busi­ness un­less it’s be­ing re­duced to some­thing else.)

Once is­sues of jus­tice be­come more com­pli­cated and all the con­tex­tual vari­ables get added back in, we might not be sure if a dis­agree­ment about ‘fair­ness’ re­flects:

  1. The equiv­a­lent of a mul­ti­pli­ca­tion er­ror within the same ax­ioms—in­cor­rectly di­vid­ing by 3. (Or more com­pli­cat­edly: You might have a so­phis­ti­cated ax­io­matic con­cept of ‘equity’, and in­cor­rectly pro­cess those ax­ioms to in­val­idly yield the as­ser­tion that, in a con­text where 2 of the 3 must starve and there’s only enough pie for at most 1 per­son to sur­vive, you should still di­vide the pie equally in­stead of flip­ping a 3-sided coin. Where I’m as­sum­ing that this con­clu­sion is ‘in­cor­rect’, not be­cause I dis­agree with it, but be­cause it didn’t ac­tu­ally fol­low from the ax­ioms.)

  2. Mis­taken mod­els of the phys­i­cal world fed into the func­tion—mis­tak­enly think­ing there’s 2 pies, or mis­tak­enly think­ing that Zaire has no sub­jec­tive ex­pe­riences and is not an ob­ject of eth­i­cal value.

  3. Peo­ple as­so­ci­at­ing differ­ent log­i­cal func­tions to the let­ters F-A-I-R, which isn’t a dis­agree­ment about some com­mon pin­pointed vari­able, but just differ­ent peo­ple want­ing differ­ent things.

There’s a lot of peo­ple who feel that this pic­ture leaves out some­thing fun­da­men­tal, es­pe­cially once we make the jump from “fair” to the broader con­cept of “moral”, “good”, or “right”. And it’s this worry about leav­ing-out-some­thing-fun­da­men­tal that I hope to ad­dress next...

...but please note, if we con­fess that ‘right’ lives in a world of physics and logic—be­cause ev­ery­thing lives in a world of physics and logic—then we have to trans­late ‘right’ into those terms some­how.

And that is the an­swer Su­san should have given—if she could talk about suffi­ciently ad­vanced episte­mol­ogy, suffi­ciently fast—to Death’s en­tire state­ment:

You think so? Then take the uni­verse and grind it down to the finest pow­der and sieve it through the finest sieve and then show me one atom of jus­tice, one molecule of mercy. And yet — Death waved a hand. And yet you act as if there is some ideal or­der in the world, as if there is some … right­ness in the uni­verse by which it may be judged.

“But!” Su­san should’ve said. “When we judge the uni­verse we’re com­par­ing it to a log­i­cal refer­ent, a sort of thing that isn’t in the uni­verse! Why, it’s just like look­ing at a heap of 2 ap­ples and a heap of 3 ap­ples on a table, and com­par­ing their in­visi­ble product to the num­ber 6 - there isn’t any 6 if you grind up the whole table, even if you grind up the whole uni­verse, but the product is still 6, physico-log­i­cally speak­ing.”


If you re­quire that Right­ness be writ­ten on some par­tic­u­lar great Stone Tablet some­where—to be “a light that shines from the sky”, out­side peo­ple, as a differ­ent Terry Pratch­ett book put it—then in­deed, there’s no such Stone Tablet any­where in our uni­verse.

But there shouldn’t be such a Stone Tablet, given standard intuitions about morality. This follows from the Euthyphro Dilemma out of ancient Greece.

The original Euthyphro dilemma goes, “Is it pious because it is loved by the gods, or loved by the gods because it is pious?” The religious version goes, “Is it good because it is commanded by God, or does God command it because it is good?”

The stan­dard athe­ist re­ply is: “Would you say that it’s an in­trin­si­cally good thing—even if the event has no fur­ther causal con­se­quences which are good—to slaugh­ter ba­bies or tor­ture peo­ple, if that’s what God says to do?”

If we can’t make it good to slaugh­ter ba­bies by tweak­ing the state of God, then moral­ity doesn’t come from God; so goes the stan­dard athe­ist ar­gu­ment.

But if you can’t make it good to slaugh­ter ba­bies by tweak­ing the phys­i­cal state of any­thing—if we can’t imag­ine a world where some great Stone Tablet of Mo­ral­ity has been phys­i­cally rewrit­ten, and what is right has changed—then this is tel­ling us that...

(drum­roll)

...what’s “right” is a log­i­cal thingy rather than a phys­i­cal thingy, that’s all. The mark of a log­i­cal val­idity is that we can’t con­cretely vi­su­al­ize a co­her­ent pos­si­ble world where the propo­si­tion is false.

And I men­tion this in hopes that I can show that it is not moral anti-re­al­ism to say that moral state­ments take their truth-value from log­i­cal en­tities. Even in An­cient Greece, philoso­phers im­plic­itly knew that ‘moral­ity’ ought to be such an en­tity—that it couldn’t be some­thing you found when you ground the Uni­verse to pow­der, be­cause then you could re­sprin­kle the pow­der and make it won­der­ful to kill ba­bies—though they didn’t know how to say what they knew.


There’s a lot of peo­ple who still feel that Death would be right, if the uni­verse were all phys­i­cal; that the kind of dry log­i­cal en­tity I’m de­scribing here, isn’t suffi­cient to carry the bright al­ive feel­ing of good­ness.

And there are oth­ers who ac­cept that physics and logic is ev­ery­thing, but who—I think mis­tak­enly—go ahead and also ac­cept Death’s stance that this makes moral­ity a lie, or, in lesser form, that the bright al­ive feel­ing can’t make it. (Sort of like peo­ple who ac­cept an in­com­pat­i­bil­ist the­ory of free will, also ac­cept physics, and con­clude with sor­row that they are in­deed be­ing con­trol­led by physics.)

In case any­one is bored that I’m still try­ing to fight this bat­tle, well, here’s a quote from a re­cent Face­book con­ver­sa­tion with a fa­mous early tran­shu­man­ist:

No doubt a “crip­pled” AI that didn’t un­der­stand the ex­is­tence or na­ture of first-per­son facts could be non­friendly to­wards sen­tient be­ings… Only a zom­bie wouldn’t value Heaven over Hell. For rea­sons we sim­ply don’t un­der­stand, the nega­tive value and nor­ma­tive as­pect of agony and de­spair is built into the na­ture of the ex­pe­rience it­self. Non-re­duc­tion­ist? Yes, on a stan­dard ma­te­ri­al­ist on­tol­ogy. But not IMO within a more defen­si­ble Straw­so­nian phys­i­cal­ism.

It would ac­tu­ally be quite sur­pris­ingly helpful for in­creas­ing the per­centage of peo­ple who will par­ti­ci­pate mean­ingfully in sav­ing the planet, if there were some re­li­ably-work­ing stan­dard ex­pla­na­tion for why physics and logic to­gether have enough room to con­tain moral­ity. Peo­ple who think that re­duc­tion­ism means we have to lie to our chil­dren, as Pratch­ett’s Death ad­vo­cates, won’t be much en­thused about the Cen­ter for Ap­plied Ra­tion­al­ity. And there are a fair num­ber of peo­ple out there who still ad­vo­cate pro­ceed­ing in the con­fi­dence of in­ef­fable moral­ity to con­struct slop­pily de­signed AIs.

So far I don’t know of any exposition that works reliably—for the thesis of how morality, including our intuitions about whether things really are justified and so on, is preserved in the analysis to physics plus logic; that morality has been explained rather than explained away. Nonetheless I shall now take another stab at it, starting with a simpler bright feeling:


When I see an un­usu­ally neat math­e­mat­i­cal proof, un­ex­pect­edly short or sur­pris­ingly gen­eral, my brain gets a joy­ous sense of el­e­gance.

There’s pre­sum­ably some func­tional slice through my brain that im­ple­ments this emo­tion—some con­figu­ra­tion sub­space of spik­ing neu­ral cir­cuitry which cor­re­sponds to my feel­ing of el­e­gance. Per­haps I should say that el­e­gance is merely about my brain switch­ing on its el­e­gance-sig­nal? But there are con­cepts like Kol­mogorov com­plex­ity that give more for­mal mean­ings of “sim­ple” than “Sim­ple is what­ever makes my brain feel the emo­tion of sim­plic­ity.” Any­thing you do to fool my brain wouldn’t make the proof re­ally el­e­gant, not in that sense. The emo­tion is not free of se­man­tic con­tent; we could build a cor­re­spon­dence the­ory for it and nav­i­gate to its log­i­cal+phys­i­cal refer­ent, and say: “Sarah feels like this proof is el­e­gant, and her feel­ing is true.” You could even say that cer­tain proofs are el­e­gant even if no con­scious agent sees them.
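The appeal to Kolmogorov complexity can be made concrete with a crude stand-in; this sketch is my own illustration, not from the original post. True Kolmogorov complexity is uncomputable, but the length of a lossless compression gives a computable upper bound on “length of the shortest description”, which is enough to show that ‘simple’ can mean something besides “my brain emits the simplicity-signal”:

```python
import random
import zlib

random.seed(0)
patterned = b"ab" * 500                                    # 1000 bytes with an obvious regularity
noise = bytes(random.randrange(256) for _ in range(1000))  # 1000 bytes of pseudorandom junk

# Compressed length is a rough, observer-independent proxy for
# "length of the shortest description" of each string.
print(len(zlib.compress(patterned)))   # small: the repetition admits a short description
print(len(zlib.compress(noise)))       # roughly 1000: no short description was found
```

The point is only that there are observer-independent handles on “short description”, in the same way the paragraph above wants observer-independent handles on “elegant”.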

My de­scrip­tion of ‘el­e­gance’ ad­mit­tedly did in­voke agent-de­pen­dent con­cepts like ‘un­ex­pect­edly’ short or ‘sur­pris­ingly’ gen­eral. It’s al­most cer­tainly true that with a differ­ent math­e­mat­i­cal back­ground, I would have differ­ent stan­dards of el­e­gance and ex­pe­rience that feel­ing on some­what differ­ent oc­ca­sions. Even so, that still seems like mov­ing around in a field of similar refer­ents for the emo­tion—much more similar to each other than to, say, the dis­tant cluster of ‘anger’.

Rewiring my brain so that the ‘el­e­gance’ sen­sa­tion gets ac­ti­vated when I see math­e­mat­i­cal proofs where the words have lots of vow­els—that wouldn’t change what is el­e­gant. Rather, it would make the feel­ing be about some­thing else en­tirely; differ­ent se­man­tics with a differ­ent truth-con­di­tion.

In­deed, it’s not clear that this thought ex­per­i­ment is, or should be, re­ally con­ceiv­able. If all the as­so­ci­ated com­pu­ta­tion is about vow­els in­stead of el­e­gance, then from the in­side you would ex­pect that to feel vow­elly, not feel el­e­gant...

...which is to say that even feel­ings can be as­so­ci­ated with log­i­cal en­tities. Though un­for­tu­nately not in any way that will feel like qualia if you can’t read your own source code. I could write out an ex­act de­scrip­tion of your vi­sual cor­tex’s spik­ing code for ‘blue’ on pa­per, and it wouldn’t ac­tu­ally look blue to you. Still, on the higher level of de­scrip­tion, it should seem in­tu­itively plau­si­ble that if you tried rewrit­ing the rele­vant part of your brain to count vow­els, the re­sult­ing sen­sa­tion would no longer have the con­tent or even the feel­ing of el­e­gance. It would com­pute vow­elli­ness, and feel vow­elly.


My feel­ing of math­e­mat­i­cal el­e­gance is mo­ti­vat­ing; it makes me more likely to search for similar such proofs later and go on do­ing math. You could con­struct an agent that tried to add more vow­els in­stead, and if the agent asked it­self why it was do­ing that, the re­sult­ing jus­tifi­ca­tion-thought wouldn’t feel like be­cause-it’s-el­e­gant, it would feel like be­cause-it’s-vow­elly.

In the same sense, when you try to do what’s right, you’re mo­ti­vated by things like (to yet again quote Frankena’s list of ter­mi­nal val­ues):

“Life, con­scious­ness, and ac­tivity; health and strength; plea­sures and satis­fac­tions of all or cer­tain kinds; hap­piness, beat­i­tude, con­tent­ment, etc.; truth; knowl­edge and true opinions of var­i­ous kinds, un­der­stand­ing, wis­dom; beauty, har­mony, pro­por­tion in ob­jects con­tem­plated; aes­thetic ex­pe­rience; morally good dis­po­si­tions or virtues; mu­tual af­fec­tion, love, friend­ship, co­op­er­a­tion; just dis­tri­bu­tion of goods and evils; har­mony and pro­por­tion in one’s own life; power and ex­pe­riences of achieve­ment; self-ex­pres­sion; free­dom; peace, se­cu­rity; ad­ven­ture and nov­elty; and good rep­u­ta­tion, honor, es­teem, etc.”

If we re­pro­grammed you to count pa­per­clips in­stead, it wouldn’t feel like differ­ent things hav­ing the same kind of mo­ti­va­tion be­hind it. It wouldn’t feel like do­ing-what’s-right for a differ­ent guess about what’s right. It would feel like do­ing-what-leads-to-pa­per­clips.

And I quoted the above list because the feeling of rightness isn’t about implementing a particular logical function; it contains no mention of logical functions at all; in the environment of evolutionary ancestry nobody had heard of axiomatization; these feelings are about life, consciousness, etcetera. If I could write out the whole truth-condition of the feeling in a way you could compute, you would still feel Moore’s Open Question: “I can see that this event is high-rated by logical function X, but is X really right?”—since you can’t read your own source code and the description wouldn’t be commensurate with your brain’s native format.

“But!” you cry. “But, is it really better to do what’s right, than to maximize paperclips?” Yes! As soon as you start trying to cash out the logical function that gives betterness its truth-value, it will output “life, consciousness, etc. >_B paperclips”. And if your brain were computing a different logical function instead, like makes-more-paperclips, it wouldn’t feel better, it would feel moreclippy.
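A minimal sketch of the “different logical functions” point, with made-up scores; nothing below is meant as a real model of either value system, only as an illustration that two orderings over the same outcomes are simply two different functions, and each agent consults its own:

```python
# Two different logical functions over the same outcomes. The scores are
# invented for illustration; the point is only that the orderings differ,
# and that each agent acts on the function it actually computes.
outcomes = ["preserve life", "maximize paperclips"]

def betterness(outcome):       # crude stand-in for the >_B ordering
    return {"preserve life": 1.0, "maximize paperclips": 0.0}[outcome]

def clippiness(outcome):       # crude stand-in for Clippy's ordering
    return {"preserve life": 0.0, "maximize paperclips": 1.0}[outcome]

print(max(outcomes, key=betterness))   # 'preserve life'
print(max(outcomes, key=clippiness))   # 'maximize paperclips'
# Neither call changes the other function's output; they are different
# logical entities, and neither agent is computing the other's.
```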

But is it re­ally jus­tified to keep our own sense of bet­ter­ness? Sure, and that’s a log­i­cal fact—it’s the ob­jec­tive out­put of the log­i­cal func­tion cor­re­spond­ing to your ex­pe­ri­en­tial sense of what it means for some­thing to be ‘jus­tified’ in the first place. This doesn’t mean that Clippy the Paper­clip Max­i­mizer will self-mod­ify to do only things that are jus­tified; Clippy doesn’t judge be­tween self-mod­ifi­ca­tions by com­put­ing jus­tifi­ca­tions, but rather, com­put­ing clip­pyflurphs.

But isn’t it ar­bi­trary for Clippy to max­i­mize pa­per­clips? In­deed; once you im­plic­itly or ex­plic­itly pin­point the log­i­cal func­tion that gives judg­ments of ar­bi­trari­ness their truth-value—pre­sum­ably, re­volv­ing around the pres­ence or ab­sence of jus­tifi­ca­tions—then this log­i­cal func­tion will ob­jec­tively yield that there’s no jus­tifi­ca­tion what­so­ever for max­i­miz­ing pa­per­clips (which is why I’m not go­ing to do it) and hence that Clippy’s de­ci­sion is ar­bi­trary. Con­versely, Clippy finds that there’s no clip­pyflurph for pre­serv­ing life, and hence that it is un­clip­per­iffic. But un­clip­per­iffic­ness isn’t ar­bi­trari­ness any more than the num­ber 17 is a right tri­an­gle; they’re differ­ent log­i­cal en­tities pinned down by differ­ent ax­ioms, and the cor­re­spond­ing judg­ments will have differ­ent se­man­tic con­tent and feel differ­ent. If Clippy is ar­chi­tected to ex­pe­rience that-which-you-call-qualia, Clippy’s feel­ing of clip­pyflurph will be struc­turally differ­ent from the way jus­tifi­ca­tion feels, not just red ver­sus blue, but vi­sion ver­sus sound.

But surely one shouldn’t praise the clip­pyflur­phers rather than the just? I quite agree; and as soon as you nav­i­gate refer­en­tially to the co­her­ent log­i­cal en­tity that is the truth-con­di­tion of should—a func­tion on po­ten­tial ac­tions and fu­ture states—it will agree with you that it’s bet­ter to avoid the ar­bi­trary than the un­clip­per­iffic. Un­for­tu­nately, this log­i­cal fact does not cor­re­spond to the truth-con­di­tion of any mean­ingful propo­si­tion com­puted by Clippy in the course of how it effi­ciently trans­forms the uni­verse into pa­per­clips, in much the same way that right­ness plays no role in that-which-is-max­i­mized by the blind pro­cesses of nat­u­ral se­lec­tion.

Where moral judgment is concerned, it’s logic all the way down. ALL the way down. Any frame of reference where you’re worried that it’s really no better to do what’s right than to maximize paperclips… well, that ‘really’ part has a truth-condition (or what does the “really” mean?) and as soon as you write out the truth-condition you’re going to end up with yet another ordering over actions or algorithms or meta-algorithms or something. And since grinding up the universe won’t and shouldn’t yield any miniature ‘>’ tokens, it must be a logical ordering. And so whatever logical ordering it is you’re worried about, it probably does produce ‘life > paperclips’ - but Clippy isn’t computing that logical fact any more than your pocket calculator is computing it.

Log­i­cal facts have no power to di­rectly af­fect the uni­verse ex­cept when some part of the uni­verse is com­put­ing them, and moral­ity is (and should be) logic, not physics.

Which is to say:

The old wiz­ard was star­ing at him, a sad look in his eyes. “I sup­pose I do un­der­stand now,” he said quietly.

“Oh?” said Harry. “Un­der­stand what?”

“Volde­mort,” said the old wiz­ard. “I un­der­stand him now at last. Be­cause to be­lieve that the world is truly like that, you must be­lieve there is no jus­tice in it, that it is wo­ven of dark­ness at its core. I asked you why he be­came a mon­ster, and you could give no rea­son. And if I could ask him, I sup­pose, his an­swer would be: Why not?”

They stood there gaz­ing into each other’s eyes, the old wiz­ard in his robes, and the young boy with the light­ning-bolt scar on his fore­head.

“Tell me, Harry,” said the old wiz­ard, “will you be­come a mon­ster?”

“No,” said the boy, an iron cer­tainty in his voice.

“Why not?” said the old wiz­ard.

The young boy stood very straight, his chin raised high and proud, and said: “There is no justice in the laws of Nature, Headmaster, no term for fairness in the equations of motion. The universe is neither evil, nor good, it simply does not care. The stars don’t care, or the Sun, or the sky. But they don’t have to! We care! There is light in the world, and it is us!”

Part of the se­quence Highly Ad­vanced Episte­mol­ogy 101 for Beginners

Next post: “Standard and Nonstandard Numbers”

Previous post: “Mixed Reference: The Great Reductionist Project”