Moral Error and Moral Disagreement

Followup to: Inseparably Right, Sorting Pebbles Into Correct Heaps

Richard Chappell, a pro, writes:

“When Bob says “Abortion is wrong”, and Sally says, “No it isn’t”, they are disagreeing with each other.

I don’t see how Eliezer can accommodate this. On his account, what Bob asserted is true iff abortion is prohibited by the morality_Bob norms. How can Sally disagree? There’s no disputing (we may suppose) that abortion is indeed prohibited by morality_Bob...

Since there is moral disagreement, whatever Eliezer purports to be analysing here, it is not morality.”

The phenomena of moral disagreement, moral error, and moral progress, on terminal values, are the primary drivers behind my metaethics. Think of how simple Friendly AI would be if there were no moral disagreements, moral errors, or moral progress!

Richard claims, “There’s no disputing (we may suppose) that abortion is indeed prohibited by morality_Bob.”

We may not suppose, and there is disputing. Bob does not have direct, unmediated, veridical access to the output of his own morality.

I tried to describe morality as a “computation”. In retrospect, I don’t think this is functioning as the Word of Power that I thought I was emitting.

Let us read, for “computation”, “idealized abstract dynamic”—maybe that will be a more comfortable label to apply to morality.

Even so, I would have thought it obvious that computations may be the subjects of mystery and error. Maybe it’s not as obvious outside computer science?
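The point that a computation can be compactly specified and yet opaque to its own possessor is easy to illustrate. Here is a minimal sketch, using the Collatz rule purely as an illustrative stand-in (the function name and the rule are my choice for this example, not anything from the post): the whole specification fits in a few lines, yet its output on a given input is a matter of genuine mystery until the computation is actually unfolded.

```python
def collatz_steps(n: int) -> int:
    """Count the steps for n to reach 1 under the Collatz rule:
    halve if even, else triple and add one."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

# The rule is two lines long, but few people could guess its output
# on 27 just by inspecting the rule -- you can "have" the computation
# without having direct access to what it outputs.
print(collatz_steps(27))  # 111
```

Knowing the folded specification, in other words, is not the same as knowing the unfolded output; error and surprise remain possible in between.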

Disagreement has two prerequisites: the possibility of agreement and the possibility of error. For two people to agree on something, there must be something they are agreeing about, a referent held in common. And it must be possible for an “error” to take place, a conflict between “P” in the map and not-P in the territory. Where these two prerequisites are present, Sally can say to Bob: “That thing we were just both talking about—you are in error about it.”

Richard’s objection would seem in the first place to rule out the possibility of moral error, from which he derives the impossibility of moral agreement.

So: does my metaethics rule out moral error? Is there no disputing that abortion is indeed prohibited by morality_Bob?

This is such a strange idea that I find myself wondering what the heck Richard could be thinking. My best guess is that Richard, perhaps having not read all the posts in this sequence, is taking my notion of morality_Bob to refer to a flat, static list of valuations explicitly asserted by Bob. “Abortion is wrong” would be on Bob’s list, and there would be no disputing that.

But on the contrary, I conceive of morality_Bob as something that unfolds into Bob’s morality—like the way one can describe in 6 states and 2 symbols a Turing machine that will write 4.640 × 10^1439 1s to its tape before halting.

So morality_Bob refers to a compact folded specification, and not a flat list of outputs. But still, how could Bob be wrong about the output of his own morality?

In manifold obvious and non-obvious ways:

Bob could be empirically mistaken about the state of fetuses, perhaps believing fetuses to be aware of the outside world. (Correcting this might change Bob’s instrumental values but not terminal values.)

Bob could have formed his beliefs about what constituted “personhood” in the presence of confusion about the nature of consciousness, so that if Bob were fully informed about consciousness, Bob would not have been tempted to talk about “the beginning of life” or “the human kind” in order to define personhood. (This changes Bob’s expressed terminal values; afterward he will state different general rules about what sort of physical things are ends in themselves.)

So those are the obvious moral errors—instrumental errors driven by empirical mistakes; and erroneous generalizations about terminal values, driven by failure to consider moral arguments that are valid but hard to find in the search space.

Then there are less obvious sources of moral error: Bob could have a list of mind-influencing considerations that he considers morally valid, and a list of other mind-influencing considerations that Bob considers morally invalid. Maybe Bob was raised a Christian and now considers that cultural influence to be invalid. But, unknown to Bob, when he weighs up his values for and against abortion, the influence of his Christian upbringing comes in and distorts his summing of value-weights. So Bob believes that the output of his current validated moral beliefs is to prohibit abortion, but actually this is a leftover of his childhood and not the output of those beliefs at all.

(Note that Robin Hanson and I seem to disagree, in a case like this, as to exactly what degree we should take Bob’s word about what his morals are.)

Or Bob could believe that the word of God determines moral truth and that God has prohibited abortion in the Bible. Then Bob is making metaethical mistakes, causing his mind to malfunction in a highly general way, and to add moral generalizations to his belief pool that he would not hold if veridical knowledge of the universe destroyed his current, incoherent metaethics.

Now let us turn to the disagreement between Sally and Bob.

You could suggest that Sally is saying to Bob, “Abortion is allowed by morality_Bob”, but that seems a bit oversimplified; it is not psychologically or morally realistic.

If Sally and Bob were unrealistically sophisticated, they might describe their dispute as follows:

Bob: “Abortion is wrong.”

Sally: “Do you think that this is something of which most humans ought to be persuadable?”

Bob: “Yes, I do. Do you think abortion is right?”

Sally: “Yes, I do. And I don’t think that’s because I’m a psychopath by common human standards. I think most humans would come to agree with me, if they knew the facts I knew, and heard the same moral arguments I’ve heard.”

Bob: “I think, then, that we must have a moral disagreement: since we both believe ourselves to be in a shared moral frame of reference on this issue, and yet our moral intuitions say different things to us.”

Sally: “Well, it is not logically necessary that we have a genuine disagreement. We might be mistaken in believing ourselves to mean the same thing by the words right and wrong, since neither of us can introspectively report our own moral reference frames or unfold them fully.”

Bob: “But if the meaning is similar up to the third decimal place, or sufficiently similar in some respects that it ought to be delivering similar answers on this particular issue, then, even if our moralities are not in-principle identical, I would not hesitate to invoke the intuitions for transpersonal morality.”

Sally: “I agree. Until proven otherwise, I am inclined to talk about this question as if it is the same question unto us.”

Bob: “So I say ‘Abortion is wrong’ without further qualification or specialization on what wrong means unto me.”

Sally: “And I think that abortion is right. We have a disagreement, then, and at least one of us must be mistaken.”

Bob: “Unless we’re actually choosing differently because of in-principle unresolvable differences in our moral frame of reference, as if one of us were a paperclip maximizer. In that case, we would be mutually mistaken in our belief that when we talk about doing what is right, we mean the same thing by right. We would agree that we have a disagreement, but we would both be wrong.”

Now, this is not exactly what most people are explicitly thinking when they engage in a moral dispute—but it is how I would cash out and naturalize their intuitions about transpersonal morality.

Richard also says, “Since there is moral disagreement...” This seems like a prime case of what I call naive philosophical realism—the belief that philosophical intuitions are direct unmediated veridical passports to philosophical truth.

It so happens that I agree that there is such a thing as moral disagreement. Tomorrow I will endeavor to justify, in fuller detail, how this statement can possibly make sense in a reductionistic natural universe. So I am not disputing this particular proposition. But I note, in passing, that Richard cannot justifiably assert the existence of moral disagreement as an irrefutable premise for discussion, though he could consider it as an apparent datum. You cannot take as irrefutable premises things that you have not explained exactly; for then what is it that is certain to be true?

I cannot help but note the resemblance to Richard’s assumption that “there’s no disputing” that abortion is indeed prohibited by morality_Bob—the assumption that Bob has direct veridical unmediated access to the final unfolded output of his own morality.

Perhaps Richard means that we could suppose that abortion is indeed prohibited by morality_Bob, and allowed by morality_Sally, there being at least two possible minds for whom this would be true. Then the two minds might be mistaken about believing themselves to disagree. Actually they would simply be directed by different algorithms.

You cannot have a disagreement about which algorithm should direct your actions, without first having the same meaning of should—and no matter how you try to phrase this in terms of “what ought to direct your actions” or “right actions” or “correct heaps of pebbles”, in the end you will be left with the empirical fact that it is possible to construct minds directed by any coherent utility function.

When a paperclip maximizer and a pencil maximizer do different things, they are not disagreeing about anything; they are just different optimization processes. You cannot detach should-ness from any specific criterion of should-ness and be left with a pure empty should-ness that the paperclip maximizer and pencil maximizer can be said to disagree about—unless you stretch “disagreement” to include differences where two agents have nothing to say to each other.
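The paperclip/pencil point can be made concrete in a few lines. In this sketch (the world model, utility functions, and names are all invented for illustration), two optimizers evaluate the same set of actions against the same world and select differently—yet at no point is there any shared proposition on which their outputs conflict.

```python
# Two optimization processes over the same world: different utility
# functions, different chosen actions, no proposition in dispute.
# All names here are illustrative, not from any real system.

def choose(utility, actions):
    """Select the action that maximizes the given utility function."""
    return max(actions, key=utility)

# A toy world: each action's consequences, as counts of objects produced.
world = {
    "make_paperclip": {"paperclips": 1, "pencils": 0},
    "make_pencil":    {"paperclips": 0, "pencils": 1},
}
actions = list(world)

paperclip_utility = lambda a: world[a]["paperclips"]
pencil_utility    = lambda a: world[a]["pencils"]

print(choose(paperclip_utility, actions))  # make_paperclip
print(choose(pencil_utility, actions))     # make_pencil
# Both outputs are correct relative to their own criterion; there is
# no common question "which action is really best?" left over once
# each criterion is specified.
```

Each agent’s answer is simply the output of its own criterion applied to the world; strip away both criteria and there is no residual “should” left for them to disagree about.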

But this would be an extreme position to take with respect to your fellow humans, and I recommend against doing so. Even a psychopath would still be in a common moral reference frame with you, if, fully informed, they would decide to take a pill that would make them non-psychopaths. If you told me that my ability to care about other people was neurologically damaged, and you offered me a pill to fix it, I would take it. Now, perhaps some psychopaths would not be persuadable in-principle to take the pill that would, by our standards, “fix” them. But I note the possibility to emphasize what an extreme statement it is to say of someone:

“We have nothing to argue about, we are only different optimization processes.”

That should be reserved for paperclip maximizers, not used against humans whose arguments you don’t like.

Part of The Metaethics Sequence

Next post: “Abstracted Idealized Dynamics”

Previous post: “Sorting Pebbles Into Correct Heaps”