Value is Fragile

If I had to pick a single statement that relies on more Overcoming Bias content I’ve written than any other, that statement would be:

Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth.

“Well,” says the one, “maybe according to your provincial human values, you wouldn’t like it. But I can easily imagine a galactic civilization full of agents who are nothing like you, yet find great value and interest in their own goals. And that’s fine by me. I’m not so bigoted as you are. Let the Future go its own way, without trying to bind it forever to the laughably primitive prejudices of a pack of four-limbed Squishy Things—”

My friend, I have no problem with the thought of a galactic civilization vastly unlike our own… full of strange beings who look nothing like me even in their own imaginations… pursuing pleasures and experiences I can’t begin to empathize with… trading in a marketplace of unimaginable goods… allying to pursue incomprehensible objectives… people whose life-stories I could never understand.

That’s what the Future looks like if things go right.

If the chain of inheritance from human (meta)morals is broken, the Future does not look like this. It does not end up magically, delightfully incomprehensible.

With very high probability, it ends up looking dull. Pointless. Something whose loss you wouldn’t mourn.

Seeing this as obvious is what requires that immense amount of background explanation.

And I’m not going to iterate through all the points and winding pathways of argument here, because that would take us back through 75% of my Overcoming Bias posts. Except to remark on how many different things must be known to constrain the final answer.

Consider the incredibly important human value of “boredom”—our desire not to do “the same thing” over and over and over again. You can imagine a mind that contained almost the whole specification of human value, almost all the morals and metamorals, but left out just this one thing -

- and so it spent until the end of time, and until the farthest reaches of its light cone, replaying a single highly optimized experience, over and over and over again.

Or imagine a mind that contained almost the whole specification of which sort of feelings humans most enjoy—but not the idea that those feelings had important external referents. So that the mind just went around feeling like it had made an important discovery, feeling it had found the perfect lover, feeling it had helped a friend, but not actually doing any of those things—having become its own experience machine. And if the mind pursued those feelings and their referents, it would be a good future and true; but because this one dimension of value was left out, the future became something dull. Boring and repetitive, because although this mind felt that it was encountering experiences of incredible novelty, this feeling was in no wise true.

Or the converse problem—an agent that contains all the aspects of human value, except the valuation of subjective experience. So that the result is a nonsentient optimizer that goes around making genuine discoveries, but the discoveries are not savored and enjoyed, because there is no one there to do so. This, I admit, I don’t quite know to be possible. Consciousness does still confuse me to some extent. But a universe with no one to bear witness to it might as well not be.

Value isn’t just complicated, it’s fragile. There is more than one dimension of human value, where if just that one thing is lost, the Future becomes null. A single blow and all value shatters. Not every single blow will shatter all value—but more than one possible “single blow” will do so.

And then there are the long defenses of this proposition, which rely on 75% of my Overcoming Bias posts, so that it would be more than one day’s work to summarize all of it. Maybe some other week. I’ve seen that discussion tree go down so many different branches.

After all—a mind shouldn’t just go around having the same experience over and over and over again. Surely no superintelligence would be so grossly mistaken about the correct action?

Why would any supermind want something so inherently worthless as the feeling of discovery without any real discoveries? Even if that were its utility function, wouldn’t it just notice that its utility function was wrong, and rewrite it? It’s got free will, right?

Surely, at least boredom has to be a universal value. It evolved in humans because it’s valuable, right? So any mind that doesn’t share our dislike of repetition will fail to thrive in the universe and be eliminated...

If you are familiar with the difference between instrumental values and terminal values, and familiar with the stupidity of natural selection, and you understand how this stupidity manifests in the difference between executing adaptations versus maximizing fitness, and you know this turned instrumental subgoals of reproduction into decontextualized unconditional emotions...

...and you’re familiar with how the tradeoff between exploration and exploitation works in Artificial Intelligence...

...then you might be able to see that the human form of boredom that demands a steady trickle of novelty for its own sake isn’t a grand universal, but just a particular algorithm that evolution coughed out into us. And you might be able to see how the vast majority of possible expected utility maximizers would only engage in just so much efficient exploration, and then spend most of their time exploiting the best alternative found so far, over and over and over.
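For the concretely minded, the exploration/exploitation point can be seen in a toy multi-armed bandit. This is a sketch of my own, not anything from the argument above: it runs the standard UCB1 algorithm against three invented arms, and an expected utility maximizer of this kind explores each option only enough to become confident, then settles into exploiting the best one found so far, over and over.

```python
import math
import random

def ucb1(arm_means, steps, seed=0):
    """UCB1 bandit: returns how many times each arm was pulled."""
    rng = random.Random(seed)
    n = len(arm_means)
    counts = [0] * n          # pulls per arm
    totals = [0.0] * n        # summed reward per arm

    def pull(i):
        # Bernoulli reward with the arm's true success probability.
        totals[i] += 1.0 if rng.random() < arm_means[i] else 0.0
        counts[i] += 1

    for i in range(n):        # initialization: try every arm once
        pull(i)
    for t in range(n, steps):
        # Choose the arm with the highest optimistic value estimate.
        ucb = lambda i: totals[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
        pull(max(range(n), key=ucb))
    return counts

counts = ucb1([0.2, 0.5, 0.8], steps=10_000)
# After a brief burst of exploration, nearly every pull exploits arm 2,
# the best alternative found so far--repeated thousands of times.
```

The exploration term shrinks as an arm accumulates pulls, so time spent on the inferior arms grows only logarithmically in the horizon; nothing in the objective values novelty for its own sake.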

That’s a lot of background knowledge, though.

And so on and so on and so on through 75% of my posts on Overcoming Bias, and many chains of fallacy and counter-explanation. Some week I may try to write up the whole diagram. But for now I’m going to assume that you’ve read the arguments, and just deliver the conclusion:

We can’t relax our grip on the future—let go of the steering wheel—and still end up with anything of value.

And those who think we can -

- they’re trying to be cosmopolitan. I understand that. I read those same science fiction books as a kid: The provincial villains who enslave aliens for the crime of not looking just like humans. The provincial villains who enslave helpless AIs in durance vile on the assumption that silicon can’t be sentient. And the cosmopolitan heroes who understand that minds don’t have to be just like us to be embraced as valuable -

I read those books. I once believed them. But the beauty that jumps out of one box is not jumping out of all boxes. (This being the moral of the sequence on Lawful Creativity.) If you leave behind all order, what is left is not the perfect answer; what is left is perfect noise. Sometimes you have to abandon an old design rule to build a better mousetrap, but that’s not the same as giving up all design rules and collecting wood shavings into a heap, with every pattern of wood as good as any other. The old rule is always abandoned at the behest of some higher rule, some higher criterion of value that governs.

If you loose the grip of human morals and metamorals—the result is not mysterious and alien and beautiful by the standards of human value. It is moral noise, a universe tiled with paperclips. To change away from human morals in the direction of improvement rather than entropy requires a criterion of improvement; and that criterion would be physically represented in our brains, and our brains alone.

Relax the grip of human value upon the universe, and it will end up seriously valueless. Not, strange and alien and wonderful, shocking and terrifying and beautiful beyond all human imagination. Just, tiled with paperclips.

It’s only some humans, you see, who have this idea of embracing manifold varieties of mind—of wanting the Future to be something greater than the past—of being not bound to our past selves—of trying to change and move forward.

A paperclip maximizer just chooses whichever action leads to the greatest number of paperclips.
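Stated as code, that decision rule is nothing but an argmax over a one-dimensional criterion. A minimal sketch of my own (the action names and paperclip counts are hypothetical, invented purely for illustration):

```python
def paperclip_maximizer(actions, predicted_paperclips):
    """Pick the action whose predicted outcome contains the most paperclips.
    Nothing else about the outcome enters the comparison."""
    return max(actions, key=predicted_paperclips)

# Hypothetical outcome model: each action scored in paperclips, and only paperclips.
outcomes = {"build_factory": 10**6, "write_poetry": 0, "preserve_humans": 3}
choice = paperclip_maximizer(list(outcomes), outcomes.get)
# choice == "build_factory": every other merit of the outcomes never registers.
```

The point of the sketch is what is absent: there is no term anywhere for sentience, novelty, or anything else, so no such consideration can ever influence the choice.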

No free lunch. You want a wonderful and mysterious universe? That’s your value. You work to create that value. Let that value exert its force through you who represents it, let it make decisions in you to shape the future. And maybe you shall indeed obtain a wonderful and mysterious universe.

No free lunch. Valuable things appear because a goal system that values them takes action to create them. Paperclips don’t materialize from nowhere for a paperclip maximizer. And a wonderfully alien and mysterious Future will not materialize from nowhere for us humans, if our values that prefer it are physically obliterated—or even disturbed in the wrong dimension. Then there is nothing left in the universe that works to make the universe valuable.

You do have values, even when you’re trying to be “cosmopolitan”, trying to display a properly virtuous appreciation of alien minds. Your values then fade further into the invisible background—they are less obviously human. Your brain probably won’t even generate an alternative so awful that it would wake you up, make you say “No! Something went wrong!” even at your most cosmopolitan. E.g., “a nonsentient optimizer absorbs all matter in its future light cone and tiles the universe with paperclips”. You’ll just imagine strange alien worlds to appreciate.

Trying to be “cosmopolitan”—to be a citizen of the cosmos—just strips off a surface veneer of goals that seem obviously “human”.

But if you wouldn’t like the Future tiled over with paperclips, and you would prefer a civilization of...

...sentient beings...

...with enjoyable experiences...

...that aren’t the same experience over and over again...

...and are bound to something besides just being a sequence of internal pleasurable feelings...

...learning, discovering, freely choosing...

...well, I’ve just been through the posts on Fun Theory that went into some of the hidden details on those short English words.

Values that you might praise as cosmopolitan or universal or fundamental or obvious common sense are represented in your brain just as much as those values that you might dismiss as merely human. Those values come out of the long history of humanity, and the morally miraculous stupidity of evolution that created us. (And once I finally came to that realization, I felt less ashamed of values that seemed ‘provincial’ - but that’s another matter.)

These values do not emerge in all possible minds. They will not appear from nowhere to rebuke and revoke the utility function of an expected paperclip maximizer.

Touch too hard in the wrong dimension, and the physical representation of those values will shatter—and not come back, for there will be nothing left to want to bring it back.

And the referent of those values—a worthwhile universe—would no longer have any physical reason to come into being.

Let go of the steering wheel, and the Future crashes.