# What makes us think _any_ of our terminal values aren’t based on a misunderstanding of reality?

Let’s say Bob’s ter­mi­nal value is to travel back in time and ride a dinosaur.

It is in­stru­men­tally ra­tio­nal for Bob to study physics so he can learn how to build a time ma­chine. As he learns more physics, Bob re­al­izes that his ter­mi­nal value is not only ut­terly im­pos­si­ble but mean­ingless. By defi­ni­tion, some­one in Bob’s past rid­ing a dinosaur is not a fu­ture evolu­tion of the pre­sent Bob.

There are a num­ber of ways to cre­ate the sub­jec­tive ex­pe­rience of hav­ing gone into the past and rid­den a dinosaur. But to Bob, it’s not the same be­cause he wanted both the sub­jec­tive ex­pe­rience and the knowl­edge that it cor­re­sponded to ob­jec­tive fact. Without the lat­ter, he might as well have just watched a movie or played a video game.

So if we took the origi­nal, in­no­cent-of-physics Bob and some­how calcu­lated his co­her­ent ex­trap­o­lated vo­li­tion, we would end up with a Bob who has given up on time travel. The origi­nal Bob would not want to be this Bob.

But, how do we know that _any­thing_ we value won’t similarly dis­solve un­der suffi­ciently thor­ough de­con­struc­tion? Let’s sup­pose for a minute that all “hu­man val­ues” are dan­gling units; that ev­ery­thing we want is as pos­si­ble and makes as much sense as want­ing to hear the sound of blue or taste the fla­vor of a prime num­ber. What is the ra­tio­nal course of ac­tion in such a situ­a­tion?

PS: If your re­sponse re­sem­bles “keep at­tempt­ing to XXX any­way”, please ex­plain what priv­ileges XXX over any num­ber of other al­ter­na­tives other than your cur­rent prefer­ence. Are you us­ing some kind of pre-com­mit­ment strat­egy to a sub­set of your cur­rent goals? Do you now wish you had used the same strat­egy to pre­com­mit to goals you had when you were a tod­dler?

I, for one, have “ter­mi­nal value” for trav­el­ing back in time and rid­ing a dinosaur, in the sense that wor­lds con­sis­tent with that event are ranked above most oth­ers. Now, of course, the re­al­iza­tion of that par­tic­u­lar goal is im­pos­si­ble, but pos­si­bil­ity is or­thog­o­nal to prefer­ence.

The fact is, most things are im­pos­si­ble, but there’s noth­ing wrong with hav­ing a gen­eral prefer­ence or­der­ing over a su­per­set of the set of phys­i­cally pos­si­ble wor­lds. Like­wise, my prob­a­bil­ity dis­tri­bu­tions are over a su­per­set of the ac­tu­ally phys­i­cally pos­si­ble out­comes.

When all the impossible things get eliminated and we move on like good rationalists, there are still choices to be made, and some things are still better than others. If I have to choose between a universe containing a billion paperclips and a universe containing a single frozen dinosaur, my preference for ice cream over dirt is irrelevant, but I can still make a choice, and can still have a preference for the dinosaur (or the paperclips, whatever I happen to think is best).
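A minimal Python sketch of this point, under purely illustrative assumptions: the `World` class, the utility numbers, and the `best_available` helper are hypothetical and do not come from the discussion itself. It shows a preference ordering defined over impossible worlds as well as possible ones, with the actual choice restricted to the possible ones.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class World:
    name: str
    utility: float   # hypothetical score; higher means more preferred
    possible: bool   # whether physics allows this world at all


def best_available(worlds: List[World]) -> Optional[World]:
    """Return the most preferred world among the physically possible ones."""
    feasible = [w for w in worlds if w.possible]
    if not feasible:
        return None  # nothing achievable, so there is nothing to choose
    return max(feasible, key=lambda w: w.utility)


# The ordering covers impossible worlds too (assumed numbers).
WORLDS = [
    World("ride a dinosaur in the past", 100.0, possible=False),
    World("a single frozen dinosaur", 10.0, possible=True),
    World("a billion paperclips", 1.0, possible=True),
]

choice = best_available(WORLDS)
print(choice.name if choice else "no possible world")
```

Ruling out the top-ranked world does not erase the ranking of the rest; the ordering still picks out a best remaining option.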

I ac­tu­ally don’t know what you even mean by my val­ues dis­solv­ing, though. Some­times I learn things that change how I would make choices. Maybe some day I will learn some­thing that turns me into a nihilist such that I would pre­fer to wail about the mean­ingless­ness of all my de­sires, but it seems un­likely.

In con­trast to this com­ment’s sister com­ment, I don’t think this ad­dresses the ques­tion. In­stead, it de­scribes what it is like when the con­text for the ques­tion isn’t the case.

Actually, the converse of the answer provides some suggestion as to what it would be like if all our values were found to be nonsensical...

It would mean we would find that we are in­differ­ent to all choices—with the im­pos­si­ble elimi­nated, we are in­differ­ent to all the choices pos­si­ble.

We might find that we keep on making meaningless choices out of something a bit stronger than ‘habit’ (which is how I judge the universe we’re in), or that we have the ability to rationally update our instrumental values in the context of our voided terminal values (for example, if we were able to edit our programs) so that we would, after all, not bother to make any choices.

This is re­ally not so far fetched, and it is not too difficult to come up with some ex­am­ples. Sup­pose a per­son had a ter­mi­nal goal to eat healthy. Each morn­ing they make choices be­tween eggs and oat­meal, etc. And then they dis­cover they are ac­tu­ally a robot who draws en­ergy from the en­vi­ron­ment au­to­mat­i­cally, and af­ter all it is not nec­es­sary to eat. If all they cared about was to eat healthily, to op­ti­mize their phys­i­cal well-be­ing, if they then dis­cov­ered there was no con­nec­tion be­tween eat­ing and health, they should lose all in­ter­est in any choices about food. They would have no prefer­ence to eat, or not to eat, or about what they ate. (Un­less you re­fer to an­other, new ter­mi­nal value.)

Another ex­am­ple is that a per­son cares very much about their fam­ily, and must de­cide be­tween spend­ing money on an op­er­a­tion for their child or for food for their fam­ily. Then the per­son wakes up and finds that the en­tire sce­nario was just a dream, they don’t have a fam­ily. Even if they think about it a lit­tle longer and might de­cide, while awake, what would have been the best ac­tion to take, they no longer have much prefer­ence (if any) about what ac­tion they chose in the dream. In fact, any prefer­ence would stem from lin­ger­ing feel­ings that the dream was real, or mat­tered, in some as­pect, which is just to show the limi­ta­tions of this ex­am­ple.

I agree and think that this part sums up a good re­sponse to the above ques­tion.

• But, how do we know that any­thing we value won’t similarly dis­solve un­der suffi­ciently thor­ough de­con­struc­tion?

Ex­pe­rience.

I was once a the­ist. I be­lieved that peo­ple were on­tolog­i­cally fun­da­men­tal, and that there was a true moral­ity writ­ten in the sky, and an om­ni­scient de­ity would tell you what to do if you asked. Now I don’t. My val­ues did change a lit­tle, in that they’re no longer based on what other peo­ple tell me is good so I don’t think ho­mo­sex­u­al­ity is bad and stuff like that, but it wasn’t a sig­nifi­cant change.

The only part that I think did change be­cause of that was just that I no longer be­lieved that cer­tain peo­ple were a good au­thor­ity on ethics. Had I not be­lieved God would tell us what’s right, I’m not sure there’d have been any change at all.

Learn­ing more physics is a com­par­a­tively small change, and I’d ex­pect it to cor­re­spond to a tiny change in val­ues.

In regards to your Bob example, if I had his values, I’d expect that after learning that someone in the past is by definition not a future evolution of me, I’d change my definition to something closer to the “naive” definition, and ignore any jumps in time so long as the people stay the same when deciding if someone is a future evolution of me. If I then learn about timeless quantum physics and realize there’s no such thing as the past anyway, and certainly not pasts that lead to particular futures, I’d settle for a world with a lower entropy, in which a relatively high number of Feynman paths reach here.

• If I then learn about time­less quan­tum physics and re­al­ize there’s no such thing as the past any­way, and cer­tainly not pasts that lead to par­tic­u­lar fu­tures, I’d set­tle for a world with a lower en­tropy, in which a rel­a­tively high num­ber of Feyn­man paths reach here.

Funny you should say that. I, for one, have the terminal value of continued personal existence (a.k.a. being alive). On LW I’m learning that continuity, personhood, and existence might well be illusions. If that is the case, my efforts to find ways to survive amount to extending something that isn’t there in the first place.

Of course there’s the high prob­a­bil­ity that we’re do­ing the philo­soph­i­cal equiv­a­lent of di­vid­ing by zero some­where among our many nested ex­trap­o­la­tions.

But let’s say consciousness really is an illusion. Maybe the take-home lesson is that our goals all live at a much more superficial level than we are capable of probing. Not that reductionism “robs” us of our values or anything like that… but it may mean that there cannot exist an instrumentally rational course of action that is also perfectly epistemically rational. That being less wrong past some threshold will not help us set better goals for ourselves, only get better at pursuing goals we pre-committed to pursuing.

• What do you mean when you say con­scious­ness may be an illu­sion? It’s hap­pen­ing to you, isn’t it? What other proof do you need? What would a world look like where con­scious­ness is an illu­sion, vs. one where it isn’t?

• Iden­ti­cal. There­fore con­scious­ness adds com­plex­ity with­out ac­tu­ally be­ing nec­es­sary for ex­plain­ing any­thing. There­fore, the pre­sump­tion is that we are all philo­soph­i­cal zom­bies (but think we’re not).

• Okay, so what cre­ates the feel­ing of con­scious­ness in those philo­soph­i­cal zom­bies? Can we gen­er­ate more of those cir­cum­stances which nat­u­rally cre­ate that feel­ing?

If my life is “ul­ti­mately” an illu­sion, how can I make this illu­sion last as long as pos­si­ble?

• Eliezer:

[I]n a dis­cus­sion of the cur­rent state of ev­i­dence for whether the uni­verse is spa­tially finite or spa­tially in­finite,[...] James D. Miller chided Robin Han­son: ‘Robin, you are suffer­ing from over­con­fi­dence bias in as­sum­ing that the uni­verse ex­ists. Surely there is some chance that the uni­verse is of size zero.’

To which I replied: ‘James, if the uni­verse doesn’t ex­ist, it would still be nice to know whether it’s an in­finite or a finite uni­verse that doesn’t ex­ist.’

Ha! You think pul­ling that old ‘uni­verse doesn’t ex­ist’ trick will stop me? It won’t even slow me down!

It’s not that I’m rul­ing out the pos­si­bil­ity that the uni­verse doesn’t ex­ist. It’s just that, even if noth­ing ex­ists, I still want to un­der­stand the noth­ing as best I can. My cu­ri­os­ity doesn’t sud­denly go away just be­cause there’s no re­al­ity, you know!

• We are all philo­soph­i­cal zom­bies, but think we’re not? We’re all X, but think we’re Y? What’s the differ­ence be­tween X and Y? What would our sub­jec­tive ex­pe­rience look like if we were ac­tu­ally Y, in­stead of just think­ing we’re Y? Un­less you can point to some­thing, then we can safely con­clude that you’re talk­ing with­out a mean­ing.

• I’m try­ing to think of what kind of zom­bies there could be be­sides philo­soph­i­cal ones.

Episte­molog­i­cal zom­bie: My brain has ex­actly the same state, all the neu­rons in all the same places, and like­wise the rest of the uni­verse, but my map doesn’t pos­sess any ‘truth’ or ‘ac­cu­racy’.

On­tolog­i­cal zom­bie: All the atoms are in all the same places but they don’t ex­ist.

Ex­is­ten­tial zom­bie: All the atoms are in all the same places but they don’t mean any­thing.

Causal zom­bie: So far as any­one can tell, my brain is do­ing ex­actly the same things, but only by co­in­ci­dence and not be­cause it fol­lows from the laws of physics.

Math­e­mat­i­cal zom­bie: Just like me only it doesn’t run on math.

Log­i­cal zom­bie: I got nothin’.

Con­ceiv­abil­ity zom­bie: It’s ex­actly like me but it lacks the prop­erty of con­ceiv­abil­ity.

• Löwen­heim–Skolem zom­bie: Makes state­ments that are word-for-word iden­ti­cal to the ones that you make about un­countable sets, and for the same causal rea­sons (namely, be­cause you both im­ple­ment the in­fer­ence rules of ZF in the same way), but its state­ments aren’t about ac­tu­ally un­countable sets, be­cause it lives in a countable model of ZF.

• Your causal zom­bie re­minds me of Leib­niz’s pre-es­tab­lished har­mony.

• Ex­is­ten­tial zom­bie: All the atoms are in all the same places but they don’t mean any­thing.

Causal zom­bie: So far as any­one can tell, my brain is do­ing ex­actly the same things, but only by co­in­ci­dence and not be­cause it fol­lows from the laws of physics.

Oddly enough, the other day I ran into some­one who ap­pears to liter­ally be­lieve a com­bi­na­tion of these two.

• The elimi­na­tivist re­sponds: The world would look the same to me (a com­plex brain pro­cess) if du­al­ism were true. But it would not look the same to the im­ma­te­rial ghost pos­sess­ing me, and we could write a com­puter pro­gram that simu­lates an epiphe­nom­e­nal uni­verse, i.e., one where ev­ery brain causally pro­duces a ghost that has no effects of its own. So du­al­ism is mean­ingful and false, not mean­ingless.

The du­al­ist re­sponds in turn: I agree that those two sce­nar­ios make sense. How­ever, I dis­agree about which of those pos­si­ble wor­lds the ev­i­dence sug­gests is our world. And I dis­agree about what sort of agent we are — ex­pe­rience re­veals us to be phe­nom­e­nal con­scious­nesses learn­ing about whether there’s also a phys­i­cal world, not brains in­ves­ti­gat­ing whether there’s also an in­visi­ble epiphe­nom­e­nal spirit-world. The men­tal has epistemic pri­or­ity over the phys­i­cal.

We do have good rea­son to think we are epiphe­nom­e­nal ghosts: Our mo­ment-to-mo­ment ex­pe­rience of things like that (os­tend­ing a patch of red­ness in my vi­sual field) in­di­cates that there is some­thing within ex­pe­rience that is not strictly en­tailed by the phys­i­cal facts. This cat­e­gory of ex­pe­ri­en­tial ‘thats’ I as­sign the la­bel ‘phe­nom­e­nal con­scious­ness’ as a use­ful short­hand, but the ev­i­dence for this cat­e­gory is a per­cep­tion-like in­tro­spec­tive ac­quain­tance, not an in­fer­ence from other items of knowl­edge.

You and I agree, elimi­na­tivist, that we can os­tend some­thing about our mo­ment-to-mo­ment in­tro­spec­tive data. For in­stance, we can ges­ture at op­ti­cal illu­sions. I sim­ply in­sist that one of those some­things is epistem­i­cally im­pos­si­ble given phys­i­cal­ism; we couldn’t have such qual­i­ta­tively spe­cific ex­pe­riences as mere ar­range­ments of atoms, though I cer­tainly agree we could have un­con­scious men­tal states that causally suffice for my judg­ments to that effect.

Elimi­na­tivist: Aren’t you giv­ing up the game the mo­ment you con­cede that your judg­ments are just as well pre­dicted by my in­ter­pre­ta­tion of the data as by yours? If your judg­ments are equally prob­a­ble given elimi­na­tivism as given du­al­ism, then elimi­na­tivism wins purely on grounds of par­si­mony.

Dual­ist: But the da­tum, the ex­planan­dum, isn’t my judg­ment. I don’t go ‘Oh, I seem to be judg­ing that I’m ex­pe­rienc­ing red­ness; I’ll con­clude that I am in fact ex­pe­rienc­ing red­ness’. Rather, I go ‘Oh, I seem to be ex­pe­rienc­ing red­ness; I’ll con­clude that I am in fact ex­pe­rienc­ing red­ness’. This ini­tial seem­ing is a per­cep­tion-like ac­cess to a sub­jec­tive field of vi­sion, not some­thing propo­si­tional or oth­er­wise lin­guis­ti­cally struc­tured. And this seem­ing re­ally does in­clude phe­nom­e­nal red­ness, over and above any dis­po­si­tion to lin­guis­ti­cally judge (or be­have at all!) in any spe­cific way.

Elimi­na­tivist: But even those judg­ments are pre­dicted by my the­ory as well. How can you trust in judg­ments of yours that are causally un­cor­re­lated with the truth? If you know that in most pos­si­ble wor­lds where you ar­rive at your cur­rent state of over­all be­lief, you’re wrong about X, then you should con­clude that you are in fact wrong about X. (And there are more pos­si­ble wor­lds where your brain ex­ists than where your brain and epiphe­nom­e­nal ghost ex­ist.)

Dual­ist: Our dis­agree­ment is that I don’t see my epistemic sta­tus as purely causal. On your view, knowl­edge and the ob­ject known are meta­phys­i­cally dis­tinct, with the ob­ject known caus­ing our state of knowl­edge. You con­clude that epistemic states are only re­li­able when they are cor­re­lated with the right ex­trin­sic state of the world.

I agree with you that knowl­edge and the ob­ject known are gen­er­ally dis­tinct, but we should ex­pect an ex­cep­tion to that rule when knowl­edge turns upon it­self, i.e., when the thing we’re aware of is the very fact of aware­ness. In that case, my knowl­edge is not causally, spa­tially, or tem­po­rally sep­a­rated from its ob­ject — at this very mo­ment, with­out any need to ap­peal to a past or pre­sent at all, I can know that I am hav­ing this par­tic­u­lar ex­pe­rience of a text box. I can be wrong in my in­fer­ences, wrong in my spec­u­la­tions about the world out­side my ex­pe­rience; and I can be wrong in my sub­vo­cal­ized judg­ments about my ex­pe­rience; but my ex­pe­rience can’t be wrong about it­self. You can de­sign a map in such a way that it differs from (i.e., mis­rep­re­sents) a ter­ri­tory, but you can’t de­sign a map in such a way that it differs from it­self; the re­la­tion of a map to it­self is one of iden­tity, not of rep­re­sen­ta­tion or causal­ity, and it is the na­ture of my map, as re­vealed by it­self (and to it­self!), that we’re dis­cussing here.

Elimi­na­tivist: I just don’t think that model of in­tro­spec­tion is ten­able, given the his­tory of sci­ence. Maybe your in­tro­spec­tion gives you some ev­i­dence that phys­i­cal­ism is false, but the fre­quency with which we’ve turned out to be wrong about other as­pects of our ex­pe­rience has to do a great deal to un­der­mine your con­fi­dence in your map of the na­ture of your epistemic ac­cess to maps. I’m not hav­ing an ar­gu­ment with your vi­sual field; I’m hav­ing an ar­gu­ment with a lin­guis­tic rea­soner that has formed cer­tain judg­ments about that vi­sual field, and it’s always pos­si­ble that the rea­soner is wrong about its own in­ter­nal states, no mat­ter how ob­vi­ous, man­i­fest, self-ev­i­dent, etc. those states ap­pear.

Dual­ist: A fair point. And I can ap­pre­ci­ate the force of your ar­gu­ment in the ab­stract, when I think about an ar­bi­trary rea­soner from the third per­son. Yet when I at­tend once more to my own stream of con­scious­ness, I be­come just as con­fused all over again. Your philo­soph­i­cal po­si­tion’s ap­peal is in­suffi­cient to over­come the per­cep­tual ob­vi­ous­ness of my own con­scious­ness — and that ob­vi­ous­ness in­cludes the per­cep­tual ob­vi­ous­ness of ir­re­ducibil­ity. I can’t make my­self pre­tend to not be­lieve in some­thing that seems to me so self-ev­i­dent.

Elimi­na­tivist: Then you aren’t try­ing hard enough. For I share your in­tu­itions when I re­flect on my im­me­di­ate ex­pe­riences, yet I’ve suc­cess­fully deferred to sci­ence and philos­o­phy in a way that blocks these sem­blances be­fore they can mu­tate into be­liefs. It can be done.

Dual­ist: It can be done. But should it? From my per­spec­tive, you’ve talked your­self into a lu­natic po­si­tion by rea­son­ing only in im­per­sonal, third-per­son terms. You’ve for­got­ten that the em­piri­cal ev­i­dence in­cludes not only the his­tory of sci­ence, but also your own con­scious states. To me it ap­pears that you’ve fallen into the er­ror of the be­hav­iorists, deny­ing a men­tal state (phe­nom­e­nal con­scious­ness) just be­cause it doesn’t fit neatly into a spe­cific in­vented set of episte­molog­i­cal so­cial stan­dards. No mat­ter how much I’d love to join you in as­sert­ing a the­ory as el­e­gant and sim­ple as phys­i­cal­ism, I can’t bring my­self to do so when it comes at the cost of deny­ing the man­i­fest.

… and the dis­cus­sion con­tinues from there. I don’t think ei­ther po­si­tion is mean­ingless. Claims like ‘noth­ing ex­ists’ aren’t mean­ingless just be­cause agents like us couldn’t con­firm them if they were true; they’re mean­ingful and false. And it’s cer­tainly con­ceiv­able that if the above dis­cus­sion con­tinued long enough, a con­sen­sus could be reached, sim­ply by con­tin­u­ing to de­bate the ex­tent to which sci­ence un­der­mines phe­nomenol­ogy.

• This is an ex­cel­lent and fair sum­mary of the de­bate. I think the one as­pect it leaves out is that elimi­na­tivists differ from du­al­ists in that they have in­ter­nal­ized Quine’s les­sons about how we can always re­vise our con­cep­tual schemes. I elab­o­rated on this long ago in this post at my old blog.

• our concepts change and evolve with the growth of scientific knowledge; what is conceivable now may become inconceivable later and vice-versa. Concepts are just tools for describing the world, and we can change them and reform them if we need to. This picture of science, familiar since Quine, is presupposed by Dennett, but implicitly rejected by Chalmers.

I’m pretty con­fi­dent Chalmers would dis­agree with this char­ac­ter­i­za­tion. Chalmers ac­cepts that our con­cepts can change, and he ac­cepts that if zom­bies fall short of ideal con­ceiv­abil­ity — con­ceiv­abil­ity for a mind that perfectly un­der­stands the phe­nom­ena in ques­tion — then du­al­ism will be re­futed. That’s why the Mary’s Room thought ex­per­i­ment is about an ideally ex­trap­o­lated rea­soner. The weak­ness of such a thought ex­per­i­ment is, of course, that we may fail to ac­cu­rately simu­late an ideally ex­trap­o­lated rea­soner; but the strength is that this ideal­iza­tion has meta­phys­i­cal sig­nifi­cance in a way that mere failure of con­tem­po­rary imag­i­na­tion doesn’t.

It may provide a fundamental theory or a list of them but not a list of fundamental entities the world is made of with a list of contingent laws of nature holding between them. What an entity such as the electromagnetic field is, is defined by what laws of nature it obeys and therefore by its relations with other entities.

If con­tem­po­rary sci­ence’s best the­ory posits fun­da­men­tal en­tities, then con­tem­po­rary sci­ence posits fun­da­men­tal en­tities. Science is not across-the-board on­tolog­i­cally ag­nos­tic or defla­tion­ary.

Unless I’m misunderstanding you, your claim that a physical theory is equivalent to its Ramsey sentence is a rather different topic. I think Chalmers would respond that although this may be true for physical theories at the moment, it’s a contingent, empirical truth: we happen to have discovered that we don’t need to perform any ostensive acts, for instance, in fixing the meanings of our physical terms. If science discovered an exception to this generalization, science would not perish; it would just slightly complicate the set of linguistic rituals it currently uses to clarify what it’s talking about.

But this shows that all the zom­bie ar­gu­ments are ques­tion-beg­ging, be­cause to carry any force they must as­sume that there is some­thing very spe­cial about con­scious­ness that dis­t­in­guishes it from other sub­jects for sci­ence in the first place.

This isn’t an as­sump­tion. It’s an in­fer­ence from the em­piri­cal char­ac­ter of in­tro­spec­tion. That is, it has a defea­si­ble (quasi-)per­cep­tual ba­sis. Many elimi­na­tivists want it to be the case that du­al­ists are ques­tion-beg­ging when they treat in­tro­spec­tive ev­i­dence as ev­i­dence, but in­tro­spec­tive ev­i­dence is ev­i­dence. Chalmers does not take it as ax­io­matic, prior to ex­am­in­ing the way his stream of con­scious­ness ac­tu­ally looks, that there is a spe­cial class of phe­nom­e­nal con­cepts.

I’m not a du­al­ist, but I don’t think any of Chalmers’ ar­gu­ments are ques­tion-beg­ging. They just aren’t strong enough to re­fute phys­i­cal­ism; phys­i­cal­ism has too many good sup­port­ing ar­gu­ments.

In the sec­ond para­graph you quote, I was not try­ing to make a strong state­ment about sci­en­tific the­o­ries be­ing equiv­a­lent to Ram­sey sen­tences, though I see how that is a nat­u­ral in­ter­pre­ta­tion of it. I meant to sup­port my pre­vi­ous para­graph about the lack of a strong dis­tinc­tion be­tween con­cep­tual im­pli­ca­tions and defi­ni­tions, and con­tin­gent/​nomolog­i­cal laws. For each “fun­da­men­tal law of physics”, there can be one ax­iom­a­ti­za­tion of phys­i­cal the­ory where it is a con­tin­gent re­la­tion be­tween fun­da­men­tal en­tities, and an­other one where it is a defi­ni­tion or con­cep­tual re­la­tion. It is cen­tral for Chalmers’ view­point that the re­la­tion be­tween con­scious­ness and func­tional states is ir­re­ducibly con­tin­gent, but this kind of law would be un­like any other one in physics.

I think you are mix­ing two things here: whether in­tro­spec­tive ev­i­dence is ev­i­dence, which I agree to (e.g., when I “feel like I am see­ing some­thing green”, I very likely am in the state of “see­ing some­thing green”); and whether that “stuff” that when we in­tro­spect we de­scribe with phe­nom­e­nal con­cepts must nec­es­sar­ily be de­scribed with those con­cepts (in­stead of with more so­phis­ti­cated and less in­tu­itive con­cepts, for which the zom­bie/​Mary’s Room/​etc ar­gu­ments would fail).

• Yeah, Chalmers would agree that adding phenomenal consciousness would be a very profound break with the sort of theory physics currently endorses, and not just because it appears anthropomorphizing.

whether that “stuff” that when we in­tro­spect we de­scribe with phe­nom­e­nal con­cepts must nec­es­sar­ily be de­scribed with those con­cepts (in­stead of with more so­phis­ti­cated and less in­tu­itive con­cepts, for which the zom­bie/​Mary’s Room/​etc ar­gu­ments would fail).

I haven’t yet seen a con­cept that my phe­nom­e­nal states ap­pear to fall un­der, that blocks Mary’s Room or Zom­bie World. Not even a schematic, partly-fleshed-out con­cept. (And this is it­self very sur­pris­ing, given phys­i­cal­ism.)

• con­ti­nu­ity, per­son­hood, and ex­is­tence might well be illu­sions. If that is the case, my efforts to find ways to sur­vive amount to ex­tend­ing some­thing that isn’t there in the first place

Can you say more about how you get from “X is an illu­sion” to “X isn’t there in the first place”?

To clar­ify that ques­tion a lit­tle… sup­pose I’m thirsty in the desert, and am pur­su­ing an image of wa­ter, and I even­tu­ally con­clude to my dis­ap­point­ment that it is just a mirage.
I’m do­ing two things here:

• I’m cor­rect­ing an ear­lier false be­lief about the world—my ob­ser­va­tion is not of wa­ter, but of a par­tic­u­lar kind of light-dis­tort­ing sys­tem of heated air.

• I’m mak­ing an im­plicit value judg­ment: I want wa­ter, I don’t want a mirage, which is why I’m dis­ap­pointed. The world is worse than I thought it was.

Those are im­por­tantly differ­ent. If I were, in­stead, a non-thirsty stu­dent of op­tics, I would still cor­rect my be­lief but I might not make the same value judg­ment: I might be delighted to dis­cover that what I’d pre­vi­ously thought was a mere oa­sis is in­stead an in­ter­est­ing mirage!

In the same spirit, sup­pose I dis­cover that con­ti­nu­ity, per­son­hood, and ex­is­tence are illu­sions, when I had pre­vi­ously thought they were some­thing else (what that “some­thing else” is, I don’t re­ally know). So, OK, I cor­rect my ear­lier false be­lief about the world.

There’s still a value judg­ment left to make though… am I dis­ap­pointed to re­al­ize I’m pur­su­ing a mere illu­sion rather than the “some­thing else” I ac­tu­ally wanted? Or am I delighted to dis­cover that I’m pur­su­ing a gen­uine illu­sion rather than an ill-defined “some­thing else”?

Your way of speak­ing seems to take the former for granted. Why is that?

be­ing less wrong past some thresh­old will not help us set bet­ter goals for ourselves

Well, it will, and it won’t. But in the sense I think you mean it, yes, that’s right… it won’t.

Our val­ues are what they are. Be­ing less wrong im­proves our abil­ity to im­ple­ment those val­ues, and our abil­ity to ar­tic­u­late those val­ues, which may in turn cause the val­ues we’re aware of and pur­su­ing to be­come more con­sis­tent, but it doesn’t some­how re­place our val­ues with su­pe­rior val­ues.

• I, for one, have the ter­mi­nal value of con­tinued per­sonal ex­is­tence (a.k.a. be­ing al­ive). On LW I’m learn­ing that con­ti­nu­ity, per­son­hood, and ex­is­tence might well be illu­sions. If that is the case, my efforts to find ways to sur­vive amount to ex­tend­ing some­thing that isn’t there in the first place

I am con­fused about this as well. I think the right thing to do here is to rec­og­nize that there is a lot we don’t know about, e.g. per­son­hood, and that there is a lot we can do to clar­ify our think­ing on per­son­hood. When we aren’t con­fused about this stuff any­more, we can look over it and de­cide what parts we re­ally val­ued; our in­tu­itive idea of per­son­hood clearly de­scribes some­thing, even rec­og­niz­ing that a lot of the ideas of the past are wrong. Note also that we don’t gain any­thing by re­main­ing ig­no­rant (I’m not sure if you’ve re­al­ized this yet).

• ev­ery­thing we want is as pos­si­ble and makes as much sense as want­ing to hear the sound of blue or taste the fla­vor of a prime number

We know it isn’t be­cause most of the time we get what we want. You want choco­late, so you go and buy some and then eat it, and the yummy choco­latey taste you ex­pe­rience is proof that it wasn’t that fu­tile af­ter all for you to want choco­late.

The feel­ing of re­ward we get when we satisfy some of our ter­mi­nal val­ues is what makes us think that they aren’t based on a mi­s­un­der­stand­ing of re­al­ity. So it’s prob­a­bly a pretty good bet to keep want­ing at least the things that have led to re­wards in the past, even if we aren’t as sure about the rest of them, like go­ing back in time.

• What makes us think any of our ter­mi­nal val­ues aren’t based on a mi­s­un­der­stand­ing of re­al­ity?

Much the same thing that makes me think my height isn’t based on a mi­s­un­der­stand­ing of re­al­ity. Differ­ent cat­e­gory. I didn’t un­der­stand my way into hav­ing ter­mi­nal val­ues. Un­der­stand­ing can illu­mi­nate your per­cep­tions of re­al­ity and al­low you to bet­ter grasp what is, but I don’t think that your ter­mi­nal val­ues were gen­er­ated by your un­der­stand­ing. Try­ing to do so is a patholog­i­cal tail bit­ing ex­er­cise.

• I dis­agree with your im­plied claim that ter­mi­nal val­ues are in­de­pen­dent of un­der­stand­ings. I can’t think of any hu­man val­ues that don’t pre­sup­pose some facts.

Edit: also see this com­ment by sci­en­tism.

• If I en­joy the sub­jec­tive ex­pe­rience of think­ing about some­thing, I can’t think of any con­ceiv­able fact that would in­val­i­date that.

• Touché. (At least for an in­stan­ta­neous “I” and in­stan­ta­neous en­joy­ment.) Still, there are many ter­mi­nal val­ues that do pre­sup­pose facts.

• So, I’m ba­si­cally ig­nor­ing the “ter­mi­nal” part of this, for rea­sons I’ve be­la­bored el­se­where and won’t re­peat here.

I agree that there’s a differ­ence be­tween want­ing to do X and want­ing the sub­jec­tive ex­pe­rience of do­ing X. That said, fre­quently peo­ple say they want the former when they would in fact be perfectly satis­fied by the lat­ter, even know­ing it was the lat­ter. But let us as­sume Bob is not one of those peo­ple, he re­ally does want to travel back in time and ride a dinosaur, not just ex­pe­rience do­ing so or hav­ing done so.

I don’t un­der­stand why you say “I want to travel back in time and ride a dinosaur” is mean­ingless. Even grant­ing that it’s im­pos­si­ble (or, to say that more pre­cisely, grant­ing that greater un­der­stand­ing of re­al­ity tends to sharply re­duce its prob­a­bil­ity), how does that make it mean­ingless? You seem to offer “By defi­ni­tion, some­one in Bob’s past rid­ing a dinosaur is not a fu­ture evolu­tion of the pre­sent Bob” as an an­swer to that ques­tion, but that just com­pletely con­fuses me. By defi­ni­tion of what, and why are we us­ing that defi­ni­tion, and why is that im­por­tant?

That said, as far as I can tell its mean­ingless­ness is ir­rele­vant to your ac­tual point. The key point here is that if Bob knew enough about the world, he would give up on de­vot­ing re­sources to re­al­iz­ing his de­sire to go back in time and ride a dinosaur… right? I’m fine with as­sum­ing that; there are lots of mechanisms that could make this true, even if the whole “mean­ingless” thing doesn’t quite work for me.

how do we know that any­thing we value won’t similarly dis­solve un­der suffi­ciently thor­ough de­con­struc­tion?

We don’t know that. In­deed, I ex­pect most of our val­ues are ex­tremely frag­ile, not to men­tion mu­tu­ally op­posed. Any­thing re­sem­bling a pro­cess of “calcu­lat­ing my co­her­ent ex­trap­o­lated vo­li­tion” will, I ex­pect, pro­duce a re­sult that I am as likely to re­ject in hor­ror, or stare at in be­wil­der­ment, as I am to em­brace as valuable. (Add an­other seven billion minds’ val­ues to the mix and I ex­pect this be­comes some­what more true, but prob­a­bly not hugely more true.)

Let’s sup­pose for a minute that all “hu­man val­ues” are dan­gling units [..] What is the ra­tio­nal course of ac­tion in such a situ­a­tion?

The ra­tio­nal course of ac­tion for an agent is to op­ti­mally pur­sue its val­ues.
That is, keep at­tempt­ing to XXX any­way.
What priv­ileges XXX over other al­ter­na­tives is that the agent val­ues XXX more than those al­ter­na­tives.
There’s no precommitment involved… if the agent’s values change such that it no longer values XXX but rather YYY, at that point it ought to stop pursuing XXX and pursue YYY instead. It may well regret having previously pursued XXX at that time. It may even predict its later regret of pursuing XXX, and its rational course of action is still to pursue XXX.
Of course, if it hap­pens to value its fu­ture value-satis­fac­tion, then YYY is part of XXX to the ex­tent that the later value-shift is ex­pected.

More gen­er­ally: you seem to want your val­ues to be jus­tified in terms of some­thing else.
Do you have any co­her­ent no­tion of what that “some­thing else” might be, or what prop­er­ties it might have?

• Peter de Blanc wrote a pa­per on this topic.

• I think this post is ask­ing a very im­por­tant and valuable ques­tion. How­ever, I think it’s limit­ing the pos­si­ble an­swers by mak­ing some un­nec­es­sary and un­jus­tified as­sump­tions. I agree that Bob, as de­scribed, is screwed, but I think we are suffi­ciently un­like Bob that that con­clu­sion does not ap­ply to us.

As TheOtherDave says here,

I don’t un­der­stand why you say “I want to travel back in time and ride a dinosaur” is mean­ingless. Even grant­ing that it’s im­pos­si­ble (or, to say that more pre­cisely, grant­ing that greater un­der­stand­ing of re­al­ity tends to sharply re­duce its prob­a­bil­ity), how does that make it mean­ingless? You seem to offer “By defi­ni­tion, some­one in Bob’s past rid­ing a dinosaur is not a fu­ture evolu­tion of the pre­sent Bob” as an an­swer to that ques­tion, but that just com­pletely con­fuses me. By defi­ni­tion of what, and why are we us­ing that defi­ni­tion, and why is that im­por­tant?

That was my reaction as well. In particular, “I want to go back in time and ride a dinosaur” is, on close inspection, actually a rather vague value. It has many possible concrete interpretations and realizations. One more specific version of it is, “I want a physically future evolution of myself to ride a dinosaur in my physical past.” As the post points out, this is impossible. But why privilege this particular interpretation of the goal?

You say that Bob wants not just the subjective experience but also the objective fact of riding a dinosaur. If that’s all he wants, then you’re right, he’s shit out of luck. I suspect, though, that we are not like Bob, and that our actual values are of the more vague sort with many possible realizations. And some of these will turn out to be meaningful and realizable and some won’t be.

If that’s so then the solu­tion is to figure out which are the re­al­iz­able in­ter­pre­ta­tions of our goals and work to­wards those. I’m hope­ful that this is not the empty set.

• You’re right that a mean­ingless goal can­not be pur­sued, but nor can you be said to even at­tempt to pur­sue it—i.e., the pur­suit of a mean­ingless goal is it­self a mean­ingless ac­tivity. Bob can’t put any effort into his goal of time travel, he can only con­fus­edly do things he mis­tak­enly thinks of as “pur­su­ing the goal of time travel”, be­cause pur­su­ing the goal of time travel isn’t a pos­si­ble ac­tivity. What Bob has learned is that he wasn’t pur­su­ing the goal of time travel to be­gin with. He was al­to­gether wrong about hav­ing a ter­mi­nal value of trav­el­ling back in time and rid­ing a dinosaur be­cause there’s no such thing.

• That seems ob­vi­ously wrong to me. There’s noth­ing at all pre­vent­ing me from de­sign­ing an in­visi­ble-pink-uni­corn max­i­mizer, even if in­visi­ble pink uni­corns are im­pos­si­ble. For that mat­ter, if we al­low coun­ter­fac­tu­als, an in­visi­ble-pink-uni­corn max­i­mizer still looks like an in­tel­li­gence de­signed to max­i­mize uni­corns—in the coun­ter­fac­tual uni­verse where uni­corns ex­ist, the in­tel­li­gence takes ac­tions that tend to max­i­mize uni­corns.

• How would you em­piri­cally dis­t­in­guish be­tween your in­visi­ble-pink-uni­corn max­i­mizer and some­thing that wasn’t an in­visi­ble-pink-uni­corn max­i­mizer? I mean, you could look for a sec­tion of code that was in­ter­pret­ing sen­sory in­puts as num­ber of in­visi­ble-pink-uni­corns—ex­cept you couldn’t, be­cause there’s no set of sen­sory in­puts that cor­re­sponds to that, be­cause they’re im­pos­si­ble. If we’re talk­ing about coun­ter­fac­tu­als, the coun­ter­fac­tual uni­verse in which the sen­sory in­puts that cur­rently cor­re­spond to pa­per­clips cor­re­spond to in­visi­ble-pink-uni­corns seems just as valid as any other.

• Well, there’s cer­tainly a set of sen­sory in­puts that cor­re­sponds to /​in­visi­ble-uni­corn/​, based on which one could build an in­visi­ble uni­corn de­tec­tor. Similarly, there’s a set of sen­sory in­puts that cor­re­sponds to /​pink-uni­corn/​, based on which one could build a pink uni­corn de­tec­tor.

If I wire a pink uni­corn de­tec­tor up to an in­visi­ble uni­corn de­tec­tor such that a light goes on iff both de­tec­tors fire on the same ob­ject, have I not just con­structed an in­visi­ble-pink-uni­corn de­tec­tor?

Granted, a de­tec­tor is not the same thing as a max­i­mizer, but the con­cep­tual is­sue seems iden­ti­cal in both cases.

• If I wire a pink uni­corn de­tec­tor up to an in­visi­ble uni­corn de­tec­tor such that a light goes on iff both de­tec­tors fire on the same ob­ject, have I not just con­structed an in­visi­ble-pink-uni­corn de­tec­tor?

Maybe. Or maybe you’ve con­structed a square-cir­cle de­tec­tor; no ex­per­i­ment would let you tell the differ­ence, no?

I think the way around this is some notion of which kind of counterfactuals are valid and which aren’t. I’ve seen posts here (and need to read more) about evaluating these counterfactuals via surgery on causal graphs. But while I can see how such reasoning would work for an object that exists in a different possible world (i.e. a “contingently nonexistent” object), I don’t (yet?) see how to apply it to a logically impossible (“necessarily nonexistent”) object. Is there a good notion available that can say one counterfactual involving such things is more valid than another?

• Or maybe you’ve con­structed a square-cir­cle de­tec­tor; no ex­per­i­ment would let you tell the differ­ence, no?

Take the thing apart and test its com­po­nents in iso­la­tion. If in iso­la­tion they test for squares and cir­cles, their ag­gre­gate is a square-cir­cle de­tec­tor (which never fires). If in iso­la­tion they test for pink uni­corns and in­visi­ble uni­corns, their ag­gre­gate is an in­visi­ble-pink-uni­corn de­tec­tor (which never fires).
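The take-it-apart test can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the property names (`kind`, `color`, `shape`), the `Conjunction` class, and the sample `WORLD` are all made up for this example, and "invisible" is crudely modeled as "has no color". Both aggregates never fire, so they look identical as black boxes, but their components behave differently when tested in isolation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A toy "sensory input": a bag of observed properties (hypothetical encoding).
Obj = Dict[str, Optional[str]]
Detector = Callable[[Obj], bool]


def is_pink_unicorn(o: Obj) -> bool:
    return o.get("kind") == "unicorn" and o.get("color") == "pink"


def is_invisible_unicorn(o: Obj) -> bool:
    # Assumption: "invisible" means the object is observed to have no color.
    return o.get("kind") == "unicorn" and o.get("color") is None


def is_square(o: Obj) -> bool:
    return o.get("shape") == "square"


def is_circle(o: Obj) -> bool:
    return o.get("shape") == "circle"


@dataclass(frozen=True)
class Conjunction:
    """Fires iff every component detector fires on the same object."""
    parts: Tuple[Detector, ...]

    def __call__(self, o: Obj) -> bool:
        return all(part(o) for part in self.parts)


WORLD: List[Obj] = [
    {"kind": "unicorn", "color": "pink"},
    {"kind": "unicorn", "color": None},
    {"shape": "square"},
    {"shape": "circle"},
    {"kind": "horse", "color": "brown"},
]


def behaviour(detector: Detector, world: List[Obj]) -> Tuple[bool, ...]:
    """Black-box view: what the detector outputs on each object."""
    return tuple(detector(o) for o in world)


def component_behaviour(agg: Conjunction, world: List[Obj]) -> Tuple[Tuple[bool, ...], ...]:
    """Taken-apart view: what each component outputs in isolation."""
    return tuple(behaviour(part, world) for part in agg.parts)


invisible_pink_unicorn = Conjunction((is_pink_unicorn, is_invisible_unicorn))
square_circle = Conjunction((is_square, is_circle))

same_as_black_boxes = behaviour(invisible_pink_unicorn, WORLD) == behaviour(square_circle, WORLD)
same_when_taken_apart = component_behaviour(invisible_pink_unicorn, WORLD) == component_behaviour(square_circle, WORLD)
print(same_as_black_boxes, same_when_taken_apart)
```

In this toy world the two aggregates agree on every input while their parts do not, which is the distinction being drawn. It says nothing about whether the same move works for objects whose components are themselves impossible to test.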

• ex­cept you couldn’t, be­cause there’s no set of sen­sory in­puts that cor­re­sponds to that, be­cause they’re im­pos­si­ble.

That does not fol­low. I’ll ad­mit my origi­nal ex­am­ple is mildly flawed, but let’s tack on some­thing (that’s still im­pos­si­ble) to illus­trate my point: in­visi­ble pink telekinetic uni­corns. Still not a thing that can ex­ist, if you define telekine­sis as “ac­tion at a dis­tance, not me­di­ated through one of the four fun­da­men­tal forces.” But now, if you see an ob­ject sta­bly float­ing in vac­uum, and de­tect no grav­i­ta­tional or elec­tro­mag­netic anoma­lies (and you’re in an ac­cel­er­ated refer­ence frame like the sur­face of the earth, etc etc), you can in­fer the pres­ence of an in­visi­ble telekinetic some­thing.

Or in gen­eral—an im­pos­si­ble ob­ject will have an im­pos­si­ble set of sen­sory in­puts, but the set of cor­re­spond­ing sen­sory in­puts still ex­ists.

• if you define telekine­sis as “ac­tion at a dis­tance, not me­di­ated through one of the four fun­da­men­tal forces.”

Yeah, spooky ac­tion at a dis­tance :-) Nowa­days we usu­ally call it “quan­tum en­tan­gle­ment” :-D

• … I’m pretty sure no ar­range­ment of en­tan­gled par­ti­cles will cre­ate an ob­ject that just hov­ers a half-foot above the Earth’s sur­face.

• Thank you, I think you ar­tic­u­lated bet­ter than any­body so far what I mean by a goal turn­ing out to be mean­ingless.

Do you believe that a goal must persist down to the most fundamental reductionist level in order to really be a goal?

If not, can/​should meth­ods be em­ployed in the pur­suit of a goal such that the meth­ods ex­ist at a lower level than the goal it­self?

• I’m not quite sure what you’re say­ing. I don’t think there’s a way to iden­tify whether a goal is mean­ingless at a more fun­da­men­tal level of de­scrip­tion. Ob­vi­ously Bob would be prone to say things like “to­day I did x in pur­suit of my goal of time travel” but there’s no way of tel­ling that it’s mean­ingless at any other level than that of mean­ing, i.e., with re­spect to lan­guage. Other than that, it seems to me that he’d be do­ing pretty much the same things, phys­i­cally speak­ing, as some­one pur­su­ing a mean­ingful goal. He might even do use­ful things, like make break­throughs in the­o­ret­i­cal physics, de­spite be­ing wholly con­fused about what he’s do­ing.

• All our val­ues are fal­lible, but doubt re­quires jus­tifi­ca­tion.

• When you said to sup­pose that “ev­ery­thing we want is [im­pos­si­ble]”, did you mean that liter­ally? Be­cause nor­mally if what you want is im­pos­si­ble, you should start want­ing a differ­ent thing (or do that su­per-saiyan effort thing if it’s that kind of im­pos­si­ble), but if ev­ery­thing is im­pos­si­ble, you couldn’t do that ei­ther. If there is no pos­si­ble ac­tion that pro­duces a fa­vor­able out­come, I can think of no rea­son to act at all.

(Of course, if I found myself in that situation, I would assume I had made a math error or something, and go back to trying to do the things I want, on the assumption that I messed up when I decided they were impossible.)

If you didn’t mean -ev­ery­thing-, then why not just start pur­su­ing the thing which gives the most value which is pos­si­ble to do?

Per­haps I mi­s­un­der­stood the ques­tion?

• I didn’t mean it liter­ally. I meant, ev­ery­thing on which we base our long-term plans.

For ex­am­ple:

You go to school, save up money, try to get a good job, try to ad­vance in your ca­reer… on the be­lief that you will find the re­sults re­ward­ing. How­ever, this is pretty eas­ily dis­man­tled if you’re not a life-ex­ten­sion­ist and/​or cry­on­i­cist (and don’t be­lieve in an af­ter­life). All it takes is for you to have the re­al­iza­tion that

1) If your memory of an experience is erased thoroughly enough (and you don’t have access to anything external that will have been altered by the experience) then the experience might as well have not happened. Or, insofar as it altered you in some way other than through your memories, it is interchangeable with any other experience that would have altered you in the same way.

2) In the ab­sence of an af­ter­life, if you die all your mem­o­ries get per­ma­nently deleted shortly af­ter, and you have no fur­ther ac­cess to any­thing in­fluenced by your past ex­pe­riences in­clud­ing your­self. There­fore, death robs you of your past, pre­sent, and fu­ture mak­ing it as if you had never lived. Ob­vi­ously other peo­ple will re­mem­ber you for a while, but you will have no aware­ness of that be­cause you will sim­ply not ex­ist.

There­fore, no mat­ter what you do, it will get can­cel­led out com­pletely. The way around it is to make a su­per­hu­man effort at do­ing the not-liter­ally-pro­hibited-by-physics-as-far-as-we-know kind of im­pos­si­ble by work­ing to make cry­on­ics, anti-ag­ing, up­load­ing, or AI (which pre­sum­ably will then do one of the pre­ced­ing three for you) pos­si­ble. But per­haps at an even deeper level our idea of what it is these courses of ac­tion are at­tempt­ing to pre­serve is it­self self-con­tra­dic­tory.

Does that nec­es­sar­ily dis­credit these courses of ac­tion?

• If your mem­ory of an ex­pe­rience is erased [...] then the ex­pe­rience might as well have not hap­pened.

Why? If I have to choose be­tween “happy for an hour, then mem­ory-wiped” and “mis­er­able for an hour, then mem­ory-wiped” I un­hesi­tat­ingly choose the former. Why should the fact that I won’t re­mem­ber it mean that there’s no differ­ence at all be­tween the two? One of them in­volves some­one be­ing happy for an hour and the other some­one be­ing mis­er­able for an hour.

death robs you of your past, pre­sent, and fu­ture mak­ing it as if you had never lived.

How so? Ob­vi­ously my ex­pe­rience 100 years from now (i.e., no ex­pe­rience since I will most likely be very dead) will be the same as if I had never lived. But why on earth should what I care about now be de­ter­mined by what I will be ex­pe­rienc­ing in 100 years?

I don’t un­der­stand this ar­gu­ment when I hear it from re­li­gious apol­o­gists (“Without our god ev­ery­thing is mean­ingless, be­cause in­finitely many years from now you will no longer ex­ist! You need to de­rive all the mean­ing in your life from the whims of an alien su­per­be­ing!”) and I don’t un­der­stand it here ei­ther.

• If you know you will be mem­ory-wiped af­ter an hour, it does not make sense to make long-term plans. For ex­am­ple, you can read a book you en­joy, if you value the feel­ing. But if you read a sci­en­tific book, I think the plea­sure from learn­ing would be some­what spoiled by know­ing that you are go­ing to for­get this all soon. The learn­ing would mostly be­come a lost pur­pose, un­less you can use the learned knowl­edge within the hour.

Know­ing that you are un­likely to be al­ive af­ter 100 years pre­vents you from mak­ing some plans which would be mean­ingful in a par­allel uni­verse where you are likely to live 1000 years. Some of those plans are good ac­cord­ing to the val­ues you have now, but are out­side of your reach. Thus fu­ture death does not make life com­pletely mean­ingless, but it ru­ins some value even now.

• I do agree that there are things you might think you want that don’t re­ally make sense given that in a few hun­dred years you’re likely to be long dead and your in­fluence on the world is likely to be lost in the noise.

But that’s a long way from say­ing—as bokov seems to be—that this in­val­i­dates “ev­ery­thing on which we base our long-term plans”.

I wouldn’t spend the next hour read­ing a sci­en­tific book if I knew that at the end my brain would be re­set to its prior state. But I will hap­pily spend time read­ing a sci­en­tific book if, e.g., it will make my life more in­ter­est­ing for the next few years, or lead to higher in­come which I can use to re­tire ear­lier, buy nicer things, or give to char­ity, even if all those benefits take place only over (say) the next 20 years.

Per­haps I’m un­usual, or per­haps I’m fool­ing my­self, but it doesn’t seem to me as if my long-term plans, or any­one else’s, are pred­i­cated on liv­ing for ever or hav­ing in­fluence that lasts for hun­dreds of years.

• First of all, I’m re­ally glad we’re hav­ing this con­ver­sa­tion.

This ques­tion is the one philo­soph­i­cal is­sue that has been bug­ging me for sev­eral years. I read through your post and your com­ments and felt like some­one was fi­nally ask­ing this ques­tion in a way that has a chance of be­ing un­der­stood well enough to be re­solved!

… then I be­gan read­ing the replies, and it’s a strange thing, the in­fer­en­tial dis­tance is so great in some places that I also be­gin to lose the mean­ing of your origi­nal ques­tion, even though I have the very same ques­tion.

Tak­ing a step back—there is some­thing fun­da­men­tally ir­ra­tional about my per­sonal con­cept of iden­tity, ex­is­tence and mor­tal­ity.

I walk around with this subjective experience that I am so important, and my life is so important, and I want to live always. On the other hand, I know that my consciousness is not important objectively. There are two reasons for this. First, there is no objective morality: no ‘judger’ outside myself. This raises some issues for me, but since Less Wrong can address this to some extent, possibly more fully, let’s put this aside for the time being. Secondly, even by my own subjective standards, my own consciousness is not important. In the aspects that matter to me, my consciousness and identity are identical to those of another. My family and I could be replaced by another and I really don’t mind. (We could be replaced with sufficiently complex alien entities, and I don’t mind, or with computer simulations of entities I might not even recognize as persons, and I don’t mind, etc.)

So why does everything, in particular my longevity and my happiness, matter so much to me?

Some­times I try to ex­plain it in the fol­low­ing way: al­though “cere­brally” I should not care, I do ex­ist, as a biolog­i­cal or­ganism that is the product of evolu­tion, and so I do care. I want to feel com­fortable and happy, and that is a biolog­i­cal fact.

But I’m not really satisfied with this it’s-just-a-fact-that-I-care explanation. It seems that if I were more fully rational, I would (1) be able to assimilate in a more complete way that I am going to not exist sometime (I notice I continually act and feel as though my existence is forever, and this is tied in with continuing to invest in my values even though they insist they want to be tied to something that is objectively real) and (2) more consistently realize in a cerebral rather than biological way that my values and my happiness are not important to cerebral-me … and allow this to affect my behavior.

I’ve had this ques­tion for­ever, but I used to frame it as a the­ist. My ob­ser­va­tion as a child was that you worry about these things un­til you’re in an ex­is­ten­tial frenzy, and then you go down­stairs and eat a turkey sand­wich. There’s no re­s­olu­tion, so you just let biol­ogy take over.

But it seems there ought to be a re­s­olu­tion, or at the very least a moniker for the prob­lem that could be used to point to it when­ever you want to bring it up.

• Can you say more about why “it’s just a fact that I care” is not satis­fy­ing? Be­cause from my per­spec­tive that’s the proper re­s­olu­tion… we value what we value, we don’t value what we don’t value, what more is there to say?

• It is a fact that I care, we agree.

Per­haps the is­sue is that I be­lieve I should not care—that if I was more ra­tio­nal, I would not care.

That my values are based on a misunderstanding of reality, just as in the title of this post.

In par­tic­u­lar, my val­ues seem to be pinned on ideas that are not true—that states of the uni­verse mat­ter, ob­jec­tively rather than just sub­jec­tively, and that I ex­ist for­ever/​always.

This “pinning” doesn’t seem to be that critical: life goes on, and I eat a turkey sandwich when I get hungry. But it seems unfortunate that I should understand cerebrally (to the extent that I am capable) that my values are based on an illusion, but that my biology demands that I keep on as though my values were based on something real. To be very dramatic, it is like some concept of my ‘self’ is trapped in this nonsensical machine that keeps on eating and enjoying and caring like Sisyphus.

Put this way, it just sounds like a disconnect in the way our hardware and software evolved: my brain has evolved to think about how to satisfy certain goals supplied by biology, which often includes the meta-problem of prioritizing and evaluating these goals. The biology doesn’t care if the answer returned is ‘mu’ in the recursion, and furthermore doesn’t care if I’m at a step in this evolution where checking-out of the simulation-I’m-in seems just as reasonable an answer as any other course of action.

Fortunately, my organism just ignores those nihilistic opinions. (Perhaps this ignoring also evolved, socially or more fundamentally in the hardware.) I say fortunately, because I have other goals besides Tarski, or finding resolutions to these value conundrums.

• In par­tic­u­lar, my val­ues seem to be pinned on ideas that are not true—that states of the uni­verse mat­ter, ob­jec­tively rather than just sub­jec­tively, and that I ex­ist for­ever/​always.

Well, if they are, and if I un­der­stand what you mean by “pinned on,” then we should ex­pect the strength of those val­ues to weaken as you stop in­vest­ing in those ideas.

I can’t tell from your dis­cus­sion whether you don’t find this to be true (in which case I would ques­tion what makes you think the val­ues are pinned on the ideas in the first place), or whether you’re un­able to test be­cause you haven’t been able to stop in­vest­ing in those ideas in the first place.

If it’s the lat­ter, though… what have you tried, and what failure modes have you en­coun­tered?

• My val­ues seem to be pinned on these ideas (the ones that are not true) be­cause while I am in the pro­cess of car­ing about the things I care about, and es­pe­cially when I am mak­ing a choice about some­thing, I find that I am always mak­ing the as­sump­tion that these ideas are true—that the states of the uni­verse mat­ter and that I ex­ist for­ever.

When it oc­curs to me to re­mem­ber that these as­sump­tions are not true, I feel a great deal of cog­ni­tive dis­so­nance. How­ever, the cog­ni­tive dis­so­nance has no re­s­olu­tion. I think about it for a lit­tle while, go about my busi­ness, and dis­cover some time later I for­got again.

I don’t know if a specific example will help or not. I am driving home, in traffic, and my brain is happily buzzing with thoughts. I am thinking about all the people in cars around me and how I’m part of a huge social network and whether the traffic is as efficient as it could be and civilization and how I am going to go home and what I am going to do. And then I remember about death, the snuffing out of my awareness, and something about that just doesn’t connect. It’s like I can empathize with my own non-existence (hopefully this example is something more than just a moment of psychological disorder) and I feel that my current existence is a mirage. Or rather, the moral weight that I’ve given it doesn’t make sense. That’s what the cognitive dissonance feels like.

• I want to add that I don’t be­lieve I am that un­usual. I think this need for an ob­jec­tive moral­ity (ob­jec­tive value sys­tem) is why some peo­ple are nat­u­rally the­ists.

I also think that people who think wire-heading is a failure mode must be in the same boat that I’m in.

• we value what we value, we don’t value what we don’t value, what more is there to say?

I’m confused about what you mean by this. If there weren’t anything more to say, then nobody would/should ever change what they value? But people’s values change over time, and that’s a good thing. For example, in medieval/ancient times people didn’t value animals’ lives and well-being (as much) as we do today. If a medieval person tells you “well we value what we value, I don’t value animals, what more is there to say?”, would you agree with him and let him go on burning cats for entertainment, or would you try to convince him that he should actually care about animals’ well-being?

You are of course us­ing some of your val­ues to in­struct other val­ues. But they need to be at least con­sis­tent and it’s not re­ally clear which are the “more-ter­mi­nal” ones. It seems to me byrnema is say­ing that priv­ileg­ing your own con­scious­ness/​iden­tity above oth­ers is just not war­ranted, and if we could, we re­ally should self-mod­ify to not care more about one par­tic­u­lar in­stance, but rather about how much well-be­ing/​eu­daimo­nia (for ex­am­ple) there is in the world in gen­eral. It seems like this change would make your value sys­tem more con­sis­tent and less ar­bi­trary and I’m sym­pa­thetic to this view.

• But people’s values change over time, and that’s a good thing. For example, in medieval/ancient times people didn’t value animals’ lives and well-being (as much) as we do today. If a medieval person tells you “well we value what we value, I don’t value animals, what more is there to say?”, would you agree with him and let him go on burning cats for entertainment, or would you try to convince him that he should actually care about animals’ well-being?

Is that an ac­tual change in val­ues? Or is it merely a change of facts—much greater availa­bil­ity of en­ter­tain­ment, much less death and cru­elty in the world, and the knowl­edge that hu­mans and an­i­mals are much more similar than it would have seemed to the me­dieval wor­ld­view?

• The more I think about this ques­tion, the less cer­tain I am that I know what an an­swer to it might even look like.
What kinds of ob­ser­va­tions might be ev­i­dence one way or the other?

• Do peo­ple who’ve changed their mind con­sider them­selves to have differ­ent val­ues from their past selves? Do we find that when some­one has changed their mind, we can ex­plain the rele­vant val­ues in terms of some “more fun­da­men­tal” value that’s just be­ing ap­plied to differ­ent ob­ser­va­tions (or differ­ent rea­son­ing), or not? Can we imag­ine a sce­nario where an en­tity with truly differ­ent val­ues—the good ol’ pa­per­clip max­i­mizer—is per­suaded to change them?

I guess that’s my real point: I wouldn’t even dream of trying to persuade a paperclip maximizer to start valuing human life (except insofar as live humans encourage the production of paperclips). It values what it values, it doesn’t value what it doesn’t value, what more is there to say? To the extent that I would hope to persuade a medieval person to act more kindly towards animals, it would be because of, and in terms of, the values that they already have, which would likely be mostly shared with mine.

• So, if I start out treat­ing an­i­mals badly, and then later start treat­ing them kindly, that would be ev­i­dence of a pre-ex­ist­ing valu­ing of an­i­mals which was sim­ply be­ing masked by cir­cum­stances. Yes?

If I in­stead start out act­ing kindly to an­i­mals, and then later start treat­ing them badly, is that similarly ev­i­dence of a pre-ex­ist­ing lack of valu­ing-an­i­mals which had pre­vi­ously been masked by cir­cum­stances? Or does it in­di­cate that my ex­ist­ing, pre­vi­ously man­i­fested, valu­ing of an­i­mals is now be­ing masked by cir­cum­stances?

• So, if I start out treat­ing an­i­mals badly, and then later start treat­ing them kindly, that would be ev­i­dence of a pre-ex­ist­ing valu­ing of an­i­mals which was sim­ply be­ing masked by cir­cum­stances. Yes?

Either that, or that your pre­sent kind-treat­ing of an­i­mals is just a man­i­fes­ta­tion of cir­cum­stances, not a true value.

If I in­stead start out act­ing kindly to an­i­mals, and then later start treat­ing them badly, is that similarly ev­i­dence of a pre-ex­ist­ing lack of valu­ing-an­i­mals which had pre­vi­ously been masked by cir­cum­stances? Or does it in­di­cate that my ex­ist­ing, pre­vi­ously man­i­fested, valu­ing of an­i­mals is now be­ing masked by cir­cum­stances?

Could be ei­ther. To figure it out, we’d have to ex­am­ine those sur­round­ing cir­cum­stances and see what un­der­ly­ing val­ues seemed con­sis­tent with your ac­tions. Or we could as­sume that your val­ues would likely be similar to those of other hu­mans—so you prob­a­bly value the welfare of en­tities that seem similar to your­self, or po­ten­tial mates or offspring, and so value an­i­mals in pro­por­tion to how similar they seem un­der the cir­cum­stances and available in­for­ma­tion.

• (nods) Fair enough. Thanks for the clar­ifi­ca­tion.

• Well, whether it’s a “real” change may be beside the point if you put it this way. Our situation and our knowledge are also changing, and maybe our behavior should also change. If personal identity and/or consciousness are not fundamental, how should we value those in a world where any mind-configurations can be created and copied at will?

• So there’s a view that a ra­tio­nal en­tity should never change its val­ues. If we ac­cept that, then any en­tity with differ­ent val­ues from pre­sent-me seems to be in some sense not a “nat­u­ral suc­ces­sor” of pre­sent-me, even if it re­mem­bers be­ing me and shares all my val­ues. There seems to be a qual­i­ta­tive dis­tinc­tion be­tween an en­tity like that and up­load-me, even if there are sev­eral branch­ing up­load-mes that have un­der­gone var­i­ous ex­pe­riences and would no doubt have differ­ent views on con­crete is­sues than pre­sent-me.

But that’s just an in­tu­ition, and I don’t know whether it can be made rigor­ous.

• Fair enough.

Agreed that if some­one ex­presses (ei­ther through speech or ac­tion) val­ues that are op­posed to mine, I might try to get them to ac­cept my val­ues and re­ject their own. And, sure, hav­ing set out to do that, there’s a lot more to be rele­vantly said about the me­chan­ics of how we hold val­ues, and how we give them up, and how they can be al­tered.

And you’re right, if our values are inconsistent (which they often are), we can be in this kind of relationship with ourselves… that is, if I can factor my values along two opposed vectors A and B, I might well try to get myself to accept A and reject B (or vice-versa, or both at once). Of course, we’re not obligated to do this by any means, but internal consistency is a common thing that people value, so it’s not surprising that we want to do it. So, sure… if what’s going on here is that byrnema has inconsistent values which can be factored along a “privilege my own identity”/“don’t privilege my own identity” axis, and they net-value consistency, then it makes sense for them to attempt to self-modify so that one of those vectors is suppressed.

With re­spect to my state­ment be­ing con­fus­ing… I think you un­der­stood it perfectly, you were just dis­agree­ing—and, as I say, you might well be cor­rect about byrnema. Speak­ing per­son­ally, I seem to value breadth of per­spec­tive and flex­i­bil­ity of view­point sig­nifi­cantly more than in­ter­nal con­sis­tency. “Do I con­tra­dict my­self? Very well, then I con­tra­dict my­self, I am large, I con­tain mul­ti­tudes.”

Of course, I do cer­tainly have both val­ues, and (un­sur­pris­ingly) the parts of my mind that al­ign with the lat­ter value seem to be­lieve that I ought to be more con­sis­tent about this, while the parts of my mind that al­ign with the former don’t seem to have a prob­lem with it.

I find I pre­fer be­ing the parts of my mind that al­ign with the former; we get along bet­ter.

• to value breadth of per­spec­tive and flex­i­bil­ity of view­point sig­nifi­cantly more than in­ter­nal consistency

As humans we can’t change/modify ourselves too much anyway, but what if we’re able to in the future? What if you can pick and choose your values? It seems to me that, for such an entity, not valuing consistency is like not valuing logic. And then there’s the argument that it leaves you open to Dutch booking and blackmail.

• Yes, inconsistency leaves me open to Dutch booking, which perfect consistency would not. Eliminating that susceptibility is not high on my list of self-improvements to work on, but I agree that it’s a failing.

Also, per­ceived in­con­sis­tency runs the risk of mak­ing me seen as un­re­li­able, which has so­cial costs. That said, be­ing seen as re­li­able ap­pears to be a fairly vi­able Schel­ling point among my var­i­ous per­spec­tives (as you say, the range is pretty small, globally speak­ing), so it’s not too much of a prob­lem.

In a hy­po­thet­i­cal fu­ture where the tech­nol­ogy ex­ists to rad­i­cally al­ter my val­ues rel­a­tively eas­ily, I prob­a­bly would not care nearly so much about flex­i­bil­ity of view­point as an in­trin­sic skill, much in the same way that elec­tronic calcu­la­tors made the abil­ity to do log­a­r­ithms in my head rel­a­tively val­ue­less.
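The Dutch-book susceptibility discussed in this exchange can be illustrated with a toy money pump. This is only a sketch under assumptions that are not in the thread: the three items, the cyclic preference relation, the fee, and the function name `run_money_pump` are all hypothetical, chosen to show how an agent with inconsistent (cyclic) preferences can be charged repeatedly and end up where it started.

```python
from fractions import Fraction

# Hypothetical cyclic preferences, stored as (held, offered) pairs in which
# the offered item is preferred: B over A, C over B, A over C.
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

# What the (hypothetical) bookie offers in exchange for each held item.
OFFERS = {"A": "B", "B": "C", "C": "A"}


def run_money_pump(start_item, fee, rounds):
    """Offer the agent a preferred swap each round, charging a fee per swap.

    Returns the item held at the end and the total fees paid.
    """
    held, paid = start_item, Fraction(0)
    for _ in range(rounds):
        offered = OFFERS[held]
        if (held, offered) in PREFERS:  # the agent accepts any preferred swap
            held = offered
            paid += fee
    return held, paid


final_item, total_paid = run_money_pump("A", Fraction(1, 100), rounds=3)
print(final_item, total_paid)
```

After one full cycle the agent holds the item it began with but has paid three fees. An agent whose preferences form a consistent ordering would refuse at least one of the swaps, which breaks the cycle.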

• My po­si­tion would be that ac­tions speak louder than thoughts. If you act as though you value your own hap­piness more than that of oth­ers… maybe you re­ally do value your own hap­piness more than that of oth­ers? If you like do­ing cer­tain things, maybe you value those things—I don’t see any­thing ir­ra­tional in that.

(It’s perfectly nor­mal to self-de­ceive to be­lieve our val­ues are more self­less than they ac­tu­ally are. I wouldn’t feel guilty about it—similarly, if your ac­tions are good it doesn’t re­ally mat­ter whether you’re do­ing them for the sake of other peo­ple or for your own satis­fac­tion)

The other re­s­olu­tion I can see would be to ac­cept that you re­ally are a set of not-en­tirely-al­igned en­tities, a pat­tern run­ning on un­trusted hard­ware. At which point parts of you can try and change other parts of you. That seems rather per­ilous though. FWIW I ac­cept the meat and its some­times-con­tra­dic­tory de­sires as part of me; it feels mean­ingless to draw lines in­side my own brain.

• The other re­s­olu­tion I can see would be to ac­cept that you re­ally are a set of not-en­tirely-al­igned en­tities, a pat­tern run­ning on un­trusted hard­ware.

Yes, this is where I’m at.

• Does that nec­es­sar­ily dis­credit these courses of ac­tion?

Yes, un­der the as­sump­tion that you only value things that fu­ture-you will feel the effects of. If this is true, then all courses of ac­tion are equally ra­tio­nal and it doesn’t mat­ter what you do—you’re at null.

If you are a being that values at least one thing that you will not directly experience, then the answer is no, these actions can still have worth. Most humans are like this, even if they don’t realize it.

The way around it is to make a su­per­hu­man effort at do­ing the not-liter­ally-pro­hibited-by-physics-as-far-as-we-know kind of im­pos­si­ble by work­ing to make cry­on­ics, anti-ag­ing, up­load­ing, or AI (which pre­sum­ably will then do one of the pre­ced­ing three for you) pos­si­ble.

Well...you’ll still die even­tu­ally.

• Another way to think about Dave’s situ­a­tion is that his util­ity func­tion as­signs the same value to all pos­si­ble fu­tures (i.e. zero) be­cause the one fu­ture that would’ve been as­signed a non-zero value turned out to be un­re­al­iz­able. His real prob­lem is that his util­ity func­tion has very lit­tle struc­ture: it is zero al­most ev­ery­where.

I sus­pect our/​my/​your util­ity func­tion is struc­tured in a way that even if broad swaths of pos­si­ble fu­tures turn out to be un­re­al­iz­able, the re­main­der will still con­tain gra­di­ents and lo­cal max­ima, so there will be some more de­sir­able and some less de­sir­able pos­si­bil­ities.

Of course this is not guaran­teed, but most util­ity func­tions have gra­di­ents and lo­cal max­ima over most sets. You need a very spe­cial util­ity func­tion and a very spe­cial set of re­al­iz­able fu­tures in or­der for all fu­tures to be as­signed ex­actly the same value.
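The contrast between a utility function that is "zero almost everywhere" and one with more structure can be sketched in a few lines. Everything here is an illustrative assumption rather than anything from the comment: the ten labelled futures, the two example utility functions `spike` and `graded`, and the choice of which futures are unrealizable.

```python
def is_indifferent(utility, realizable):
    """Return True if every realizable future gets exactly the same utility."""
    return len({utility(f) for f in realizable}) <= 1


def spike(future):
    """Values exactly one future (hypothetically, the dinosaur ride) and nothing else."""
    return 1 if future == 7 else 0


def graded(future):
    """Prefers futures closer to future 4, so it has a gradient everywhere."""
    return -(future - 4) ** 2


futures = range(10)  # hypothetical labels for possible futures
# Suppose the favorite future of each function turns out to be impossible.
realizable = [f for f in futures if f not in (4, 7)]

spike_flat = is_indifferent(spike, realizable)
graded_flat = is_indifferent(graded, realizable)
best_remaining = max(realizable, key=graded)
print(spike_flat, graded_flat, best_remaining)
```

With the favorite removed, `spike` ranks all remaining futures equally, while `graded` still singles out a best remaining option next to the one that was lost.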

• 26 Sep 2013 1:27 UTC

What is the ra­tio­nal course of ac­tion in such a situ­a­tion?

Being able to cast off self-contradictions (A is equal to negation-of-A) is as close as I can offer to a knowable value that won’t dissolve. But I may be wrong, depending on what you mean by sufficient deconstruction. If the deconstruction is sufficient, it is sufficient, and therefore sufficient, and you’ve answered your own question: we cannot know. Which leads to the self-contradiction that we know one thing, namely that we cannot know anything, including that we cannot know anything.

Self-con­tra­dic­tions (and in­finite re­gres­sions) sug­gest a prob­lem with ex­pla­na­tions, not what is ex­plained. The ra­tio­nal course of ac­tion is to not con­fuse the world with a map of the world (in this case, ra­tio­nal­ity) even if the world does con­tain that map. The map is en­tirely in­side the world, the world is not en­tirely in­side the map.

More speci­fi­cally, I spent a lit­tle time to­day an­swer­ing this ques­tion and sev­eral hours in­ter­pret­ing a class­room for a deaf stu­dent. If you re­verse those pri­ori­ties, you get caught up in maps and don’t do much to en­joy or im­prove this nice ter­ri­tory.

• This comment by Wei Dai might be relevant; also see steven0461’s answer.

• No matter what the universe is, all you need for causal decision theory is that you live in a universe in which your actions have consequences, and you prefer some of the possible consequences over others. (You can adjust and alter this sentence for your preferred decision theory.)

What if that doesn’t hap­pen? What if you didn’t pre­fer any con­se­quence over any other, and you were quite cer­tain no ac­tion you took would make any differ­ence to any­thing that mat­tered?

Well, it’s not a trick ques­tion … you’ll just act in any ar­bi­trary way. It won’t mat­ter. All ac­tions would be equally ra­tio­nal.

The origi­nal Bob would not want to be this Bob.

That im­plies that the origi­nal Bob val­ues ig­no­rance/​work­ing to­wards goals/​etc in ad­di­tion to rid­ing on dinosaurs in the past. Bob only needs one co­her­ent value to have a rea­son to take cer­tain ac­tions.

The statement “The original Bob would not want to be this Bob.” doesn’t follow from the premise “Bob’s terminal value is not only utterly impossible but meaningless”. If that were really the only thing Bob valued, then the original Bob would be utterly neutral about becoming the enlightened Bob, since what happens to Bob and what Bob learns doesn’t matter. All possible futures are equally preferred, since none of them brings Bob any closer to travelling to the past and riding a dinosaur.

• My ter­mi­nal value is my own hap­piness. I know that it ex­ists be­cause I have ex­pe­rienced it, and ex­pe­rience it reg­u­larly. I can’t imag­ine a world in which some­one con­vinces me that I don’t ex­pe­rience some­thing that I ex­pe­rience.

• “Hap­piness” as a con­cept sounds sim­ple in the same way “a witch did it” sounds sim­ple as an ex­pla­na­tion. Most peo­ple con­sider wire­head­ing to be a failure state, and defin­ing “hap­piness” so as to avoid wire­head­ing is not sim­ple.

• Hap­piness as a feel­ing is sim­ple, though it may be caused by com­plex things. If wire­head­ing would make me happy—that is, give me the best pos­si­ble en­joy­able feel­ing in the world—I’d wire­head. I don’t con­sider that a failure state.

• I live my life under the assumption that I do have achievable values. If I had no values that I could achieve and I were truly indifferent between all possible outcomes, then my decisions would not matter. I can ignore any such possible worlds in my decision theory.
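One way to make this argument explicit, as a sketch under assumptions not stated in the comment: suppose there are two possible worlds, $W_0$ with probability $p < 1$, in which every action yields the same utility $c$, and $W_1$, in which action $a$ yields utility $U_1(a)$. All of these symbols are introduced purely for illustration.

$$
\mathbb{E}[U(a)] = p\,c + (1-p)\,U_1(a)
\quad\Longrightarrow\quad
\mathbb{E}[U(a)] - \mathbb{E}[U(b)] = (1-p)\,\bigl(U_1(a) - U_1(b)\bigr).
$$

The constant $c$ cancels, so for any $p < 1$ the ranking of actions is the same as the ranking within $W_1$ alone. That is the sense in which the indifferent world can be dropped from the decision without changing the choice.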

• So, to clar­ify:

We don’t know what a perfectly ra­tio­nal agent would do if con­fronted with all goals be­ing epistem­i­cally ir­ra­tional, but there is no in­stru­men­tal value in an­swer­ing this ques­tion be­cause if we found our­selves in such a situ­a­tion we wouldn’t care.

Is that a fair summary? I don’t yet know if I agree or disagree; right now I’m just making sure I understand your position.

• I be­lieve that is a fair sum­mary of my be­liefs.

Side note: Be­fore I was con­vinced by EY’s stance on com­pat­i­bil­ism of free will, I be­lieved in free will for a similar rea­son.

• Luke­prog’s metaethics posts went over this—so how about get­ting the right an­swer for him? :)

• Ter­mi­nal val­ues are part of the map, not the ter­ri­tory.

• What does this mean? Terminal values are techniques by which we predict future phenomena? Doesn’t sound like we’re talking about values anymore, but my only understanding of what it would mean for something to be part of the map is that it would be part of how we model the world, i.e. how we predict future occurrences.

• What does this mean?

The agents that we de­scribe in philo­soph­i­cal or math­e­mat­i­cal prob­lems have ter­mi­nal val­ues. But what con­fi­dence have we that these prob­lems map ac­cu­rately onto the messy real world? To what ex­tent do the­o­ries that use the “ter­mi­nal val­ues” con­cept ac­cu­rately pre­dict events in the real world? Do peo­ple — or cor­po­ra­tions, na­tions, sub-agents, memes, etc. — be­have as if they had ter­mi­nal val­ues?

I think the an­swer is “some­times” at best.

Some­times hu­mans can be money-pumped or Dutch-booked. Some­times not. Some­times hu­mans can end up in situ­a­tions that look like wire­head­ing, such as heroin ad­dic­tion or ec­static re­li­gion … but some­times they can es­cape them, too. Some­times hu­mans are self­ish, some­times spendthrift, some­times al­tru­is­tic, some­times ap­a­thetic, some­times self-de­struc­tive. Some hu­mans in­sist that they know what hu­mans’ ter­mi­nal val­ues are (go to heaven! have lots of rich, smart ba­bies! spread your memes!) but other hu­mans deny hav­ing any such val­ues.

Hu­mans are (fa­mously) not fit­ness-max­i­miz­ers. I sug­gest that we are not nec­es­sar­ily any­thing-max­i­miz­ers. We are an ar­ti­fact of an in-progress amoral op­ti­miza­tion pro­cess (biolog­i­cal evolu­tion) and pos­si­bly oth­ers (memetic evolu­tion; evolu­tion of so­cioe­co­nomic en­tities); but we may very well not be op­ti­miz­ers our­selves at all.

• They’re the­o­ries by which we pre­dict fu­ture men­tal states (such as satis­fac­tion) - our own or those of oth­ers.

• Heh. It’s even worse than that. The idea that Bob is a sin­gle agent with ter­mi­nal val­ues is likely wrong. There are sev­eral agents com­pris­ing Bob and their ter­mi­nal val­ues change con­stantly, de­pend­ing on the weather.

• An agent op­ti­mized to hu­man­ity’s CEV would in­stantly rec­og­nize that try­ing to skip ahead would be in­cred­ibly harm­ful to our pre­sent psy­chol­ogy; with­out dreams—how­ever ir­ra­tional—we don’t tend to de­velop well in terms of CEV. If all of our val­ues break down over time, a su­per­in­tel­li­gent agent op­ti­mized for our CEV will plan for the day our dreams are bro­ken, and may be able to give us a helping hand and a pat on the back to let us know that there are still rea­sons to live.

This sounds like the same manner of fallacy as the one associated with determinism and the ignorance of the future being derived from the past through the present rather than by a timeless external “Determinator.”

• I think you’re vastly un­der­es­ti­mat­ing the mag­ni­tude of that “helping hand.”

By way of anal­ogy… a su­per­in­tel­li­gent agent op­ti­mized for (or, more to the point, op­ti­miz­ing for) so­lar sys­tem coloniza­tion might well con­clude that es­tab­lish­ing hu­man colonies on Mars is in­cred­ibly harm­ful to our pre­sent phys­iol­ogy, since with­out oxy­gen we don’t tend to de­velop well in terms of breath­ing. It might then de­velop tech­niques to al­ter our lungs, or al­ter the en­vi­ron­ment of Mars in such a way that our lungs can func­tion bet­ter there (e.g., oxy­genate it).

An agent op­ti­miz­ing for some­thing that re­lates to our psy­chol­ogy, rather than our phys­iol­ogy, might similarly de­velop tech­niques to al­ter our minds, or al­ter our en­vi­ron­ment in such a way that our minds can func­tion bet­ter.

• I think you’re vastly un­der­es­ti­mat­ing the mag­ni­tude of my un­der­stand­ing.

In the context of something so shocking as having our naive childhood dreams broken, is there some superintelligent solution that’s supposed to be more advanced than consoling you in your moment of grief? To be completely honest, I wouldn’t expect a humanity CEV agent to even bother trying to console us; we can do that for each other and it knows this well in advance; it’s got bigger problems to worry about.

Do you mean to sug­gest that a su­per­in­tel­li­gent agent wouldn’t be able to fore­see or provide solu­tions to some prob­lem that we are ca­pa­ble of dream­ing up to­day?

You’ll have to forgive me, but I’m not seeing what it is about my comment that gives you reason to think I’m misunderstanding anything here. Do you expect an agent optimized to humanity’s CEV is going to use inoptimal strategies for some reason? Will it give a helping interstellar spaceship when really all it needed to do to effectively solve whatever spaceflight-unrelated microproblem in our psychology that exists at the moment before it’s solved the problem was a simple pat on the back?

CEV, I’m feel­ing down...

Have a space­ship! And a dinosaur!

• is there some superintelligent solution that’s supposed to be more advanced than consoling you in your moment of grief?

Yes.

Do you mean to sug­gest that a su­per­in­tel­li­gent agent wouldn’t be able to fore­see or provide solu­tions to some prob­lem that we are ca­pa­ble of dream­ing up to­day?

No.

Do you ex­pect an agent op­ti­mized to hu­man­ity’s CEV is go­ing to use in­op­ti­mal strate­gies for some rea­son?

No.

Will it give a helping in­ter­stel­lar space­ship when re­ally all it needed to do to effec­tively solve what­ever spaceflight-un­re­lated micro­prob­lem in our psy­chol­ogy that ex­ists at the mo­ment be­fore it’s solved the prob­lem was a sim­ple pat on the back?

No.

• Fair enough.