Be Wary of Thinking Like a FAI

I recently realized that, encouraged by LessWrong, I had been using a heuristic in my philosophical reasoning that I now think is suspect. I’m not accusing anybody else of falling into the same trap; I’m just recounting my own situation for the benefit of all.

I am actually not 100% sure that the heuristic is wrong. I hope that this discussion of it generalizes into a conversation about intuition and the relationship between FAI epistemology and our own.

The heuristic is this: If the ideal FAI would think a certain way, then I should think that way as well. At least in epistemic matters, I should strive to be like an ideal FAI.

Examples of the heuristic in use are:

--The ideal FAI wouldn’t care about its personal identity over time; it would have no problem copying itself and deleting the original as the need arose. So I should (a) not care about personal identity over time, even if it exists, and (b) stop believing that it exists.

--The ideal FAI wouldn’t care about its personal identity at a given time either; if it were proven that 99% of all observers with its total information set were in fact Boltzmann Brains, then it would continue to act as if it were not a Boltzmann Brain, since that’s what maximizes utility. So I should (a) act as if I’m not a BB even if I am one, and (b) stop thinking it is even a meaningful possibility. (A sketch of the expected-utility argument appears after this list.)

--The ideal FAI would think that the specific architecture it is implemented on (brains, computers, nanomachines, giant look-up tables) is irrelevant except for practical reasons like resource efficiency. So, following its example, I should stop worrying about whether e.g. a simulated brain would be conscious.

--The ideal FAI would NOT think that it was a “unified subject of experience” or an “irreducible substance,” or that it was experiencing “ineffable, irreducible qualia,” because believing in those things would only distract it from understanding and improving its inner workings. Therefore, I should think that I, too, am nothing but a physical mechanism and/or an algorithm implemented somewhere but capable of being implemented elsewhere.

--The ideal FAI would use UDT/TDT/etc. Therefore I should too.

--The ideal FAI would ignore uncomputable possibilities. Therefore I should too.

...
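An aside on the Boltzmann Brain bullet: here is a minimal expected-utility sketch of why the ideal FAI would act as if it were not a BB. The probability $p$ and the utility terms are illustrative placeholders I am introducing here, not part of the argument itself.

$$EU(a) \;=\; p \cdot u_{\mathrm{BB}} \;+\; (1-p) \cdot u_{\mathrm{world}}(a)$$

If being a Boltzmann Brain means your actions have no lasting consequences, then $u_{\mathrm{BB}}$ is effectively a constant that does not depend on the action $a$, so $\arg\max_a EU(a) = \arg\max_a u_{\mathrm{world}}(a)$ even when $p = 0.99$: the utility-maximizing policy is whatever would be best conditional on not being a BB.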

Arguably, most if not all of the conclusions I drew above are actually correct. However, I think the heuristic is questionable, for the following reasons:

(1) Sometimes what we think of as the ideal FAI isn’t actually ideal. Case in point: the final bullet above about uncomputable possibilities. We intuitively think that uncomputable possibilities ought to be countenanced, so rather than overriding our intuition when presented with an attractive theory of the ideal FAI (in this case AIXI), perhaps we should keep looking for an ideal that better matches our intuitions.
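For concreteness (this is just the standard textbook formulation, roughly stated): AIXI weighs hypotheses using a Solomonoff-style prior over programs for a universal Turing machine $U$,

$$M(x) \;=\; \sum_{p \,:\, U(p) \text{ begins with } x} 2^{-\ell(p)},$$

where $\ell(p)$ is the length of program $p$. Any environment that no program can generate receives zero weight in this mixture, which is the formal sense in which AIXI “ignores” uncomputable possibilities.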

(2) The FAI is a tool for serving our wishes; if we start to think of ourselves as being fundamentally the same sort of thing as the FAI, our values may end up drifting badly. For simplicity, let’s suppose the FAI is designed to maximize happy human life-years. The problem is, we don’t know how to define a human. Do simulated brains count? What about patterns found inside rocks? What about souls, if they exist? Suppose we have the intuition that humans are indivisible entities that persist across time. If we reason using the heuristic I am talking about, we would decide that, since the FAI doesn’t think it is an indivisible entity that persists across time, we shouldn’t think we are either. So we would then proceed to tell the FAI “Humans are naught but a certain kind of functional structure,” and (if our overruled intuition was correct) all get killed.

Thoughts?

...

Note 1: “Intuitions” can (I suspect) be thought of as another word for “Priors.”

Note 2: We humans are NOT Solomonoff-induction-approximators, as far as I can tell. This bodes ill for FAI, I think.
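For reference, and again just the textbook formulation: a Solomonoff-induction-approximator would be something that predicts its next observation approximately according to the prior $M$ sketched above,

$$M(x_{n+1} \mid x_{1:n}) \;=\; \frac{M(x_{1:n}\, x_{n+1})}{M(x_{1:n})},$$

which cannot be evaluated exactly by any algorithm, since it implicitly sums over all programs.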