Be Wary of Thinking Like a FAI

I recently realized that, encouraged by LessWrong, I had been using a heuristic in my philosophical reasoning that I now think is suspect. I'm not accusing anybody else of falling into the same trap; I'm just recounting my own situation for the benefit of all.

I am actually not 100% sure that the heuristic is wrong. I hope that this discussion of it generalizes into a conversation about intuition and the relationship between FAI epistemology and our own epistemology.

The heuristic is this: If the ideal FAI would think a certain way, then I should think that way as well. At least in epistemic matters, I should strive to be like an ideal FAI.

Examples of the heuristic in use are:

--The ideal FAI wouldn't care about its personal identity over time; it would have no problem copying itself and deleting the original as the need arose. So I should (a) not care about personal identity over time, even if it exists, and (b) stop believing that it exists.

--The ideal FAI wouldn't care about its personal identity at a given time either; if it were proven that 99% of all observers with its total information set were in fact Boltzmann Brains, it would continue to act as if it were not a Boltzmann Brain, since that's what maximizes utility. So I should (a) act as if I'm not a BB even if I am one, and (b) stop thinking it is even a meaningful possibility.

--The ideal FAI would think that the specific architecture it is implemented on (brains, computers, nanomachines, giant look-up tables) is irrelevant except for practical reasons like resource efficiency. So, following its example, I should stop worrying about whether e.g. a simulated brain would be conscious.

--The ideal FAI would think that it was NOT a "unified subject of experience" or an "irreducible substance," or that it was experiencing "ineffable, irreducible qualia," because believing in those things would only distract it from understanding and improving its inner workings. Therefore, I should think that I, too, am nothing but a physical mechanism and/or an algorithm implemented somewhere but capable of being implemented elsewhere.

--The ideal FAI would use UDT/TDT/etc. Therefore I should too.

--The ideal FAI would ignore uncomputable possibilities. Therefore I should too.
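The Boltzmann-Brain bullet above rests on a simple expected-utility argument: a BB dissipates before its actions have any consequences, so only the non-BB worlds contribute to the payoff of any policy, no matter how improbable those worlds are. Here is a toy sketch of that calculation; the 0.99 probability and the payoff numbers are illustrative assumptions, not figures from the post.

```python
# Toy expected-utility calculation for the Boltzmann Brain (BB) bullet.
# Assumption: in the BB world every policy pays the same (zero), because
# the brain evaporates before acting; the payoffs below are made up.

p_bb = 0.99  # assumed probability of being a Boltzmann Brain

# Utility of each policy, conditional on which world the agent is in.
utility = {
    "act_as_if_real": {"bb_world": 0.0, "real_world": 100.0},
    "give_up":        {"bb_world": 0.0, "real_world": 10.0},
}

def expected_utility(policy):
    u = utility[policy]
    return p_bb * u["bb_world"] + (1 - p_bb) * u["real_world"]

best = max(utility, key=expected_utility)
print(best)  # the 1% of non-BB worlds dominates the decision
```

Because the BB-world column is identical across policies, the ranking of policies is decided entirely by the real-world column, which is why the agent acts "as if it were not a Boltzmann Brain" at any p_bb below 1.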

Arguably, most if not all of the conclusions I drew above are actually correct. However, I think the heuristic is questionable, for the following reasons:

(1) Sometimes what we think of as the ideal FAI isn't actually ideal. Case in point: the final bullet above, about uncomputable possibilities. We intuitively think that uncomputable possibilities ought to be countenanced, so rather than overriding our intuition when presented with an attractive theory of the ideal FAI (in this case AIXI), perhaps we should keep looking for an ideal that better matches our intuitions.

(2) The FAI is a tool for serving our wishes; if we start to think of ourselves as being fundamentally the same sort of thing as the FAI, our values may end up drifting badly. For simplicity, let's suppose the FAI is designed to maximize happy human life-years. The problem is, we don't know how to define a human. Do simulated brains count? What about patterns found inside rocks? What about souls, if they exist? Suppose we have the intuition that humans are indivisible entities that persist across time. If we reason using the heuristic I am talking about, we would decide that, since the FAI doesn't think it is an indivisible entity that persists across time, we shouldn't think we are either. So we would then proceed to tell the FAI "Humans are naught but a certain kind of functional structure," and (if our overruled intuition was correct) all get killed.

Note 1: "Intuitions" can (I suspect) be thought of as another word for "priors."

Note 2: We humans are NOT Solomonoff-induction-approximators, as far as I can tell. This bodes ill for FAI, I think.