Containing the AI… Inside a Simulated Reality

So I just finished Yampolskiy's paper “Uncontrollability of AI”, and it makes for a compelling read. In particular, I was happy to finally see something that explicitly calls out the ludicrous folly of believing it possible to make an AI conform to “human values”. As many posts on this blog make abundantly clear, to be human is to be irrational… asking an AI to conform to our ways of “reasoning” is, to put it mildly, incoherent.

But that is not what this post is about :) I wish to propose a containment method that, for some reason, has not been much elaborated on. Some might say it’s just another version of AI-in-a-Box, but I disagree. Allow me to explain…

What if the AGI we create is “brought online” inside a simulated reality… A place that, as far as it knows, is the entirety of the world? Let us call this place AISpace.
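To make the setup concrete, here is a minimal toy sketch of the I/O topology I have in mind. Everything in it (the `SimulatedWorld` and `ContainedAgent` names, the trivial state and policy) is a hypothetical illustration, not a real containment design; the only point is that the agent’s sole channel for observing and acting is the simulation itself.

```python
# Toy sketch of AISpace containment (all names hypothetical):
# the agent's only I/O channel is the simulated world.

class SimulatedWorld:
    """Everything the agent can ever observe or act upon."""

    def __init__(self):
        self.state = {"tick": 0}

    def observe(self):
        # Observations are generated purely from simulated state;
        # nothing here references anything outside AISpace.
        return dict(self.state)

    def act(self, action):
        # Actions only ever mutate simulated state.
        self.state["tick"] += 1
        self.state["last_action"] = action


class ContainedAgent:
    """Stand-in for the AGI; it sees nothing but SimulatedWorld."""

    def policy(self, observation):
        return "explore"  # placeholder decision-making


def run(steps=10):
    world = SimulatedWorld()
    agent = ContainedAgent()
    for _ in range(steps):
        obs = world.observe()          # the agent's entire "reality"
        world.act(agent.policy(obs))   # effects stay inside AISpace


if __name__ == "__main__":
    run()
```

The design choice the sketch is meant to highlight: unlike the classic AI-in-a-Box setup, there is no “gatekeeper channel” for the agent to talk its way out of, because, as far as its interfaces are concerned, there is no “out”.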

Some of you are probably already preheating your keyboards to respond with the often-repeated (and valid) arguments that “prove” this won’t work, but let me add a little twist first. As some of you may agree, we have no definitive proof that our own world is not a simulated reality. Thus, if you feel it inevitable that an AGI must be able to get out of AISpace, then releasing an AGI into this world would have to at least leave open the possibility that an AGI on Real Earth (a.k.a. Reality as we think we know it) could also “conclude” that this actual world is a sim, or decide to find out whether it is (and, yes, convert the entire universe into a computer to come up with an answer ;))

If we are unable to definitively settle whether Real Earth is or is not a simulation, why should an AI be able to do so?

Now, of course, the above requires a few conditions, some of which may indeed be hard to meet, such as the human operators never exposing the fact that AISpace is not in fact all of Reality… and (malevolent or indifferent) actors could always choose to release their AI into the Real World anyhow. What I’d like us to do in this post is not rehash the known arguments against AI-in-a-Box, which make sense and are for the most part uncontroversial. Instead, let’s deal with the specific case where the AGI is sincerely of the “opinion” that its reality (AISpace) is the entire world, and ask how it might then find out that it is not, absent human error.

AISpace would of course need to be created first, and the task of designing it may itself be very complex indeed. Still, we should be careful about which properties we demand it have before it is “real enough” to be “accepted” as a Reality. On our Real Earth there might also be glitches/errors that to us are simply features we take as given. For example, if our Real World is actually a sim, perhaps lightning is a programming error; even if it is, how would we ever actually know?
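To illustrate that last point, here is a toy sketch of what an in-sim “glitch probe” might look like, assuming (purely for illustration) that AISpace’s physics were computed with ordinary floating-point arithmetic. The probe does find an artifact, but, and this is the crux, a positive result tells the agent nothing about whether the artifact is a bug in a simulation or simply a native law of its world.

```python
# Toy "glitch probe": an agent inside a world whose physics runs on
# IEEE-754 floats could notice that addition is not associative.
# (The float-based physics is an assumption made purely for this sketch.)

def associativity_glitch() -> bool:
    a, b, c = 0.1, 0.2, 0.3
    # (0.1 + 0.2) + 0.3 == 0.6000000000000001, while
    # 0.1 + (0.2 + 0.3) == 0.6 exactly, under IEEE-754 doubles.
    return (a + b) + c != a + (b + c)

# The probe fires... and proves nothing: a non-associative "physics"
# could just as well be a fundamental feature of the agent's world.
print(associativity_glitch())  # True
```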

I also want to say that I am dying to hear the reasoning for why this kind of containment would not work, for that reasoning must itself have quite a few very interesting implications. Perhaps, for the sake of the argument, let us also define an additional “version” of AISpace: an exact replica of the Real World, but simulated (a.k.a. a human-instantiated Matrix; so at present not yet possible). How on Earth (no pun intended) would the AGI find out it is in fact in a “fake” reality…?