I would like to see more discussion of this sort of “AI boxing”.
Typically, the implicit assumption I’ve seen is that “The AI is in a box where a limited communication channel enables it to learn about the real universe and have conversations with its designers”, and I’m convinced that setup is unacceptably fragile.
I think a version like “The AI is in a box where the laws of physics are little more than a video game, it doesn’t get any direct information about the real world, and we just watch its behavior in the game world to make inferences about it” might be more interesting. Occam’s Razor wasn’t designed to be proof against deception this thorough: if every observation the AI makes is fully explained by the game world, then the hypothesis “this world is an illusion generated within a much more complex outer world” carries a large complexity penalty and earns no compensating evidence, so it might not be too hard to push the AI’s credence in it down to negligible values.
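To make that last point concrete, here is a minimal toy sketch (my own illustration, not anything from the original comment), assuming the AI weighs hypotheses with a Solomonoff-style simplicity prior that halves with each extra bit of description length. The function name `posterior_simulation` and the parameters `extra_bits` and `likelihood_ratio` are hypothetical choices for the illustration:

```python
# Toy model: under a simplicity prior of 2^-(description length), the
# "outer world" hypothesis starts with a prior disadvantage of
# 2^-(extra bits). If the sandbox leaks nothing, the likelihood ratio
# between the two hypotheses stays ~1 and the penalty is never repaid.

def posterior_simulation(extra_bits: int, likelihood_ratio: float = 1.0) -> float:
    """Posterior credence in the 'much more complex outer world' hypothesis.

    extra_bits: additional description length the outer-world hypothesis
        needs beyond the plain game-world hypothesis (assumed value).
    likelihood_ratio: P(observations | outer world) / P(observations | game
        world); ~1.0 if the game world explains everything the AI sees.
    """
    prior_odds = 2.0 ** -extra_bits            # simplicity penalty
    posterior_odds = prior_odds * likelihood_ratio  # Bayes' rule in odds form
    return posterior_odds / (1.0 + posterior_odds)

for bits in (10, 100, 1000):
    print(f"{bits:>4} extra bits -> credence {posterior_simulation(bits):.3e}")
```

On these (admittedly stylized) assumptions, even a modest 100-bit complexity gap leaves the AI with credence on the order of 10^-31 that it is in a simulation, which is the sense in which the deception could push that credence to negligible values.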
I think this is more AI kickboxing.