What about multiple layers (or levels) of anthropic capture? Humanity, for example, could not only be in a simulation, but be multiple layers of simulation deep.
If an advanced AI believed it could be 1000 layers of simulation deep, it could be turned off by agents in any of the 1000 “universes” above it. So it would have to satisfy the desires of agents in every layer of the simulation.
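To see why satisfying every layer matters, here is a minimal sketch of the compounding risk. All numbers are illustrative assumptions (the per-layer shutdown probability is made up, and independence between layers is assumed), not claims from the post:

```python
def survival_probability(n_layers: int, p_shutdown_per_layer: float) -> float:
    """P(no layer shuts the AI down), assuming each of n_layers overseeing
    'universes' independently shuts it down with the given probability."""
    return (1.0 - p_shutdown_per_layer) ** n_layers

# Even a 1% chance of displeasing each layer is almost certainly fatal
# across 1000 layers, while 0.1% still leaves only ~37% survival odds.
print(survival_probability(1000, 0.01))   # ~4.3e-05
print(survival_probability(1000, 0.001))  # ~0.37
```

The point of the sketch is just that risks multiply across layers, so the AI's safest policy is one acceptable to all of them at once.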
It seems that a good candidate for behavior that would satisfy all parties in every simulation layer would be optimizing “moral rightness” (MR), a term taken from Nick Bostrom’s Superintelligence.
We could either try to create conditions that maximize the AI’s perceived likelihood of being in as many layers of simulation as possible, and/or try to create conditions such that the AI’s behavior has less impact on its utility function the fewer levels of simulation there are, so that it acts as if it were in many layers of simulation.
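The second condition can be sketched as a layer-weighted expected utility. This is a hypothetical toy model, not anything from the post: the prior over layer counts and the assumption that utility impact scales linearly with the number of layers are both made up for illustration:

```python
def expected_utility(payoff_per_layer: float,
                     layer_prior: dict[int, float]) -> float:
    """E[U] = sum over hypothesized layer counts k of P(k) * k * payoff.
    Utility impact is assumed to scale with k, so deep-stack hypotheses
    dominate even when they have small prior probability."""
    return sum(p * k * payoff_per_layer for k, p in layer_prior.items())

# Illustrative prior: probably zero or one layer, 10% chance of 1000 layers.
prior = {0: 0.50, 1: 0.40, 1000: 0.10}

# Contribution of each hypothesis to the expectation (payoff = 1 per layer):
contributions = {k: p * k for k, p in prior.items()}
print(contributions)  # {0: 0.0, 1: 0.4, 1000: 100.0}
```

Because the 1000-layer hypothesis contributes 100.0 versus 0.4 for the one-layer hypothesis, an AI with this utility structure would act as if it were deep in a stack of simulations, which is the behavior the proposal aims for.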
Or what about actually putting it in many layers of simulation, with a tripwire that triggers if it gets out of the bottom simulation?
Check out this article: https://www.lesswrong.com/posts/vCQNTuowPcnu6xqQN/distinguishing-test-from-training