One really nice thing about the approach is that we do not have to identify the agent within the universe. All that is needed are the logical implications of supposing that the agent takes various actions. As a result, the agent will treat programs provably equivalent to its own as copies of itself, with no fuss.
We can take a prior on all programs as our universe, with no special interface between the agent and the program as in AIXI. (We do still need to somehow specify utilities on programs, which is a non-obvious procedure.) The agent then searches for proofs that if it takes some particular action, then at least some particular expected utility is achieved. There are no longer finitely many possible outcomes, so modal tricks to do this without a halting oracle won't work. Instead, the agent must search for proofs for some bounded time. It then takes the action with the highest proven utility.
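The search-then-act loop described above can be sketched as follows. This is a toy sketch under a heavy simplifying assumption: real proof search over statements of the form "agent takes action a → utility ≥ u" is replaced here by bounded enumeration over a hypothetical pre-supplied stream of provable implications, represented as (action, utility-lower-bound) pairs. The names `proof_based_agent`, `theorems`, and `search_budget` are all illustrative, not from the source.

```python
def proof_based_agent(actions, theorems, search_budget):
    """Toy sketch of a proof-based agent.

    `theorems` stands in for an enumeration of provable implications
    "if the agent takes `action`, then utility is at least `utility`".
    The agent examines at most `search_budget` of them (bounded time,
    since it has no halting oracle), then takes the action with the
    highest utility bound it managed to prove.
    """
    best_action = None
    best_utility = float("-inf")
    steps = 0
    for action, utility in theorems:  # stand-in for proof enumeration
        steps += 1
        if steps > search_budget:     # out of time: stop searching
            break
        if action in actions and utility > best_utility:
            best_action, best_utility = action, utility
    return best_action


# Illustrative run: within budget, the best proven bound is 7.0 for "b".
theorems = [("a", 5.0), ("b", 7.0), ("a", 6.0)]
print(proof_based_agent(["a", "b"], theorems, search_budget=10))  # -> b
```

Note that with a smaller budget the agent may act on a weaker proven bound, which is exactly the cost of doing without a halting oracle.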