Can we define Embedded Agent like we define AIXI?

An embedded agent should be able to reason accurately about its own origins. But AIXI-style definitions via argmax create agents that, if they reason correctly about selection processes, should conclude they’re vanishingly unlikely to exist. (By AIXI-style I mean: we have some space of agents X, a real-valued scoring function f on X, and we define the ideal agent as the argmax of f.)
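Restating that definition in symbols (nothing here beyond the sentence above), with $X$ the agent space and $f : X \to \mathbb{R}$ the scoring function, the ideal agent is any

$$x^{*} \in \arg\max_{x \in X} f(x).$$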
Consider an agent reasoning: “What kind of process could have produced me?” If the agent is literally the argmax of some simple scoring function, then the selection process must have enumerated all possible agents, evaluated f on each, and picked the maximum. This is physically unrealizable: it requires resources exceeding what’s available in the environment. So the agent concludes that it wasn’t generated by the argmax.
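To make the resource complaint concrete, here is a toy sketch of the selection process this reasoning imagines. Everything below (the bitstring encoding of agents, the scoring function, the length bound) is made up purely for illustration and has nothing to do with AIXI’s actual machinery:

```python
from itertools import product

def enumerate_agents(max_len, alphabet=(0, 1)):
    """Enumerate every candidate agent, encoded here as a bitstring of
    length 1..max_len. There are 2**(max_len + 1) - 2 of them."""
    for length in range(1, max_len + 1):
        for bits in product(alphabet, repeat=length):
            yield bits

def score(agent):
    """Hypothetical scoring function f; a stand-in for whatever the
    selector is supposed to be maximizing."""
    return sum(agent) / len(agent)

def literal_argmax_selection(max_len):
    """The selection process described above: evaluate f on every
    candidate agent and return the maximum."""
    return max(enumerate_agents(max_len), key=score)

# Fine at max_len=10 (~2,000 candidates); hopeless for any encoding long
# enough to describe an interesting agent, since the loop must touch
# every candidate exactly once.
best = literal_argmax_selection(10)
```

Even in this toy version the selector must evaluate all of the roughly 2^(max_len + 1) candidates, which is the sense in which a literal argmax over a realistic agent space outstrips the resources of the environment it is supposed to sit inside.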
The following seems like sound reasoning for an embedded agent: “I am a messy physical system in a messy universe, generated by a messy process. It is unlikely that my behavior is a clean mathematical function generated by argmaxing another clean mathematical function.”
Yet for an Embedded-AIXI defined via argmax, this reasoning would be fallacious: by construction, the agent just is the argmax of a clean mathematical function. That is admittedly a very handwavy obstacle, but it is an obstacle to expecting an AIXI-style definition of embedded agency.
Another gloss: we can’t define what it means for an embedded agent to be “ideal” because embedded agents are messy physical systems, and messy physical systems are never ideal. At most they’re “good enough”. So we should only hope to define when an embedded agent is good enough. Moreover, such agents must be generated by a physically realistic selection process.
This motivates Wentworth’s (mostly abandoned) project of Selection Theorems, i.e. studying physically realistic generators of good enough embedded agents.
> This is physically unrealizable: it requires resources exceeding what’s available in the environment. So the agent concludes that it wasn’t generated by the argmax.
This is the invalid step of reasoning: for AIXI agents, the environment is allowed, by construction, to have unlimited resources and to be arbitrarily complicated, so there are environments in which the literal search procedure really can be carried out.
This is why AIXI is usually considered in an unbounded setting: it is given unlimited memory and time, like a Universal Turing Machine, plus certain oracular powers, to make it possible to actually use AIXI for inference or planning.
You underestimate how complicated and resource-rich environments are allowed to be.
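For reference, this unbounded setting is the one in which classical (dualistic) AIXI is defined. Roughly following Hutter’s formulation (paraphrased, so treat the notation as a sketch), with $U$ a universal Turing machine, $q$ ranging over environment programs, $\ell(q)$ the length of $q$, and $m$ the horizon:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[\, r_k + \cdots + r_m \,\big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}.$$

Both the expectimax over the remaining horizon and the sum over all programs consistent with the history are incomputable, which is why the oracular powers mentioned above are needed before AIXI can be used for inference or planning at all.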
> Another gloss: we can’t define what it means for an embedded agent to be “ideal” because embedded agents are messy physical systems, and messy physical systems are never ideal. At most they’re “good enough”. So we should only hope to define when an embedded agent is good enough. Moreover, such agents must be generated by a physically realistic selection process.
Whether that holds is very dependent on what the rules of the environment are; embedded agents can be ideal in certain environments.
> we can’t define what it means for an embedded agent to be “ideal” because embedded agents are messy physical systems, and messy physical systems are never ideal
Thus some kind of theory vs. instantiation distinction is necessary. An embedded agent can think about pi using a biological brain based on chemical signaling. A physical calculator instantiates abstract arithmetic. A convergent move in decision theory around embedded agency seems to be that the agent is fundamentally an abstract computational object outside of the world, while what’s embedded is some sort of messy instance, an approximation/reasoning system that attempts to convey the abstract agent’s influence upon the environment.
The abstract agent must remain sufficiently legible for the world to contain things that are able to usefully reason about it and convey its decisions; this is one issue with literal Solomonoff induction. But for some ideal argmax decision-maker, it’s still possible for the messy in-world instances to reason about what would approximate it better.
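A minimal sketch of that split, with everything below (the action space, the scoring function, the sampling budgets) invented purely for illustration: the abstract agent is the exact argmax, treated as a mathematical object rather than something run inside the world, while an embedded instance is a resource-bounded procedure that only approximates it, and different instances can be compared by how well they track it.

```python
import random

# Hypothetical stand-ins: a finite space of options and a scoring
# function playing the role of f from the question.
ACTIONS = range(10**6)

def score(a):
    return -(a - 123_456) ** 2  # peak at 123456, purely illustrative

def abstract_agent():
    """The 'ideal' decision: the exact argmax over the whole space.
    Conceptually this lives outside the world; nothing embedded runs it."""
    return max(ACTIONS, key=score)

def embedded_instance(budget, rng=random.Random(0)):
    """A messy in-world approximation: look at only as many candidates
    as the budget allows and keep the best one seen."""
    return max(rng.sample(ACTIONS, budget), key=score)

# An embedded instance can still reason about which approximation would
# track the abstract argmax better, e.g. a larger budget vs. a smaller one.
closer = embedded_instance(budget=10_000)
rougher = embedded_instance(budget=10)
```

The gap between `abstract_agent` and any `embedded_instance` is the theory vs. instantiation distinction being pointed at here.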