A few points:
If the problem isn’t solved for AIXI, then I don’t see why it would be solved for bounded agents. Solomonoff induction doesn’t have Solomonoff induction in its hypothesis class, and naive bounded Solomonoff induction doesn’t have naive bounded Solomonoff induction in its hypothesis class.
The solution you propose looks something like: to predict some property X of the world, just learn to predict X directly instead of learning a whole model of the world containing the agent. (Let me know if this is an inaccurate summary.) This essentially “marginalizes over the agent” by ignoring the agent. While this isn’t a completely bad way to predict X, it seems like this won’t succeed in creating an integrated world model. If there’s some property of the agent’s source code that affects X, then the agent won’t be able to determine this before actually running that branch of the source code. I think a lot of the logical uncertainty / naturalized induction problem is to create an integrated world model that can reason about the agent itself when this is useful to predict X.
Reflective oracles solve a large part of the unbounded problem, but they don’t have an obvious bounded analogue. For example, when an agent using reflective oracles is predicting a smarter agent, it is already able to see that smarter agent’s distribution over actions. This doesn’t work in bounded land; you probably need some other ingredient. Perhaps the solution acts like reflective oracles in the limit, but it needs a story for why a weak agent can reason about a smart agent, which reflective oracles don’t provide.
I think that Vadim’s optimal predictors can play the same role as reflective oracles in the bounded case, or at least that’s the idea.
Both of them are just analysis tools though, not algorithms. I think the corresponding algorithms will be closer to what Daniel describes; that is, the agent does not treat itself specially (except for correlation between its decision and the agent’s output).
Thanks Jessica. This was helpful, and I think I see more what the problem is.
Re point 1: I see what you mean. The intuition behind my post is that it seems like it should be possible to make a bounded system that can eventually come to hold any computable hypothesis given enough evidence, including a hypothesis that contains a model of the system itself at arbitrary precision (which is different from Solomonoff induction, which can clearly never think about systems like itself). It’s clearly not possible for the system to hold and update infinitely many hypotheses the way Solomonoff induction does, and a system would need some kind of logical uncertainty or other magic to evaluate complex or self-referential hypotheses, but it seems like these hypotheses should be “in its class”. Does this make sense, or do you think there is a mistake there?
Re point 2: I’m not confident that’s an accurate summary; I’m precisely proposing that the agent learn a model of the world containing a model of the agent (approximate or precise). I agree that evaluating this kind of model will require logical uncertainty or similar magic, since it will be expensive and possibly self-referential.
Re point 3: I see what you mean, though for self-modeling the agent being predicted should only be as smart as the agent doing the prediction. It seems like approximation and logical uncertainty are the main ingredients needed here. Are there particular parts of the unbounded problem that are not solved by reflective oracles?
Re point 1: Suppose the agent considers all hypotheses of length up to $l$ bits that run in up to $t$ time. There are on the order of $2^l$ such hypotheses, so the agent takes roughly $2^l t$ time to run. For an individual hypothesis to reason about the agent, it must use $t$ computation time to reason about a computation of size $2^l t$. A theoretical understanding of how this works would solve a large part of the logical uncertainty / naturalized induction / Vingean reflection problem.
Maybe it’s possible for this to work without having a theoretical understanding of why it works, but the theoretical understanding is useful too (it seems like you agree with this). I think there are some indications that naive solutions won’t automatically work; see e.g. this post.
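To make the counting in the reply to point 1 concrete, here is a minimal sketch of the cost estimate. It assumes hypotheses are binary strings of length 1 up to l and that the agent simply runs each one for t steps; the function names are hypothetical and only illustrate the arithmetic, not any particular induction algorithm.

```python
def num_hypotheses(max_len_bits: int) -> int:
    """Count binary strings of length 1..max_len_bits (assumed hypothesis encoding)."""
    # Sum of 2^k for k = 1..l equals 2^(l+1) - 2, which is on the order of 2^l.
    return sum(2 ** k for k in range(1, max_len_bits + 1))


def agent_runtime(max_len_bits: int, steps_per_hypothesis: int) -> int:
    """Worst-case steps for an agent that runs every hypothesis for t steps."""
    return num_hypotheses(max_len_bits) * steps_per_hypothesis


def self_simulation_gap(max_len_bits: int, steps_per_hypothesis: int) -> float:
    """Ratio of the agent's total computation to a single hypothesis's budget."""
    return agent_runtime(max_len_bits, steps_per_hypothesis) / steps_per_hypothesis


if __name__ == "__main__":
    l, t = 20, 1000
    print("hypotheses:", num_hypotheses(l))
    print("agent runtime (steps):", agent_runtime(l, t))
    print("gap factor:", self_simulation_gap(l, t))
```

The gap factor does not depend on t: any single hypothesis has to say something useful about a computation that is larger than its own budget by a factor of about 2^l, so direct simulation is ruled out under these assumptions.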
Re point 2: It seems like this is learning a model from the state and action to state, and a model from state to state that ignores the agent. But it isn’t learning a model that e.g. reasons about the agent’s source code to predict the next state. An integrated model should be able to do reasoning like this.
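A toy sketch may help separate the three kinds of model mentioned here. Everything in it is an assumption made for illustration: states and actions are integers, the environment adds the action to the state, and the "agent source code" is just a Python function. The integrated model below cheats by executing the policy, which is exactly what a bounded hypothesis cannot afford in general; the point is only to show the difference in what each model takes as input.

```python
from typing import Callable

State = int
Action = int
Policy = Callable[[State], Action]


def env_step(state: State, action: Action) -> State:
    """Hypothetical toy environment: the action is added to the state."""
    return state + action


def parity_policy(state: State) -> Action:
    """Hypothetical agent: move up on even states, down on odd states."""
    return 1 if state % 2 == 0 else -1


def action_conditioned_model(state: State, action: Action) -> State:
    """Model from (state, action) to state: the action must be supplied from outside."""
    return env_step(state, action)


def agent_blind_model(state: State) -> float:
    """Model from state to state that ignores the agent.

    Assumes the two actions +1 and -1 are equally likely, so it predicts the mean.
    """
    return (env_step(state, 1) + env_step(state, -1)) / 2


def integrated_model(state: State, policy: Policy) -> State:
    """Model that takes the agent itself as input and uses it to predict the next state.

    Here it simply runs the policy; a bounded version would need a cheaper way
    to reason about the policy's output.
    """
    return env_step(state, policy(state))


if __name__ == "__main__":
    s = 4
    print(action_conditioned_model(s, -1))
    print(agent_blind_model(s))
    print(integrated_model(s, parity_policy))
```

Only the last model can pick up a dependence of the next state on a property of the agent (here, how it responds to parity) without being told the action in advance.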
Re point 3: I think you still have a Vingean reflection problem if a hypothesis that runs in $t$ time predicts a computation of size $2^l t$. Reflective Solomonoff induction solves a problem with an unrealistic computation model, and doesn’t translate to a solution with a finite (but large) amount of computing resources. The main part not solved is the general issue of predicting aspects of large computations using a small amount of computing power.
Thanks. I agree that these are problems. It seems to me that the root of these problems is logical uncertainty / Vingean reflection (which seem like two sides of the same coin); I find myself less confused when I think about self-modeling as being basically an application of “figuring out how to think about big / self-like hypotheses”. Is that how you think of it, or are there aspects of the problem that you think are missed by this framing?
Yes, this is also how I think about it. I don’t know anything specific that doesn’t fit into this framing.