# Error

LW server reports: not allowed.

This probably means the post has been deleted or moved back to the author's drafts.

• This is not the halting problem. The halting problem is about existence of an algorithm that predicts all algorithms. Here, we are mostly interested in predicting Agent and Predictor.

However, the standard argument about the halting problem can be applied to this case, giving interesting constructions. Agent can decide to diagonalize Predictor, to simulate it and then say the opposite, making it impossible for Predictor to predict it. Similarly, Agent can decide to diagonalize itself (which is called having a chicken rule), by committing to do the opposite of whatever it predicts itself to do, if that ever happens. This makes it impossible for Agent to predict what it’s going to do. (The details of correctly implementing the chicken rule are harder to formalize in the absence of oracles.)

It might seem that some questions similar to the halting problem are “unfair”: if Predictor sees that Agent is diagonalizing it, it understands Agent’s behavior fully, yet it can’t express its understanding, as it must state a particular action Agent is going to perform, which is impossible. What if instead Predictor states a dependence of Agent’s action on Predictor’s prediction? But then Agent might be able to diagonalize that as well. One solution is for Predictor to avoid saying or thinking its prediction in a way legible to Agent, and standard Newcomb’s problem tries to do exactly that by keeping boxes opaque and Predictor’s algorithm unknown.

When Predictor is not an oracle and its algorithm is known to Agent, we might simply require that it fills the big box with money only if it succeeds in predicting that Agent one-boxes. In that case, Agent has an incentive to be predictable. If a decision theory prescribes behavior that makes it impossible for a particular Predictor to predict one-boxing, then Agent following that decision theory goes home without the money. One example is the Agent Simulates Predictor (ASP) problem, where Predictor performs only a bounded computation, and Agent can fully simulate Predictor’s reasoning, which tempts it into two-boxing if Predictor is seen to predict one-boxing. Another example is Transparent Newcomb’s problem, where Agent is allowed to observe Predictor’s prediction directly (by seeing the contents of the boxes before making a decision).

• I do not see the connection. The gist of Newcomb’s Problem does not change if the player is given a time limit (you have to choose within an hour, or you do not get anything). Time-limited halting problem is of course trivially decidable.

• Are you claiming that the problem arises when the agent tries to predict its own behavior, or when the predictor tries to predict the agent’s behavior? Either way, I don’t think this makes Newcomb incoherent. Even if the agent can’t solve the halting problem in general, there are programs that can solve it in specific cases, including for themselves. And the predictor can be assumed to have greater computational resources than the agent, e.g. it can run for longer, or has a halting oracle if you really want the type of the agent to be ‘general Turing machine’, which means it can avoid self-reference paradoxes.

• We could require that both the agent and the predictor are machines that always halt.

However to think about Newcomb’s problem entails “casting yourself” as the agent and predictor both, with a theoretically unlimited amount of time to consider strategies for the agent to defeat the predictor, as well as for the predictor to defeat the agent.

That’s just shifting the goalposts. Now you are predicting the behavior of both the agent and predictor. If you could create an agent capable of defeating the predictor, you’d have to adjust the predictor. If you could create a predictor capable of defeating the agent, you’d have to resume proposing strategies for the agent.

You are now the machine trying to simulate your own future behavior with how you modify the agent and predictor. And there is no requirement that you take a finite amount of time, or use a finite amount of computing power, when considering the problem. For example, the problem does not say “come up with the best strategy for the agent and predictor you can within X minutes.”

Hence, we have two equally uninteresting cases:

• The agent/​thinker are limited in the time or computational resources available to them, while the predictor is unlimited.

• The agent/​thinker and predictor are both unlimited in time and computational resources, and both must be continuously and forever modified to try and defeat each others’ strategies. They are leapfrogging up the oracle hierarchy, forever. Newcomb’s problem invites you to try and compute where they’ll end up, and the answer is undecidable, a loop.

• However to think about Newcomb’s problem entails “casting yourself” as the agent and predictor both, with a theoretically unlimited amount of time to consider strategies for the agent to defeat the predictor, as well as for the predictor to defeat the agent.

I don’t think so. Newcomb’s problem is meant to be a simple situation where an agent must act in an environment more computationally powerful than itself. The perspective is very much meant to be that of the agent. If you think that figuring out how to act in an environment more powerful than yourself is uninteresting, you must be pretty bored, since that describes the situation all of us find ourselves in.

• The agent/​thinker are limited in the time or computational resources available to them, while the predictor is unlimited.

My understanding is that this is generally situation which is meant. Well, not necessarily unlimited, just with enough resources to predict the behavior of the agent.

I don’t see why you call this situation uninteresting.

• That something can be modeled using some Turing machine, doesn’t imply that it can be any Turing machine.

If I have some simple physical system, such that I can predict how it will behave, well, it can be modeled by a Turing machine, but me being able to predict it doesn’t imply that I’ve solved the halting problem.

A realistic conception of agents in an environment doesn’t involve all agents having unlimited compute at every time-step. An agent cannot prevent the universe from continuing simply by getting stuck in a loop and never producing its output for its next action.

• Newcomb WOULD in fact take forever, if both omega and agent were perfect calculators of each other and contained the full source to each other (not the halting problem, but related). Omega couldn’t make the offer until it had simulated the agent, including the simulated agent fully simulating Omega, recursively to infinity.

But most formulations don’t require this fidelity. Even if Omega is quite inaccurate, as long as it’s better at predicting the agent than the agent is of Omega, the problem still works.

• It’s possible for two programs to know each other’s code and to perfectly deduce each other’s result without taking forever, they just can’t do it by simulating each other. But they can do it by formal reasoning about each other, if it happens to be sufficiently easy and neither is preventing the other from predicting it. The issues here are not about fidelity of prediction.