The limited predictor problem

This post requires some knowledge of logic, computability theory, and K-complexity. Much of the credit goes to Wei Dai. The four sections of the post can be read almost independently.

The limited predictor problem (LPP) is a version of Newcomb’s Problem where the predictor has limited computing resources. To predict the agent’s action, the predictor simulates the agent for N steps. If the agent doesn’t finish in N steps, the predictor assumes that the agent will two-box. LPP is similar to the ASP problem, but with simulation instead of theorem proving.

1. Solving the problem when the agent has a halting oracle

Consider the agent defined in “A model of UDT with a halting oracle”, and a predictor that can run the agent’s code step-by-step, with oracle calls and all. Turns out that this agent solves LPP correctly if N is high enough. To understand why, note that the agent offloads all interesting work to oracles that return instantly, so the agent’s own runtime is provably bounded. If that bound is below N, the agent’s oracle will prove that the predictor predicts the agent correctly, so the agent will one-box.

2. Failing to solve the problem when N is algorithmically random

Consider a setting without oracles, with only Turing-computable programs. Maybe the agent should successively search for proofs somehow?

Unfortunately you can’t solve most LPPs this way, for a simple but surprising reason. Assume that the predictor’s time limit N is a large and algorithmically random number. Then the predictor’s source code is >log(N) bits long, because N must be defined in the source code. Then any proof about the world program must also have length >log(N), because the proof needs to at least quote the world program itself. Finding a proof by exhaustive search takes exponential time, so the agent will need >N steps. But the predictor simulates the agent for only N steps. Whoops!

3. Solving the problem when N is large but has a short definition

As usual, let U be the world program that returns a utility value, and A be the agent program that returns an action and has access to the world’s source code. Consider the following algorithm for A:

  1. From L=1 to infinity, search for proofs up to length L of the form “if A()=a and runtime(A)<g(L), then U()=u”, where g(L) is an upper bound on runtime(A) if A stops the search at length L. Upon finding at least one proof for each possible a, go to step 2.

  2. Search for proofs up to length f(L) of the form “if runtime(A)<g(L), then A()≠a”, where f(L) is some suitably fast-growing function like 10^L. If such a proof is found, return a.

  3. If we’re still here, return the best a found on step 1.

This algorithm is very similar to the one described in “A model of UDT without proof limits”, but with the added complication that A is aware of its own runtime via the function g(L). By an analogous argument, A will find the “intended” proof that the predictor predicts A correctly if runtime(A) is small enough, as long as the “intended” proof exists and isn’t too long relative to the predictor’s time limit N. More concretely, A will solve all instances of LPP in which N is larger than g(L), where L is the length of the “intended” proof. For example, if f(L)=10^L, then g(L) is doubly exponential, so A will successfully solve LPPs where the predictor’s source code defines N using triple exponentials or some more compact notation.

4. A broader view

TDT and UDT were originally designed for solving “decision-determined” problems. The agent figures out how the resulting utility logically depends on the agent’s action, then returns the action with the highest utility, thus making the premise true.

But a cleverly coded decision program can also control other facts about itself. For example, the program may figure out how the resulting utility depends on the program’s return value and running time, then choose the best return value and choose how long to keep running, thus making both premises true. This idea is a natural extension of quining (you carefully write a program that can correctly judge its own runtime so far) and can be generalized to memory consumption and other properties of programs.

With enough cleverness we could write a program that would sometimes decide to waste time, or run for an even number of clock cycles, etc. We did not need so much cleverness in this post because LPP lies in a smaller class that we may call “LPP-like problems”, where utility depends only on the agent’s return value and runtime, and the dependence on runtime is monotonous—it never hurts to return the same value earlier. That class also includes all the usual decision-determined problems like Newcomb’s Problem, and our A also fares well on those.

I was surprised to find so many new ideas by digging into such a trivial-looking problem as LPP. This makes me suspect that advanced problems like ASP may conceal even more riches, if only we have enough patience to approach them properly...