Thanks for twisting my mind in the right direction with the S’ stuff. I hereby submit the following ridiculous but rigorous theory of Newcomblike problems:
You submit a program that outputs a row number in a payoff matrix, and a “world program” simultaneously outputs a column number in the same matrix; together they determine your payoff. Your program receives the source code of the world program as an argument. The world program doesn’t receive your source code, but it contains some opaque function calls to an “oracle” that’s guaranteed to return your future output. For example, in Newcomb’s Problem the world decides to put $1M in the big box iff the oracle says you will one-box.
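To make the setup concrete, here is a minimal sketch in Python of the Newcomb instance. The encoding is my own illustration, not part of the original statement: the row/column numbering, the payoff values, and the `oracle` parameter are all invented names.

```python
# Hypothetical encoding of Newcomb's Problem in this framework.
# The player's rows: 0 = one-box, 1 = two-box.
ONE_BOX, TWO_BOX = 0, 1

# PAYOFF[row][column] = the player's winnings, in dollars.
# Column 0 = big box contains $1M, column 1 = big box is empty.
PAYOFF = [
    [1_000_000,     0],  # one-box: take only the big box
    [1_001_000, 1_000],  # two-box: take both boxes
]

def newcomb_world(oracle):
    """The world program: outputs a column. `oracle` is an opaque
    call guaranteed to return the player's future output."""
    if oracle() == ONE_BOX:   # predicted to one-box:
        return 0              # put $1M in the big box
    else:
        return 1              # leave the big box empty
```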
You have no way to simulate the world and cause paradoxes, so any run of this game will be consistent. Your only recourse is “conditional simulation”: for each of your possible choices, substitute it in place of the oracle call and simulate the world under this assumption, then pick the best option. When applied to Newcomb’s Problem, this general algorithm leads to one-boxing. Note there’s no infinite recursion involved on either side: your program doesn’t ever attempt to simulate the oracle because it’s opaque. And the final touch: with infinite recursion thus banished, the oracle can actually be implemented as an ordinary simulator that obtains your source code by peeking through some backdoor in the tournament setting.
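Continuing the sketch above, the “conditional simulation” algorithm might look like this. For brevity the world is passed as a callable rather than as source code, and `decide` and its signature are my invention for illustration:

```python
def decide(world, choices, payoff):
    """Conditional simulation: for each candidate output, run the
    world with the oracle call hard-wired to that output, then pick
    the choice that does best under its own assumption."""
    def payoff_if(choice):
        # Substitute `choice` for the opaque oracle call; no recursion
        # occurs because we never try to simulate the oracle itself.
        column = world(oracle=lambda: choice)
        return payoff[choice][column]
    return max(choices, key=payoff_if)

decide(newcomb_world, [ONE_BOX, TWO_BOX], PAYOFF)
# payoff_if(ONE_BOX) = PAYOFF[0][0] = 1,000,000
# payoff_if(TWO_BOX) = PAYOFF[1][1] = 1,000   -> returns ONE_BOX
```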
This formalization looks totally obvious in retrospect and captures a lot of my common-sense intuitions about Newcomb’s. I wonder why people didn’t mention it earlier.
I don’t think I get your point. Apparently the purpose of having an “oracle” is to ensure that:
“You have no way to simulate the world and cause paradoxes, so any run of this game will be consistent.”
What paradoxes do you mean? If we replace the “oracle” with the ordinary simulator right from the beginning, what paradoxes occur? According to the decision theory proposed in this post, S would see that it is called twice by the world program, once inside the simulator and once “for real”, and compute that the output “one-box” maximizes its utility, and that’s the end of it.
The “oracle” helps make the problem tractable: a) it prevents other, non-optimal programs from naively trying to simulate the world and going into infinite recursion; b) it makes the general solution algorithm implementable by unambiguously identifying the spots in the world program that are actually “oracle” invocations, which would be impossible otherwise (Rice’s theorem).
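To illustrate point (a): if the “oracle” were an ordinary simulator with no opaque marker, a naive player that tries to simulate the world outright would recurse forever. A toy sketch, again with invented names, building on the encoding above:

```python
def simulator_world(player):
    # Simulator interpretation: the "oracle" actually runs the player.
    return 0 if player() == ONE_BOX else 1

def naive_player():
    # Tries to learn the world's column by simulating the world,
    # but the world simulates the player, which simulates the world...
    column = simulator_world(naive_player)  # never returns
    return ONE_BOX if column == 0 else TWO_BOX

# naive_player()  # raises RecursionError: unbounded mutual recursion
```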
I don’t really get the point of “decision theories”, so I try to reduce all similar problems to “algorithmic game theory” (is that an existing area?).
Edited to add: I couldn’t come up with a rigorous game-theoretic formulation without an oracle.
Why worry about non-optimal programs? We’re talking about a theory of how AIs should make decisions, right?
I think it’s impossible for an AI to avoid the need to determine non-trivial properties of other programs, even though Rice’s Theorem says there is no algorithm for doing this that’s guaranteed to work in general. It just has to use methods that sometimes return wrong answers. And to deal with that, it needs a way to handle mathematical uncertainty.
ETA: If formalizing the problem is a non-trivial process, you might be solving most of the problem yourself in that step, rather than letting the AI’s decision algorithm solve it. I don’t think you’d want that. In this case, for example, if your AI were to encounter Omega in real life, how would it know to model the situation using a world program that invokes a special kind of oracle?
Re ETA: in the comments to Formalizing Newcomb’s, Eliezer effectively said he prefers the “special kind of oracle” interpretation to the simulator interpretation. I’m not sure which one an AI should assume when Omega gives it a verbal description of the problem.
Wha?
If you mean my saying (3), that doesn’t mean “Oracle”, it means we reason about the program without doing a full simulation of it.
Yes, I meant that. Maybe I misinterpreted you; maybe the game needs to be restated with a probabilistic oracle :-) Because I’m a mental cripple and can’t go far without a mathy model.