Thank you. This is exactly what I needed right now.
Eliezer, I hope you will take it as a form of high praise rather than insult that I stopped reading your article halfway through, typed this short comment, and am now going back to do some much-needed work.
(Hopefully I’ll get back to reading the rest later.)
It took me a week to think about it. Then I read all the comments, and thought about it some more. And now I think I have this “problem” well in hand. I also think that, incidentally, I arrived at Eliezer’s answer as well, though since he never spelled it out I can’t be sure.
To be clear: a lot of people have said that the decision depends on the problem parameters, so I'll explain just what it is I'm solving. See, Eliezer wants our decision theory to WIN. That only makes sense if we have all the relevant information: we can imagine plenty of situations where we make the wisest decision possible given the available information and it still turns out to be wrong; the universe is not fair, and we know this already. So I will assume we have all the relevant information needed to win. We will also assume that Omega does have the capability to accurately predict my actions, and that causality is not violated (rationality cannot be expected to win if causality is violated!).
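Under these assumptions, the payoff structure of the problem can be written down as a toy sketch. (The $1,000 figure for the transparent box is the standard Newcomb amount, not something stated above, and the function name is mine.)

```python
# Toy Newcomb payoff sketch, assuming a perfect predictor.
# Standard payoffs: the opaque box holds $1,000,000 iff Omega
# predicts one-boxing; the transparent box always holds $1,000.

def payoff(predicted_one_box: bool, actually_one_box: bool) -> int:
    opaque = 1_000_000 if predicted_one_box else 0
    transparent = 1_000
    # A one-boxer takes only the opaque box; a two-boxer takes both.
    return opaque if actually_one_box else opaque + transparent

# With a perfect predictor, prediction always matches the actual choice:
assert payoff(True, True) == 1_000_000   # one-boxer
assert payoff(False, False) == 1_000     # two-boxer
```

The perfect-prediction assumption is what collapses the four cells of the payoff table down to the two asserted ones; the other two cells (where prediction and action disagree) simply never occur.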
Assuming this, I can have a conversation with Omega before it leaves. Mind you, it’s not a real conversation, but having sufficient information about the problem means I can simulate its part of the conversation even if Omega itself refuses to participate and/or there isn’t enough time for such a conversation to take place. So it goes like this...
Me: “I do want to gain as much as possible in this problem. To that end, I want you to put as much money in the box as possible. How do I do that?”
Omega: “I will put 1M$ in the box if you take only it; and nothing if you take both.”
Me: “Ah, but we’re not violating causality here, are we? That would be cheating!”
Omega: “True, causality is not violated. To rephrase, my decision on how much money to put in the box will depend on my prediction of what you will do. Since I have this capacity, we can consider these synonymous.”
Me: “Suppose I’m not convinced that they are truly synonymous. All right then. I intend to take only the one box”.
Omega: “Remember that I have the capability to predict your actions. As such, I know whether you are sincere or not.”
Me: “You got me. Alright, I’ll convince myself really hard to take only the one box.”
Omega: “Though you are sincere now, in the future you will reconsider this decision. As such, I will still place nothing in the box.”
Me: “And you are predicting all this from my current state, right? After all, this is one of the parameters in the problem—that after you’ve placed money in the boxes, you are gone and can’t come back to change it”.
Omega: “That is correct; I am predicting a future state from information on your current state”.
Me: “Aha! That means I do have a choice here, even before you have left. If I change my state so that I am unable or unwilling to two-box once you’ve left, then your prediction of my future “decision” will be different. In effect, I will be hardwired to one-box. And since I still want to retain my rationality, I will make sure that this hardwiring is strictly temporary.”
fiddling with my own brain a bit
Omega: “I have now determined that you are unwilling to take both boxes. As such, I will put the 1,000,000$ in the box.”
Omega departs
I walk unthinkingly toward the boxes and take just the one
Voila. Victory is achieved.
My main conclusion here is that any decision theory that does not allow for changing strategies is a poor decision theory indeed. This IS essentially the Friendly AI problem: you can rationally one-box, but you need to have access to your own source code in order to do so. Not having that would make you so inflexible as to be the equivalent of an Iterated Prisoner’s Dilemma program that can only defect or only cooperate; that is, a very bad one.
The reason this is not obvious is that the way the problem is phrased is misleading. Omega supposedly leaves “before you make your choice”, but in fact there is not a single choice here (one-box or two-box). Rather, there are two decisions to be made, if you can modify your own thinking process:
1. Whether or not to have the ability and inclination to make decision #2 “rationally” once Omega has left, and
2. Whether to one-box or two-box.
...Where decision #1 can and should be made prior to Omega’s leaving, and obviously DOES influence what’s in the box. Decision #2 does not influence what’s in the box, but the state in which I approach that decision does. This is very confusing initially.
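To make the two-decision structure concrete, here is a small sketch in which the box contents depend only on the agent's state at prediction time (decision #1), never on the later action (decision #2). The class and method names are my own illustration, and the $1,000 transparent-box amount is the standard Newcomb figure:

```python
# Sketch: Omega predicts from the agent's state *before* leaving
# (decision #1); the later action (decision #2) follows from that state.

class Agent:
    def __init__(self, hardwired_to_one_box: bool):
        # Decision #1: whether to self-modify before Omega predicts.
        self.hardwired = hardwired_to_one_box

    def choose(self) -> str:
        # Decision #2: made after Omega has left. A hardwired agent
        # cannot two-box; an unmodified agent two-boxes, since at this
        # point taking both boxes strictly dominates.
        return "one-box" if self.hardwired else "two-box"

def omega_fills_box(agent: Agent) -> int:
    # Omega predicts the future action from the agent's current state.
    predicted = agent.choose()
    return 1_000_000 if predicted == "one-box" else 0

def play(agent: Agent) -> int:
    opaque = omega_fills_box(agent)  # boxes fixed; Omega departs
    action = agent.choose()          # decision #2, made afterwards
    return opaque if action == "one-box" else opaque + 1_000

assert play(Agent(hardwired_to_one_box=True)) == 1_000_000
assert play(Agent(hardwired_to_one_box=False)) == 1_000
```

Note that `play` never lets the action influence `omega_fills_box`; causality runs only from the agent's state to both the prediction and the action, which is exactly why decision #1 is where the money is won.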
Now, I don’t really know CDT too well, but it seems to me that presented as these two decisions, even it would be able to correctly one-box on Newcomb’s problem. Am I wrong?
Eliezer—if you are still reading these comments so long after the article was published—I don’t think it’s an inconsistency in the AI’s decision making if the AI’s decision making is influenced by its internal state. In fact I expect that to be the case. What am I missing here?