I have sympathy with both one-boxers and two-boxers in Newcomb’s problem. Many people on Less Wrong, by contrast, seem to be staunch and confident one-boxers. So I’m turning to you guys to ask for help figuring out whether I should be a staunch one-boxer too. Below is an imaginary dialogue setting out my understanding of the arguments normally advanced on LW for one-boxing. I was hoping to get help filling in the details and extending this argument so that I (and anyone else who is uncertain about the issue) can develop an understanding of the strongest arguments for one-boxing.
One-boxer: You should one-box because one-boxing wins (that is, a person that one-boxes ends up better off than a person that two-boxes). Not only does it seem clear that rationality should be about winning generally (that a rational agent should not be systematically outperformed by irrational agents) but Newcomb’s problem is normally discussed within the context of instrumental rationality, which everyone agrees is about winning.
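The “one-boxing wins” claim can be illustrated with a rough calculation. The sketch below assumes the standard payoffs ($1,000,000 in the opaque box if one-boxing was predicted, $1,000 always in the transparent box) and a predictor that is right with some fixed probability. Neither figure is stated in the dialogue, and the names `expected_winnings`, `OPAQUE_PRIZE` and `TRANSPARENT_PRIZE` are made up for illustration.

```python
# Assumed payoffs (the standard figures; the post itself does not state them).
OPAQUE_PRIZE = 1_000_000    # in the opaque box only if one-boxing was predicted
TRANSPARENT_PRIZE = 1_000   # always in the transparent box


def expected_winnings(agent_type: str, accuracy: float) -> float:
    """Expected payoff for an agent of a fixed type, facing a predictor
    that predicts that type correctly with probability `accuracy`."""
    if agent_type == "one-box":
        # The opaque box is full whenever the prediction is correct.
        return accuracy * OPAQUE_PRIZE
    if agent_type == "two-box":
        # The opaque box is full only when the predictor errs.
        return (1 - accuracy) * OPAQUE_PRIZE + TRANSPARENT_PRIZE
    raise ValueError(f"unknown agent type: {agent_type!r}")


# Compare the two agent types at a few hypothetical accuracy levels.
for accuracy in (0.5, 0.6, 0.9, 0.99):
    one = expected_winnings("one-box", accuracy)
    two = expected_winnings("two-box", accuracy)
    print(f"accuracy={accuracy}: one-box type {one:,.0f}, two-box type {two:,.0f}")
```

With these assumed payoffs, the one-boxing type comes out ahead for any accuracy above roughly 0.5005, which is the sense in which one-boxers “win”.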
Me: I get that, and that’s one of the main reasons I’m sympathetic to the one-boxing view, but the two-boxer has a response to these concerns. The two-boxer agrees that rationality is about winning and agrees that winning means ending up with the most utility. The two-boxer should also agree that the rational decision theory to follow is one that will one-box on all future Newcomb’s problems (those where the prediction has not yet occurred) and can also agree that the best timeless agent type is a one-boxing type. However, the two-boxer also claims that two-boxing is the rational decision.
O: Sure, but why think they’re right? After all, two-boxers don’t win.
M: Okay, those with a two-boxing agent type don’t win, but the two-boxer isn’t talking about agent types. They’re talking about decisions. So they are interested in which aspects of the agent’s winning can be attributed to their decision, and they say that we can attribute the agent’s winning to their decision if that winning is caused by their decision. This strikes me as quite a reasonable way to apportion the credit for various parts of the winning. (Of course, it could be said that the two-boxer is right but is playing a pointless game and should instead be interested in winning simpliciter rather than winning decisions. If this is the claim then the argument is dissolved and there is no disagreement. But I take it this is not the claim.)
O: But this is a strange, convoluted definition of winning. The agent ends up worse off than one-boxing agents, so it must be a convoluted definition of winning that says that two-boxing is the winning decision.
M: Hmm, maybe… But I’m worried that relevant distinctions aren’t being made here (you’ve started talking about winning agents rather than winning decisions). The two-boxer relies on the same definition of winning as you and so agrees that the one-boxing agent is the winning agent. They just disagree about how to attribute winning to the agent’s decisions (rather than to other features of the agent). And their way of doing this strikes me as quite a natural one. We credit the decision with the winning that it causes. Is this the source of my unwillingness to jump fully on board with your program? Do we simply disagree about the plausibility of this way of attributing winning to decisions?
Meta-comment (a): I don’t know what to say here. Is this what’s going on? Do people just intuitively feel that this is a crazy way to attribute winning to decisions? If so, can anyone suggest why I should adopt the one-boxer perspective on this?
O: But then the two-boxer has to rely on the claim that Newcomb’s problem is “unfair” to explain why the two-boxing agent doesn’t win. It seems absurd to say that a scenario like Newcomb’s problem is unfair.
M: Well, the two-boxer means something very particular by “unfair”. They simply mean that in this case the winning agent doesn’t correspond to the winning decision. Further, they can explain why this is the case without saying anything that strikes me as crazy. They simply say that Newcomb’s problem is a case where the agent’s winnings can’t entirely be attributed to the agent’s decision (ignoring a constant value). But if something else (the agent’s type at time of prediction) also influences the agent’s winning in this case, why should it be a surprise that the winning agent and the winning decision come apart? I’m not saying the two-boxer is right here but they don’t seem to me to be obviously wrong either...
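The two-boxer’s way of splitting up the winnings can be sketched as a small payoff table. This is only an illustration under assumed figures (the standard $1,000,000 and $1,000 prizes, which the dialogue does not state); the function `winnings` and its parameter names are hypothetical.

```python
# Assumed payoffs (the standard figures; the post itself does not state them).
OPAQUE_PRIZE = 1_000_000
TRANSPARENT_PRIZE = 1_000


def winnings(predicted_one_box: bool, takes_both: bool) -> int:
    """Total payoff, split into the part fixed at prediction time and
    the part caused by the decision made afterwards."""
    # Part settled by the agent's type when the prediction was made.
    from_type = OPAQUE_PRIZE if predicted_one_box else 0
    # Part caused by the decision itself, once the boxes are already filled.
    from_decision = TRANSPARENT_PRIZE if takes_both else 0
    return from_type + from_decision


# Print the full 2x2 table of prediction against decision.
for predicted_one_box in (True, False):
    for takes_both in (False, True):
        total = winnings(predicted_one_box, takes_both)
        print(f"predicted one-box={predicted_one_box}, takes both={takes_both}: {total:,}")
```

Holding the prediction fixed, taking both boxes adds the transparent prize in either row, which is the sense in which two-boxing is the “winning decision”. The much larger difference between the rows comes from the prediction, which is the sense in which the one-boxing type is the “winning agent”.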
Meta-comment (b): Interested to know what response should be given here.
O: Okay, let’s try something else. The two-boxer focuses only on causal consequences, but in doing so they simply ignore all the logical, non-causal consequences of their decision algorithm outputting a certain decision. This is an ad hoc, unmotivated restriction.
M: Ad hoc? I’m not sure I see why. Think about the problem with evidential decision theory. The proponent of EDT could say a similar thing (that the proponent of two-boxing ignores all the evidential implications of their decision). The two-boxer will respond that these implications just are not relevant to decision making. When we make decisions we are trying to bring about the best results, not get evidence for these results. Equally, they might say, we are trying to bring about the best results, not derive the best results in our logical calculations. Now, I don’t know what to make of the point/counter-point here, but it doesn’t seem to me that the one-boxing view is obviously correct, and I’m worried that we’re again going to end up just trading intuitions (and I can see the force of both intuitions here).
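One way to make the contrast between the evidential and causal views concrete is to write the two expected-utility rules side by side. This is a rough sketch on the usual textbook formulations, not something taken from the dialogue itself. The symbols are assumptions introduced for illustration: $F$ is the event that the opaque box is full, $M$ and $K$ are the opaque and transparent prizes, and $p$ is the predictor’s accuracy.

```latex
% Evidential rule: the probability of the state is conditioned on the act.
\begin{align*}
EU_{\text{evid}}(\text{one-box}) &= P(F \mid \text{one-box})\, M = pM \\
EU_{\text{evid}}(\text{two-box}) &= P(F \mid \text{two-box})\, M + K = (1-p)M + K
\end{align*}
% Causal rule: the act cannot change whether the box is full,
% so the same P(F) is used for both acts.
\begin{align*}
EU_{\text{caus}}(\text{one-box}) &= P(F)\, M \\
EU_{\text{caus}}(\text{two-box}) &= P(F)\, M + K
\end{align*}
```

On the evidential rule, one-boxing comes out ahead whenever $p > \tfrac{1}{2} + \tfrac{K}{2M}$. On the causal rule, two-boxing comes out ahead by exactly $K$ whatever $P(F)$ is. The dispute is over which of these quantities the decision should be credited with, not over the arithmetic.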
Meta-comment (c): Again, I would love to know whether I’ve understood this argument and whether something can be said to convince me that the one-boxing view is the clear-cut winner here.
End comments: That’s my understanding of the primary argument advanced for one-boxing on LW. Are there other core arguments? How can these arguments be improved and extended?