Open-Box Newcomb’s Problem and the limitations of the Erasure framing

One of the most confusing aspects of the Erasure Approach to Newcomb’s problem is that in Open-Box Newcomb’s it requires you to forget that you’ve seen that the box is full. This is a really strange thing to do, so it deserves further explanation. And as we’ll see, this might not be the best way to think about what is happening.

Let’s begin by recapping the problem. In a room there are two boxes: one containing $1000 and the other a transparent box that contains either nothing or $1 million. Before you entered the room, a perfect predictor predicted what you would do if you saw $1 million in the transparent box. If it predicted that you would one-box, then it put $1 million in the transparent box; otherwise it left the box empty. If you can see $1 million in the transparent box, which option should you pick?

The argument I provided before was as follows: if you see a full box, then you must be going to one-box if the predictor really is perfect. So there would only be one decision consistent with the problem description, and to produce a non-trivial decision theory problem we’d have to erase some information. And the most logical thing to erase would be what you see in the box.

I still mostly agree with this argument, but I feel the reasoning is a bit sparse, so this post will try to break it down in more detail. I’ll just note in advance that when you start breaking it down, you end up performing a kind of psychological or social analysis. However, I think this is inevitable when dealing with ambiguous problems; if you could provide a mathematical proof of what an ambiguous problem meant, then it wouldn’t be ambiguous.

As I noted in Deconfusing Logical Counterfactuals, there is only one choice consistent with the problem (one-boxing), so in order to answer this question we’ll have to construct some counterfactuals. A good way to view this is that instead of asking what choice the agent should make, we will ask whether the agent made the best choice.

In order to construct these counterfactuals, we’ll have to consider situations with at least one of the problem’s assumptions missing, since we want to consider counterfactuals involving both one-boxing and two-boxing. Unfortunately, it is impossible for a two-boxer to a) see $1 million in the box if b) the money is only in the box when the predictor predicts the agent will one-box in this situation and c) the predictor is perfect. So we’ll have to relax at least one of these assumptions.
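To make the inconsistency concrete, here is a minimal sketch (my own illustration, not something from the original argument) that enumerates combinations of “what the agent would do on seeing a full box” and “what the agent actually sees”, keeping only those compatible with assumptions b) and c):

```python
from itertools import product

# A world is described by the agent's policy on seeing a full box
# and by what the agent actually sees.
POLICIES = ["one-box", "two-box"]    # what you would do if you saw $1 million
OBSERVATIONS = ["full", "empty"]     # what you actually see

def consistent(policy, observation):
    # Assumption b): the box is full iff the predictor predicts one-boxing
    # on a full box. Assumption c): the prediction is perfect, and the box
    # is transparent, so the observation must match the actual contents.
    box_full = (policy == "one-box")
    return (observation == "full") == box_full

for policy, observation in product(POLICIES, OBSERVATIONS):
    if consistent(policy, observation):
        print(f"consistent: sees {observation} box, would {policy} on a full box")

# Prints only two worlds: "sees full, would one-box" and "sees empty,
# would two-box"; a two-boxer who sees the full box never appears.
```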

Speaking very roughly, it is generally understood that the way to resolve this is to relax the assumption that the agent must really be in that situation and to allow the possibility that the agent may only be simulated as being in such a situation by the predictor. I want to reiterate that what counts as the same problem is really just a matter of social convention.

Another note: I said “speaking very roughly” because some people claim that the agent could actually be in the simulation. In my mind, these people are confused; in order to predict an agent, we may only need to simulate the decision theory parts of its mind, not all the other parts that make you you. A second reason why this isn’t precise is that it isn’t defined how to simulate an impossible situation; one of my previous posts points out that we can get around this by simulating what an agent would do when given input representing an impossible situation. There may also be some people who have doubts about whether a perfect predictor is possible even in theory. I’d suggest that these people read one of my past posts on why the sense in which you “could have chosen otherwise” doesn’t break the prediction and how there’s a sense in which you are pre-committed to every action you take.

In any case, once we have relaxed this assumption, the consistent counterfactuals become either a) the agent actually seeing the full box and one-boxing, or b) the agent seeing the empty box. In case b), it is actually consistent for the agent to either one-box or two-box, since the predictor only predicts what would happen if the agent saw a full box. It is then trivial to pick the best counterfactual: one-boxing on a full box yields $1 million, while the options in case b) yield at most $1000.
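Continuing the sketch (again my own illustration, and assuming, as is standard, that “one-boxing” means taking only the transparent box), the relaxed counterfactuals can be compared directly by payoff, using the amounts from the problem statement:

```python
MILLION, THOUSAND = 1_000_000, 1_000

# The counterfactuals that remain consistent after relaxing the
# assumption that the agent really sees the full box.
counterfactuals = {
    "a) sees full box, one-boxes": MILLION,     # takes only the transparent box
    "b) sees empty box, one-boxes": 0,          # takes only the (empty) transparent box
    "b) sees empty box, two-boxes": THOUSAND,   # takes both boxes
}

best = max(counterfactuals, key=counterfactuals.get)
print(best)  # -> a) sees full box, one-boxes
```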

This problem actually demonstrates a limitation of the erasure framing. After all, we didn’t just justify the counterfactuals by removing the assumption that you saw a full box; we instead modified it to seeing a full box OR being simulated as seeing a full box. In one sense, this is essentially the same thing: since we already knew you were being simulated by the predictor, we essentially just removed the assumption. On the other hand, it is easier to justify that it is the same problem by turning it into an OR than by just removing the assumption.

In other words, thinking about counterfactuals in terms of erasure can be incredibly misleading, and in this case it actively made it harder to justify our counterfactuals. The key question seems to be not “What should I erase?” but “What assumption should I erase or relax?”. I’m beginning to think that I’ll need to choose a better term, but I’m reluctant to rename this approach until I have a better understanding of what exactly is going on.

At the risk of repeating myself, the fact that it is natural to relax this assumption is a matter of social convention and not mathematics. My next post on this topic will try to help clarify how certain aspects of a problem may make it seem natural to relax or remove certain assumptions.