The Counterfactual Prisoner’s Dilemma

Updateless decision theory asks us to make decisions by imagining what we would have pre-committed to ahead of time. There's only one problem: we didn't actually commit to it ahead of time. So why do we care about what would have happened if we had?

This isn't a problem for standard Newcomb's problems. Even if we haven't formally pre-committed to an action, such as by setting up consequences for failure, we are effectively pre-committed to whatever action we end up taking. After all, the universe is deterministic, so from the start of time there was only one possible action we could have taken. So we can one-box and know we'll get the million if the predictor is perfect.

However, there are other problems, such as Counterfactual Mugging, where the benefit accrues to a counterfactual self instead of to us directly. This is discussed in Abram Demski's post on all-upside and mixed-upside updatelessness. It's the latter type that is troublesome.

I posted a question about this a few days ago:

If you are being asked for $100, you know that the coin came up heads and you won't receive the $10,000. Sure, this means that if the coin had come up tails then you wouldn't have gained the $10,000, but you know the coin didn't come up tails, so you don't lose anything. It's important to emphasise: this doesn't deny that if the coin had come up tails, refusing would have made you miss out on $10,000. Instead, it claims that this point is irrelevant, so merely repeating the point again isn't a valid counter-argument.

A solution

In that post I cover many of the arguments for paying the counterfactual mugger and argue that they don't solve it. However, after posting, both Cousin_it and I independently discovered a thought experiment that is very persuasive (in favour of paying). The setup is as follows:

Omega, a perfect predictor, flips a coin and tells you how it came up. If it comes up heads, Omega asks you for $100, then pays you $10,000 if it predicts that you would have paid if it had come up tails. If it comes up tails, Omega asks you for $100, then pays you $10,000 if it predicts that you would have paid if it had come up heads. In this case it was heads.

An updateless agent will get $9,900 regardless of which way the coin comes up, while an updateful agent will get nothing. Note that even though you are playing against yourself, the counterfactual version of you sees a different observation, so its action isn't logically tied to yours. As in a normal prisoner's dilemma, it would be possible for heads-you to co-operate and tails-you to defect. So, unlike playing the prisoner's dilemma against a clone, where you have a selfish reason to co-operate, if counterfactual-you decides to be selfish there is no way to persuade it to co-operate, unless, that is, you consider policies as a whole rather than individual actions. The lesson I take from this is that policies are what we should be evaluating, not individual actions.
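The payoff structure can be checked with a quick calculation. This is a minimal sketch with function and variable names of my own invention; the game itself is exactly as described above:

```python
def payoff(observed, pays_on):
    """Payoff when the coin lands `observed` ("heads" or "tails") and the
    agent's policy is to pay the $100 in the branches listed in `pays_on`."""
    other = "tails" if observed == "heads" else "heads"
    cost = -100 if observed in pays_on else 0
    # Omega pays out iff it predicts you would have paid in the other branch.
    reward = 10_000 if other in pays_on else 0
    return cost + reward

# The four possible policies, from "never pay" to "always pay":
for pays_on in [set(), {"heads"}, {"tails"}, {"heads", "tails"}]:
    print(sorted(pays_on),
          {side: payoff(side, pays_on) for side in ("heads", "tails")})
```

Only the updateless "always pay" policy guarantees $9,900 whichever way the coin lands; "never pay" guarantees $0, and the asymmetric policies gamble between losing $100 and gaining $10,000.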

Are there any al­ter­na­tives?

I find it hard to imagine an intermediate position that saves the idea of individual actions being the locus of evaluation. For example, I'd be dubious about claims that the locus of evaluation should still be individual decisions, except in situations like the prisoner's dilemma. I won't pretend to have a solid argument, but that would just seem to be an unprincipled fudge: calling the gaping hole an exception so we don't have to deal with it, or gluing together two kinds of objects which really aren't alike at all.

What does this mean?

This greatly undermines the updateful view that you only care about your current counterfactual. Further, the shift to evaluating policies suggests an updateless perspective. For example, it doesn't seem to make sense to decide what you should have done if the coin had come up heads after you see it come up tails. If you've made your decision based on the coin, it's too late for your decision to affect the prediction. And once you've committed to the updateless perspective, the symmetry of the coin flip makes paying the mugger the natural choice, assuming you have a reasonable risk preference.
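To make the symmetry argument concrete, here is the pre-flip expected value of each policy in the counterfactual mugging itself, using the post's numbers (the variable names are mine):

```python
# Evaluated before the flip, as an updateless agent would: committing to pay
# the $100 when asked wins the predicted $10,000 in the other branch.
ev_always_pay = 0.5 * (-100) + 0.5 * 10_000
ev_never_pay = 0.5 * 0 + 0.5 * 0

print(ev_always_pay)  # 4950.0
print(ev_never_pay)   # 0.0
```

Before the flip, committing to pay is worth $4,950 in expectation versus $0 for refusing, which is why the symmetry makes paying natural for any agent without an extreme risk preference.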

One other advantage is that in the original scenario