Counterfactual Mugging

Tag. Last edit: 13 Aug 2023 3:08 UTC by asher

Counterfactual mugging is a thought experiment for testing and differentiating decision theories, stated as follows:

Omega, a perfect predictor, flips a coin. If it comes up tails Omega asks you for $100. If it comes up heads, Omega pays you $10,000 if it predicts that you would have paid if it had come up tails.
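
As a rough worked calculation, assuming a fair coin and that dollars are valued linearly (the statement above does not spell out either assumption), the value of each policy before the coin is flipped is:

```latex
\begin{align*}
\mathbb{E}[\text{pay on tails}]    &= \tfrac{1}{2}\,(+\$10{,}000) + \tfrac{1}{2}\,(-\$100) = \$4{,}950 \\
\mathbb{E}[\text{refuse on tails}] &= \tfrac{1}{2}\,(\$0) + \tfrac{1}{2}\,(\$0) = \$0
\end{align*}
```

Once the coin has already come up tails, however, paying simply costs $100 relative to refusing, which is where the tension in the problem comes from.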

Depending on how the problem is phrased, intuition calls for different answers. For example, Eliezer Yudkowsky has argued that framing the problem so that Omega is a regular feature of the environment that routinely poses such questions makes most people answer ‘Yes’. However, Vladimir Nesov points out that Rationalists Should Win could be interpreted as suggesting that we should not pay. After all, even though refusing to pay in the tails case would cause you to do worse in the counterfactual where the coin came up heads, you already know that counterfactual didn’t happen, so it’s not obvious that you should pay. This issue has been discussed in the question post Counterfactual Mugging: Why should you pay?

Formal decision theories also diverge. Under Causal Decision Theory, you can only affect those probabilities that you are causally linked to, so the answer should be ‘No’. Evidential Decision Theory takes any kind of connection into account, but once you have observed tails, paying is no longer evidence of a better outcome, so its answer should also be ‘No’. The answer of Timeless Decision Theory seems undefined; however, Yudkowsky has argued that if the problem is presented repeatedly, one should answer ‘Yes’ in order to increase the probability of gaining $10,000 in the next round. This seems to be what Causal Decision Theory prescribes in the repeated case as well. Updateless Decision Theory prescribes giving the $100, on the basis that your decision can influence both the ‘heads branch’ and the ‘tails branch’ of the universe.
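
The disagreement can be illustrated with a minimal sketch. It is not a faithful implementation of any of the named decision theories: it only contrasts choosing a policy before the coin is seen with choosing an action after conditioning on tails. The fair coin and the function names are assumptions made for illustration.

```python
# Assumption: a fair coin; the problem statement does not give the odds explicitly.
COIN_PROB = {"heads": 0.5, "tails": 0.5}


def payoff(coin: str, pays_on_tails: bool) -> int:
    """Dollar payoff in the original problem for an agent with the given disposition."""
    if coin == "tails":
        # Omega asks for $100; the agent either hands it over or refuses.
        return -100 if pays_on_tails else 0
    # Heads: Omega pays out only if it predicts the agent would have paid on tails.
    return 10_000 if pays_on_tails else 0


def ex_ante_value(pays_on_tails: bool) -> float:
    """Value of a policy evaluated before the coin is observed."""
    return sum(p * payoff(coin, pays_on_tails) for coin, p in COIN_PROB.items())


def updateless_policy() -> bool:
    """Pick the disposition with the best value across both branches."""
    return max([True, False], key=ex_ante_value)


def updateful_action_after_tails() -> bool:
    """Pick the action with the best payoff given that tails has been observed."""
    return max([True, False], key=lambda pays: payoff("tails", pays))


if __name__ == "__main__":
    print("ex-ante value of paying:  ", ex_ante_value(True))
    print("ex-ante value of refusing:", ex_ante_value(False))
    print("updateless agent pays?    ", updateless_policy())
    print("updateful agent pays?     ", updateful_action_after_tails())
```

The two procedures see the same payoff table; they differ only in whether the heads branch still counts once tails has been observed.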

Regardless of the particular decision theory, it is generally agreed that if you can pre-commit in advance, you should do so. The dispute is purely over what you should do if you didn’t pre-commit.

Eliezer listed this in his 2009 post Timeless Decision Theory: Problems I Can’t Solve, although that was written before Updateless Decision Theory was introduced.


The Counterfactual Prisoner’s Dilemma is a symmetric variant of the original, independently suggested by Chris Leong and Cousin_it:

Omega, a perfect predictor, flips a coin. If it comes up heads, Omega asks you for $100, then pays you $10,000 if it predicts you would have paid if it had come up tails and you were told it was tails. If it comes up tails, Omega asks you for $100, then pays you $10,000 if it predicts you would have paid if it had come up heads and you were told it was heads.

In this scenario, an updateless agent receives $9,900 and an updateful agent receives nothing regardless of the coin flip, while in the original scenario the updateless agent only comes out ahead if the coin shows heads. This is claimed as a demonstration of the principle that when evaluating decisions we should consider the counterfactual and not just our particular branch of possibility space.
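
The payoffs in the symmetric variant can be checked with a short sketch. The function name and the representation of an agent as a pair of booleans (whether it pays when told heads, whether it pays when told tails) are illustrative assumptions, not part of the original statement.

```python
def cpd_payoff(coin: str, pays_on_heads: bool, pays_on_tails: bool) -> int:
    """Net dollars received in the Counterfactual Prisoner's Dilemma.

    Omega asks for $100 in whichever branch actually occurs, and pays $10,000
    if it predicts the agent would have paid in the other branch.
    """
    pays_here = pays_on_heads if coin == "heads" else pays_on_tails
    pays_in_other_branch = pays_on_tails if coin == "heads" else pays_on_heads

    total = 0
    if pays_here:
        total -= 100
    if pays_in_other_branch:
        total += 10_000
    return total


if __name__ == "__main__":
    for coin in ("heads", "tails"):
        print(coin, "always pays:", cpd_payoff(coin, True, True))
        print(coin, "never pays: ", cpd_payoff(coin, False, False))
```

An agent that pays in both branches nets $9,900 whichever way the coin lands, while an agent that pays in neither nets $0, so here the paying disposition is better in the actual branch and not only on average.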

In Logical Counterfactual Mugging, instead of flipping a coin, Omega tells you the 10,000th digit of pi, which we assume you don’t know off the top of your head. If it is odd, we treat it like heads in the original problem, and if it is even, we treat it like tails. Logical inductors have been proposed as a solution to this problem. The post Applying the Counterfactual Prisoner’s Dilemma to Logical Uncertainty applies the Counterfactual Prisoner’s Dilemma argument to Logical Counterfactual Mugging.
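
To make the "logical coin" concrete, here is a minimal sketch that computes a decimal digit of pi with integer arithmetic (Machin's formula) and maps it to heads or tails as described above. It assumes that "the 10,000th digit" means the 10,000th digit after the decimal point, and the function names are hypothetical. The fixed number of guard digits makes the result very likely but not rigorously guaranteed to be correct for large positions.

```python
def arctan_inverse(x: int, scale: int) -> int:
    """Approximate scale * arctan(1/x) using the Taylor series in integer arithmetic."""
    term = scale // x
    total = term
    x_squared = x * x
    n = 1
    sign = -1
    while term:
        term //= x_squared
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total


def pi_decimal_digit(k: int, guard_digits: int = 10) -> int:
    """Return the k-th digit of pi after the decimal point (k = 1 gives 1)."""
    scale = 10 ** (k + guard_digits)
    # Machin's formula: pi = 4 * (4 * arctan(1/5) - arctan(1/239)).
    pi_scaled = 4 * (4 * arctan_inverse(5, scale) - arctan_inverse(239, scale))
    return (pi_scaled // 10 ** guard_digits) % 10


def logical_coin(k: int = 10_000) -> str:
    """Map the k-th decimal digit of pi to a coin face: odd is heads, even is tails."""
    return "heads" if pi_decimal_digit(k) % 2 == 1 else "tails"


if __name__ == "__main__":
    print([pi_decimal_digit(k) for k in range(1, 7)])
    print(logical_coin())
```

The point of the variant is that the answer is fixed by mathematics and can be computed in principle, yet the agent is assumed to be uncertain about it at decision time.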

The Counterfactual Mugging Poker Game is a somewhat complicated variant by Scott Garrabrant. Player A receives a single card that is either high or low, which they can then reveal if they so desire. Player B then shares their true probability estimate p that player A has a high card. Player B is essentially perfect at predicting player A’s behaviour, but doesn’t get to see player A after the card has been drawn. Player A then loses p² dollars. Suppose you are player A and hold a low card: if you show it, you lose nothing. However, since B can predict your behaviour, if the card had been high then player B would be able to infer that you had a high card even if you hadn’t revealed it. This would lose you a whole dollar, and on average you’d be better off if you never showed your card. Garrabrant states that he prefers this scenario because Counterfactual Mugging feels like it is trying to trick you, while in this scenario you are the one creating the Counterfactual Mugging-like situation in order to withhold information.
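
The expected losses under different reveal policies can be sketched as follows. The sketch rests on several assumptions that should be checked against Garrabrant's post: the card is high or low with equal probability, player A loses p² dollars where p is player B's stated probability of a high card, and player B knows A's policy exactly and reports the resulting Bayesian posterior. The function names are illustrative.

```python
def posterior_high_given_silence(reveal_if_low: bool, reveal_if_high: bool) -> float:
    """Player B's probability of a high card when A does not reveal.

    Only called for policies under which staying silent is actually possible.
    """
    silent_and_low = 0.0 if reveal_if_low else 0.5
    silent_and_high = 0.0 if reveal_if_high else 0.5
    return silent_and_high / (silent_and_low + silent_and_high)


def expected_loss(reveal_if_low: bool, reveal_if_high: bool) -> float:
    """Player A's expected loss in dollars for a given reveal policy."""
    total = 0.0
    for card in ("low", "high"):
        reveals = reveal_if_low if card == "low" else reveal_if_high
        if reveals:
            # A revealed card leaves B with no uncertainty.
            p = 1.0 if card == "high" else 0.0
        else:
            # B knows A's policy, so silence is itself informative.
            p = posterior_high_given_silence(reveal_if_low, reveal_if_high)
        total += 0.5 * p ** 2  # assumed loss of p squared dollars, cards equally likely
    return total


if __name__ == "__main__":
    print("never reveal:        ", expected_loss(False, False))
    print("reveal only when low:", expected_loss(True, False))
    print("always reveal:       ", expected_loss(True, True))
```

Under these assumptions, revealing a low card is tempting in the moment because it brings the loss in that branch to zero, but a policy of revealing low cards costs a full dollar in the high-card branch.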

Comparison to Other Problems

The post Two Types of Updatelessness makes a distinction between all-upside updatelessness and mixed-upside updatelessness. In an all-upside case, utilising an updateless decision theory provides a better result in the current situation, while in a mixed-upside case the benefits go to other possible selves. Unlike Newcomb’s Problem or Parfit’s Hitchhiker, Counterfactual Mugging is a mixed-upside case.

Blog posts

Counterfactual Mugging

Vladimir_Nesov, 19 Mar 2009 6:08 UTC
80 points
296 comments, 2 min read

Counterfactual Mugging and Logical Uncertainty

Vladimir_Nesov, 5 Sep 2009 22:31 UTC
11 points
21 comments, 3 min read

Counterfactual Mugging Poker Game

Scott Garrabrant, 13 Jun 2018 23:34 UTC
111 points
3 comments, 1 min read

[Question] Counterfactual Mugging: Why should you pay?

Chris_Leong, 17 Dec 2019 22:16 UTC
6 points
59 comments, 3 min read

Extremely Counterfactual Mugging or: the gist of Transparent Newcomb

Bongo, 9 Feb 2011 15:20 UTC
10 points
79 comments, 1 min read

The sin of updating when you can change whether you exist

Benya, 28 Feb 2014 1:25 UTC
17 points
17 comments, 10 min read

Counterfactual mugging: alien abduction edition

Emile, 28 Sep 2010 21:25 UTC
4 points
18 comments, 1 min read

The Counterfactual Prisoner’s Dilemma

Chris_Leong, 21 Dec 2019 1:44 UTC
21 points
17 comments, 3 min read

Timeless Decision Theory: Problems I Can’t Solve

Eliezer Yudkowsky, 20 Jul 2009 0:02 UTC
56 points
156 comments, 6 min read

Hazing as Counterfactual Mugging?

SilasBarta, 11 Oct 2010 14:17 UTC
5 points
8 comments, 1 min read

AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy

DanielFilan, 10 Mar 2021 4:30 UTC
35 points
12 comments, 36 min read

Precommitting to paying Omega.

topynate, 20 Mar 2009 4:33 UTC
5 points
33 comments, 7 min read

UDT might not pay a Counterfactual Mugger

winwonce, 21 Nov 2020 23:27 UTC
5 points
18 comments, 2 min read

Machine learning could be fundamentally unexplainable

George3d6, 16 Dec 2020 13:32 UTC
26 points
15 comments, 15 min read

Updatelessness doesn’t solve most problems

Martín Soto, 8 Feb 2024 17:30 UTC
124 points
43 comments, 12 min read

Naive TDT, Bayes nets, and counterfactual mugging

Stuart_Armstrong, 23 Oct 2012 15:58 UTC
26 points
39 comments, 3 min read

Applying the Counterfactual Prisoner’s Dilemma to Logical Uncertainty

Chris_Leong, 16 Sep 2020 10:34 UTC
9 points
5 comments, 2 min read

Logical Line-Of-Sight Makes Games Sequential or Loopy

StrivingForLegibility, 19 Jan 2024 4:05 UTC
38 points
0 comments, 7 min read

Disentangling four motivations for acting in accordance with UDT

Julian Stastny, 5 Nov 2023 21:26 UTC
33 points
3 comments, 7 min read