Bayesian updating is the wrong thing to do in counterfactual mugging, and the reason TDT goes wrong on that problem is that it updates.
Does “extremal counterfactual mugging” — where the $100 is replaced by self-destruction of the agent and the $10000 is replaced by creation of 100 copies of the agent (outside the agent’s light cone) — require the same answer as counterfactual mugging?
And this is an uncontroversial view here, which one can safely assert as a premise, as Wei_Dai did here?
I don’t believe it’s particularly controversial. There is a question of whether humans retain preferences over counterfactual worlds, but decision-theoretically, not-updating in the usual sense is strictly superior, because you get to make decisions you otherwise wouldn’t be able to.
Okay, then let me try to trace back to the point where we disagree.
As I understand it:
1) Timeless Decision Theory tells you what to do, given your beliefs. Any belief updating happens before you apply TDT, so I don’t understand how TDT could err by way of a Bayesian update (even one that is wrong to perform) -- the error is independent of TDT, since TDT (see below) shields your actions from losing decisions made on the basis of such a “bad” update.
2) TDT can be stated as, “When calculating EU for the outcomes of an action, you must instead weight each outcome’s utility by the probability that this action would have led to it if your decision procedure were such that it outputs this action.”
So, on counterfactual mugging, even after the agent observes that it’s not in the winning world, it reasons that its decision theory attains the highest (TDT-calculated) EU by committing to a policy of paying out on a loss, since it can then count the utility of the winning branch toward its EU.
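The policy-level EU calculation in 2) can be made concrete with a toy model of counterfactual mugging. The sketch below is my own illustration, not anyone’s canonical formalization: it evaluates the two candidate policies (pay vs. refuse on a loss) from the standpoint of the decision procedure, before updating on the coin flip, using the standard $10,000 / $100 payoffs.

```python
# Toy model of counterfactual mugging. Omega flips a fair coin:
# on heads, Omega pays $10,000 iff the agent's policy would pay
# $100 on tails; on tails, the agent is asked for $100.

COIN = [("heads", 0.5), ("tails", 0.5)]

def outcome(policy_pays: bool, flip: str) -> float:
    """Utility of a world, given the agent's policy and the coin flip."""
    if flip == "heads":
        # Omega pays out only if the agent *would* pay on tails.
        return 10_000 if policy_pays else 0
    # On tails, a paying policy hands over $100.
    return -100 if policy_pays else 0

def policy_eu(policy_pays: bool) -> float:
    """TDT-style EU: weight each outcome by the probability it occurs
    given that the decision procedure outputs this policy."""
    return sum(p * outcome(policy_pays, flip) for flip, p in COIN)

print(policy_eu(True))   # 0.5 * 10000 + 0.5 * (-100) = 4950.0
print(policy_eu(False))  # 0.0
```

An updating agent that conditions on having seen tails only compares −100 against 0 and refuses; evaluating at the policy level instead lets the winning branch’s utility enter the sum, which is the point being made above.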
Or does EY agree that TDT fails on CM? (I couldn’t tell from the CM article.)
3) Edit: And even if this is a case of Bayesian updating failing, does that generalize to dropping it altogether?