The first intuition is that the counterfactual involves changing the physical result of your decision making, not the process of your decision making itself. The second intuition is that the counterfactual involves a replacement of the process of your decision making such that you’d take a different action than you normally would.
I imagine it as the following:
Physical intervention: I imagine that I’m possessed by a demon that leads me to physically carry out a different option than the one I would have chosen voluntarily.
Logical intervention: I imagine that I was a different person with a different life history, one that would have led me to choose a different path than the me in physical reality would choose.
This doesn’t quite communicate how loopy logical intervention can feel, however: I usually imagine logical counterfactuals as worlds where 2+2=3, or something equally and clearly illogical, is effectively part of the bedrock of the universe.
I don’t think that different problems lead one to develop different intuitions. I think that physical intervention is the more intuitive way people relate to counterfactuals, including for mundane decision theory problems like Newcomb’s problem, and that logical intervention is something people need clarifying thought experiments to get used to. I found Counterlogical Mugging (which is Counterfactual Mugging, but with a statement you have logical uncertainty over) to be a very useful intuition pump for starting to think in terms of logical intervention as a counterfactual.
For a more rigorous explanation, here’s the relevant section from MacDermott et al., “Characterising Decision Theories with Mechanised Causal Graphs”:
But in the Twin Prisoner’s Dilemma, one might interpret the policy node in two different ways, and the interpretation will affect the causal structure. We could interpret intervening on your policy D̃ as changing the physical result of the compilation of your source code, such that an intervention will only affect your decision D, and not that of your twin T. Under this physical notion of causality, we get fig. 3a, where there is a common cause S explaining the correlation between the agent’s policy and its twin’s.
But on the other hand, if we think of intervening on your policy as changing the way your source code compiles in all cases, then intervening on it will affect your opponent’s policy, which is compiled from the same code. In this case, we get the structure shown in fig. 3b, where an intervention on my policy would affect my twin’s policy. We can view this as an intervention on an abstract “logical” variable rather than an ordinary physical variable. We therefore call the resulting model a logical-causal model.
Pearl’s notion of causality is the physical one, but Pearl-style graphs have also been used in the decision theory literature to represent logical causality. One purpose of this paper is to show that mechanism variables are a useful addition to any graphical model being used in decision theory.
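To make the difference concrete, here’s a toy sketch of how the two readings of “intervening on your policy” come apart in the Twin Prisoner’s Dilemma. This is my own illustration in Python; the variable names and payoff numbers are made up and aren’t taken from the paper.

```python
# A toy Twin Prisoner's Dilemma (my own illustration, not code from the paper).

# Standard PD payoffs for the first player.
PAYOFFS = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}

def compile_policy(source):
    """Both my policy and my twin's policy are compiled from the same source code S."""
    return "cooperate" if source == "cooperative" else "defect"

source = "cooperative"  # the common cause S shared by me and my twin

# Physical intervention (the fig. 3a reading): I override only my decision D.
# My twin's policy is still compiled from the unchanged source code.
my_decision = "defect"                    # do(D = defect)
twin_decision = compile_policy(source)    # unaffected: "cooperate"
print(PAYOFFS[(my_decision, twin_decision)])  # 5: I defect against a cooperating twin

# Logical intervention (the fig. 3b reading): I change how the source code
# compiles in all cases, so my twin's policy changes along with mine.
my_decision = twin_decision = "defect"    # do(policy = defect), everywhere the code runs
print(PAYOFFS[(my_decision, twin_decision)])  # 1: my twin defects too
```

Under the physical reading my twin keeps cooperating no matter what I set my decision to, whereas under the logical reading changing my policy changes my twin’s as well, which is exactly why the two graphs disagree about what an intervention achieves.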
The first intuition is that the counterfactual involves changing the physical result of your decision making, not the process of your decision making itself. The second intuition is that the counterfactual involves a replacement of the process of your decision making such that you’d take a different action than you normally would.
Hm, this makes me realize I’m not fully sure what’s meant by “counterfactual” here.
I normally think of it as, like: I’m looking at a world history, e.g. with variables A and B and times t=0,1,2 and some relationships between them. And I say “at t=1, A took value a. What if at t=1, it had taken value a′ instead? What would that change at t=2?” It’s clear how to fit decisions I’ve made in the past into that framework.
Or I can run it forwards, looking from t=0 to t=1,2, imagining what’s going to happen by default, and imagining what happens if I make a change at some point. It’s less clear how to fit my own decisions into this framework, because what does “by default” mean then? But I can just pick some decision to plug in at every point where I get to make one, and say that all of these picks give me a counterfactual. (And perhaps by extension, if there are no decision points, I should also consider the imagined “what’s going to happen by default” world to be a counterfactual.)
But if the discussion of counterfactuals starts by talking about decisions I’ve made, or am going to make, then it’s not clear to me whether it can be extended to talk about general interventions on world histories.
I think that the first intuition corresponds to “interventions on causal models using the do operator”. That’s something I don’t think I understand deeply, but I do think I get the basics of, like, “what is this field of study trying to do, what questions is it asking, what sorts of objects does it work with and how do we manipulate them”. (E.g. if this is what we’re doing, then we say “we’re allowed to just set A=a′ at t=1, we don’t need to go back to t=0 and figure out how that state of affairs could have come about”.)
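To check that I’m using the term the way I think I am, here’s the kind of toy picture I have in mind; the structural equations below are made up purely for illustration.

```python
# A made-up world history over t = 0, 1, 2 with two variables A and B,
# just to illustrate the "set A = a' at t=1 and don't look back" move.

def run_history(a_override=None):
    a0 = 1                        # t=0: some initial condition
    a1 = 2 * a0                   # t=1: A is normally determined by t=0
    if a_override is not None:
        a1 = a_override           # do(A = a'): overwrite A without revisiting t=0
    b2 = a1 + 10                  # t=2: B depends on A at t=1
    return {"A at t=1": a1, "B at t=2": b2}

print(run_history())               # the factual history: A=2 at t=1, B=12 at t=2
print(run_history(a_override=3))   # under do(A=3): B=13 at t=2, and t=0 is untouched
```

The point I take from the do operator is that I’m allowed to overwrite A at t=1 directly, rather than asking which t=0 state would have produced A=a′.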
Does the second intuition correspond to something that we can talk about without talking about my decisions? (And if so, is it a different thing than the first intuition? Or is it, like, they both naturally extend to a world with no decision points for me, but the way they extend to that is the same in those worlds, and so they only differ in worlds that do have decision points for me?)
Thank you! That’s definitely clearer than anything I’ve read about this on LW to date!
Follow-up question that immediately occurs to me:
Why are these two ways of evaluating counterfactuals and not, like… “answers to two different questions”? What I mean is: if we want to know what would happen in a “counterfactual” case, it seems like the first thing to do is to say “now, by that do you mean to ask what would happen under physical intervention, or what would happen under logical intervention?” Right? Those would (could?) have different answers, and really do seem like different questions, so after realizing that they’re different questions, have we thereby resolved all confusions about “counterfactuals”? Or do some puzzles remain?
What I mean is: if we want to know what would happen in a “counterfactual” case, it seems like the first thing to do is to say “now, by that do you mean to ask what would happen under physical intervention, or what would happen under logical intervention?” Right?
Yes.
Those would (could?) have different answers, and really do seem like different questions, so after realizing that they’re different questions, have we thereby resolved all confusions about “counterfactuals”?
I think that intervening on causality and intervening on logic are the only two ways one could intervene to create an outcome different from the one that actually occurs.
Or do some puzzles remain?
I don’t work in the decision theory field, so I want someone else to answer this question.