(Sorry again for being slow to reply to this one.)
“Note that in non-Newcomb-like situations, P(s|do(a)) and P(s|a) yield the same result, see ch. 3.2.2 of Pearl’s Causality.”
This is trivially not true.
Is this because I define “Newcomb-ness” via disagreement about the best action between EDT and CDT in the second paragraph? Of course, the distance d(P(s|do(a)), P(s|a)) could be so small that EDT and CDT agree on which action to take. The two distributions could even differ in such a way that the CDT expected value and the EDT expected value are the same.
But it seems that instead of comparing the argmaxes or the EVs, one could also use the term Newcomb-ness on the basis of the probabilities themselves. Or is there some deeper reason why the sentence is false?
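The two cases mentioned in this reply can be illustrated with a small sketch. All numbers are hypothetical, the helper names are made up for this example, and total variation is only one assumed choice for the distance d, which the comment leaves unspecified.

```python
def expected_value(probs, utils):
    """Expected utility of one action: sum over states of P(state) * utility."""
    return sum(p * u for p, u in zip(probs, utils))

def best_action(dist_by_action, utils_by_action):
    """Action with the highest expected utility under the given distributions."""
    return max(
        dist_by_action,
        key=lambda a: expected_value(dist_by_action[a], utils_by_action[a]),
    )

def total_variation(p, q):
    """One possible choice (an assumption) for the distance d."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))

# Hypothetical numbers, two states per action.
utils = {"a0": [0, 10], "a1": [0, 10]}
p_cond = {"a0": [0.7, 0.3], "a1": [0.4, 0.6]}  # P(s | a), used by EDT
p_do = {"a0": [0.6, 0.4], "a1": [0.5, 0.5]}    # P(s | do(a)), used by CDT

# Case 1: the distributions differ, yet both theories pick the same action.
same_choice = best_action(p_cond, utils) == best_action(p_do, utils)

# Case 2: three states; the distributions differ but the expected values coincide.
utils3 = [0, 1, 2]
p_cond3 = [0.25, 0.5, 0.25]
p_do3 = [0.5, 0.0, 0.5]
same_ev = expected_value(p_cond3, utils3) == expected_value(p_do3, utils3)
```

With these numbers both `same_choice` and `same_ev` are true even though the distance between the conditional and interventional distributions is nonzero, so a definition based on the argmax, one based on the EVs, and one based on the probabilities themselves can come apart.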
I guess:
(a) p(s | do(a)) is in general not equal to p(s | a). The entire point of causal inference is characterizing that difference.
(b) I looked at section 3.2.2 and did not see anything there supporting the claim.
(c) We have known since the 90s that p(s | do(a)) and p(s | a) disagree on classical decision theory problems, the standard smoking lesion being one. More generally, they disagree on any problem where you shouldn’t “manage the news.”
So I got super confused and stopped reading.
As cousin_it said somewhere at some point (and as I say in my YouTube talk), the confusing part of Newcomb is representing the situation correctly, and that is something you can solve by playing with graphs, essentially.
So, the class of situations in which p(s | do(a)) = p(s | a) that I was alluding to is the one in which A has only outgoing arrows (or all the values of A’s predecessors are known). (I guess this could be generalized to: p(s | do(a)) = p(s | a) if A d-separates its predecessors from S?) (Presumably this stuff follows from Rule 2 of Theorem 3.4.1 in Causality.)
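A sketch of why this class works, assuming a Markovian model (no unobserved confounders), positive probabilities, and S disjoint from A and its parents. The symbol PA for the parents of A is introduced here for illustration. Adjusting for the parents of A gives:

```latex
\begin{align*}
P(s \mid do(a)) &= \sum_{pa} P(s \mid a, pa)\, P(pa), \\
P(s \mid a)     &= \sum_{pa} P(s \mid a, pa)\, P(pa \mid a).
\end{align*}
% The two expressions differ only in the weights: P(pa) versus P(pa | a).
% (i)  A has no parents: there is nothing to sum over, and both reduce to P(s | a).
% (ii) S is independent of PA given A: P(s | a, pa) = P(s | a), so
\begin{align*}
P(s \mid do(a)) = P(s \mid a) \sum_{pa} P(pa) = P(s \mid a).
\end{align*}
```

If the parents' values are known, one conditions on pa on both sides and compares P(s | do(a), pa) with P(s | a, pa), which agree under the same assumptions. This supports the d-separation guess only for the Markovian case; models with latent variables are not covered by this sketch.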
All problems in which you intervene in an isolated system from the outside are of this kind and so EDT and CDT make the same recommendations for intervening in a system from the outside. (That’s similar to the point that Pearl makes in section 3.2.2 of Causality: You can model the do-interventions by adding action nodes without predecessors and conditioning on these action nodes.)
The Smoking lesion is an example of a Newcomb-like problem where A has an inbound arrow that leads p(s | do(a)) and p(s | a) to differ. (That said, I think the smoking lesion does not actually work as a Newcomb-like problem, see e.g. chapter 4 of Arif Ahmed’s Evidence, Decision and Causality.)
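To make the role of the inbound arrow concrete, here is a minimal sketch of a smoking-lesion-style graph L -> A, L -> S with no arrow from A to S. The probabilities and function names are hypothetical, chosen only for illustration. The second model removes the dependence of A on L, which corresponds to A having no incoming arrow.

```python
from itertools import product

def joint(p_lesion, p_smoke_given_lesion, p_cancer_given_lesion):
    """Joint P(l, a, s) for the graph L -> A, L -> S (no A -> S arrow)."""
    table = {}
    for l, a, s in product((0, 1), repeat=3):
        p_l = p_lesion if l else 1 - p_lesion
        p_a = p_smoke_given_lesion[l] if a else 1 - p_smoke_given_lesion[l]
        p_s = p_cancer_given_lesion[l] if s else 1 - p_cancer_given_lesion[l]
        table[(l, a, s)] = p_l * p_a * p_s
    return table

def p_cancer_given_smoke(table, a):
    """Observational P(S=1 | A=a), the quantity EDT conditions on."""
    num = sum(p for (_, a_, s), p in table.items() if a_ == a and s == 1)
    den = sum(p for (_, a_, _), p in table.items() if a_ == a)
    return num / den

def p_cancer_do_smoke(table, a):
    """Interventional P(S=1 | do(A=a)) via adjustment over the parent L."""
    total = 0.0
    for l in (0, 1):
        p_l = sum(p for (l_, _, _), p in table.items() if l_ == l)
        num = sum(p for (l_, a_, s), p in table.items()
                  if l_ == l and a_ == a and s == 1)
        den = sum(p for (l_, a_, _), p in table.items()
                  if l_ == l and a_ == a)
        total += p_l * num / den
    return total

# Hypothetical parameters: the lesion raises both smoking and cancer rates.
confounded = joint(0.5, {0: 0.2, 1: 0.8}, {0: 0.1, 1: 0.9})
# Same model, but smoking no longer depends on the lesion (no arrow into A).
isolated = joint(0.5, {0: 0.5, 1: 0.5}, {0: 0.1, 1: 0.9})
```

In the `confounded` model the observational probability of cancer given smoking comes out at about 0.74 while the interventional one is 0.5. In the `isolated` model both are 0.5, matching the claim that the two quantities coincide when A has only outgoing arrows.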
Similarly, you could model Newcomb’s problem by introducing a logical node as a predecessor of your decision and the result of the prediction. (If you locate “yourself” in the logical node and the logical node does not have any predecessors, then CDT and EDT agree again.)
Of course, in the real world, all problems are in theory Newcomb-like because there are always some incoming arrows into your decision. But in practice, most problems are nearly non-Newcomb-like because, although there may be an unblocked path from my action to the value of my utility function, that path is usually too long/complicated to be useful. E.g., if I raise my hand now, that would mean that the state of the world 1 year ago was such that I raise my hand now. And the world state 1 year ago causes how much utility I have. But unless I’m in Arif Ahmed’s “Betting on the Past”, I don’t know which class of world states 1 year ago (the ones that lead to me raising my hand or the ones that cause me not to raise my hand) causes me to have more utility. So, EDT couldn’t try to exploit that way of changing the past.
I agree that in situations where A only has outgoing arrows, p(s | do(a)) = p(s | a), but this class of situations is not the “Newcomb-like” situations. In particular, the classical smoking lesion has a confounder with an incoming arrow into A.
Maybe we just disagree on what “Newcomb-like” means? To me what makes a situation “Newcomb-like” is your decision algorithm influencing the world through something other than your decision (as happens in the Newcomb problem via Omega’s prediction). In smoking lesion, this does not happen, your decision algorithm only influences the world via your action, so it’s not “Newcomb-like” to me.
I agree that in situations where A only has outgoing arrows, p(s | do(a)) = p(s | a), but this class of situations is not the “Newcomb-like” situations.
What I meant to say is that the situations where A only has outgoing arrows are all not Newcomb-like.
Maybe we just disagree on what “Newcomb-like” means? To me what makes a situation “Newcomb-like” is your decision algorithm influencing the world through something other than your decision (as happens in the Newcomb problem via Omega’s prediction). In smoking lesion, this does not happen, your decision algorithm only influences the world via your action, so it’s not “Newcomb-like” to me.
Ah, okay. Yes, in that case, it seems to be only a terminological dispute. As I say in the post, I would define Newcomb-like-ness via a disagreement between EDT and CDT, which can mean either that they disagree about what the right decision is, or, more naturally, that their probabilities diverge. (In the latter case, the statement you commented on is true by definition, and in the former case it is false for the reason I mentioned in my first reply.) So, I would view the Smoking lesion as a Newcomb-like problem (ignoring the tickle defense).