The A=a notation always bugged me too.
I like the above notation because it betrays morphism composition.
If we consider random variables as measure(able) spaces and conditional probabilities P(B | A) as stochastic maps A → P(B), then every element ‘a’ of (a countably generated) A induces a point measure → A giving probability 1 to the event {a}. This is the map named by do(a). But since we’re composing maps, not elements, we can use an element a unambiguously to mean its point measure. Then a series of measures separated by ‘,’ gives the product measure.
In the above example, let a : A (implicitly, → A), a’ : B (implicitly, → B), M : B ~> C, and Y : (A,C) ~> D. Then Y(a,M(a’)) is a stochastic map ~> D given by composition.
EDIT: How do I ascii art?
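In lieu of a diagram, here is a rough sketch of that composition for finite spaces, where a measure is just a dict from outcomes to probabilities and a stochastic map is a function returning such a dict. The helper names (`point`, `prod`, `push`, `nested`) and the numbers inside `M` and `Y` are made up purely for illustration; only the shapes M : B ~> C and Y : (A,C) ~> D come from the text above.

```python
from itertools import product


def point(x):
    """Point measure on x: the map named by do(x)."""
    return {x: 1.0}


def prod(mu, nu):
    """Product measure of two finite measures (the ',' in the notation)."""
    return {(x, y): p * q for (x, p), (y, q) in product(mu.items(), nu.items())}


def push(kernel, mu):
    """Compose a measure with a stochastic map: sum_x mu(x) * kernel(x)(y)."""
    out = {}
    for x, p in mu.items():
        for y, q in kernel(x).items():
            out[y] = out.get(y, 0.0) + p * q
    return out


# Hypothetical kernel M : B ~> C (numbers are illustrative only).
def M(b):
    return {1: 0.8, 0: 0.2} if b == 1 else {1: 0.3, 0: 0.7}


# Hypothetical kernel Y : (A, C) ~> D (numbers are illustrative only).
def Y(ac):
    a, c = ac
    p = 0.1 + 0.5 * a + 0.3 * c
    return {1: p, 0: 1.0 - p}


def nested(a, a_prime):
    """Y(a, M(a')) as a measure on D, built purely by composition."""
    return push(Y, prod(point(a), push(M, point(a_prime))))


result = nested(1, 0)
```

With these made-up numbers, `result[1]` works out to 0.3·0.9 + 0.7·0.6 = 0.69 (up to floating-point rounding). The point is only that a and a’ enter as point measures, ‘,’ becomes `prod`, and the nesting is ordinary composition.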
All of this is a fancy way of saying that “potential outcome” notation conveys exactly the right information to make probabilities behave nicely.
Yes, one of the reasons I am not very fond of subscript or superscript notation (which, to be fair, is very commonly used) is that it quickly becomes awkward to nest things, and I personally often end up nesting things many levels deep. Parentheses are the only thing I have found that works acceptably well.
If you think of interventions as morphisms, then it is indeed very natural to think in terms of arbitrary function composition, which leads one to the usual functional notation. The reason people in the causal inference community perhaps do not find this as natural as a mathematician would is that it is difficult to interpret things like Y(a,M(a’)) as idealized experiments we could actually perform. There is a strong custom in the community (a healthy one in my opinion, because it grounds the discussion) to only consider quantities which can be so interpreted. See also this:
http://imai.princeton.edu/research/Design.html