I’m trying to understand the second clause for conditional histories better.
The first clause is very intuitive, and in some sense, exactly what I would expect. I understand it as basically saying that h(X|E) drops elements from h(X) which can be inferred from E. Makes a kind of sense!
However, if that were the end of the story, then conditional histories would obviously be the wrong tool for defining conditional orthogonality. Conditional orthogonality is supposed to tell us about conditional independence in the probability distribution, and we know from causal graphs that conditioning can create dependence. E.g., in the Bayes net A→B←C, A and C are independent, but if we condition on B, A and C become dependent. Therefore, conditional histories need to grow somehow. The second clause in the definition can be seen as artificially adding things to the history in order to represent that A and C have lost their independence.
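To make the collider point concrete, here is a quick numerical sanity check (an illustrative toy distribution only, nothing factored-set-specific): A and C are independent fair coins and B = A XOR C; marginally A and C are independent, but given a value of B they are not.

```python
# Toy sanity check: in the collider A -> B <- C, conditioning on B
# creates a dependence between A and C.
from itertools import product

# A and C are independent fair coins; B = A XOR C.
# Joint distribution over outcomes (A, B, C).
joint = {(a, a ^ c, c): 0.25 for a, c in product([0, 1], repeat=2)}

def marginal(dist, keep):
    """Marginalize a dict {outcome_tuple: prob} onto the given coordinate indices."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def condition(dist, index, value):
    """Condition on outcome[index] == value and renormalize."""
    kept = {o: p for o, p in dist.items() if o[index] == value}
    total = sum(kept.values())
    return {o: p / total for o, p in kept.items()}

def independent(dist, i, j):
    """Check whether coordinates i and j are independent under dist."""
    pij = marginal(dist, (i, j))
    pi, pj = marginal(dist, (i,)), marginal(dist, (j,))
    return all(
        abs(pij.get((x, y), 0.0) - pi.get((x,), 0.0) * pj.get((y,), 0.0)) < 1e-9
        for x in (0, 1)
        for y in (0, 1)
    )

print(independent(joint, 0, 2))                   # True:  A and C independent
print(independent(condition(joint, 1, 0), 0, 2))  # False: dependent given B = 0
```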
What I don’t yet see is how to relate these phenomena in detail. I find it surprising that the second clause depends only on E, not on X. It seems important to note that we are not simply adding the history of E[1] into the answer. Instead, it asks that the history of E itself “factors” into the part within h(X|E) and the part outside. If E and X are independent, then only the first clause comes into play. So the implications of the second clause do depend on X, even though the clause doesn’t mention X.
So, is there a nice way to see how the second clause adds an “artificial history” to capture the new dependencies which X might gain when we condition on E?
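To check my own reading, here is a brute-force sketch on the smallest example I could think of: two binary factors a and c, with X the partition “what is a?”. I’m taking the first clause to say “within E, the chosen coordinates determine X,” and the second clause to be the factoring condition described above (“membership in E can be checked separately on the chosen coordinates and on the remaining ones”); if that paraphrase is off, the sketch is off in the same way. Under this reading, conditioning on “a = 0” drops a from the history (the first clause at work), while conditioning on “a XOR c = 0” pulls c in (the second clause at work), which looks like exactly the collider-style dependence I’m asking about.

```python
# Brute-force conditional histories on a tiny factored set, under my
# paraphrase of the two clauses (a sketch, not the official definition).
from itertools import combinations, product

FACTORS = ("a", "c")                        # two binary factors
S = list(product([0, 1], repeat=2))         # S = {0,1}^2, points s = (a, c)

def agree(s, t, idxs):
    """Do s and t agree on every factor index in idxs?"""
    return all(s[i] == t[i] for i in idxs)

def clause1(H, X, E):
    """Within E, the H-coordinates determine the value of X."""
    return all(X(s) == X(t) for s in E for t in E if agree(s, t, H))

def clause2(H, E):
    """E 'factors' across the split H vs. rest: combining the H-part of one
    point of E with the complementary part of another stays inside E."""
    rest = [i for i in range(len(FACTORS)) if i not in H]
    return all(
        r in E
        for s in E
        for t in E
        for r in S
        if agree(r, s, H) and agree(r, t, rest)
    )

def conditional_history(X, E):
    """Smallest H (by size) satisfying both clauses, found by brute force."""
    for size in range(len(FACTORS) + 1):
        for H in combinations(range(len(FACTORS)), size):
            if clause1(H, X, E) and clause2(H, E):
                return sorted(FACTORS[i] for i in H)

def X(s):
    """The partition 'what is the value of a?'"""
    return s[0]

E_all = S                                    # no conditioning
E_a0 = [s for s in S if s[0] == 0]           # the event "a = 0"
E_xor = [s for s in S if s[0] ^ s[1] == 0]   # the event "a XOR c = 0"

print(conditional_history(X, E_all))  # ['a']      -- the plain history h(X)
print(conditional_history(X, E_a0))   # []         -- clause 1 drops a
print(conditional_history(X, E_xor))  # ['a', 'c'] -- clause 2 pulls c in
```

The search just enumerates subsets in order of size, so it relies only on the two clauses themselves and not on any closure properties of histories.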
@Scott Garrabrant
[1] In this paragraph, I am conflating the set E⊆S with the partition {E,S−E}.