A not-quite-rigorous explanation of the thing in 18.15:
E_aa is, by construction, only relevant to A. A_p was defined (in 18.1) to screen off all previous knowledge about A. So in fact, if we are given evidence E_aa but then given evidence A_p, then E_aa becomes completely irrelevant: it’s no longer telling us anything about A, but it never told us anything about anything else. Therefore P(F|A_p E_aa) can be simplified to P(F|A_p).
That’s not quite right, though. By construction, every part of E_aa is relevant to A, but that doesn’t mean it’s irrelevant to everything else. For example, it could sit in this Bayes net: E_aa --> A --> F. Then it’d be relevant to F as well.
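That relevance claim is easy to check numerically. Here’s a minimal sketch of the chain E_aa --> A --> F with made-up conditional probabilities (none of these numbers come from Jaynes); summing out A shows that P(F | E_aa) depends on E_aa, i.e. E_aa is relevant to F:

```python
# Toy chain E_aa -> A -> F with illustrative (made-up) probabilities.
p_a_given_e = {True: 0.9, False: 0.2}   # P(A=True | E_aa)
p_f_given_a = {True: 0.8, False: 0.1}   # P(F=True | A)

def p_f_given_e(e):
    """P(F=True | E_aa=e), computed by summing over A."""
    return sum(p_f_given_a[a] * (p_a_given_e[e] if a else 1 - p_a_given_e[e])
               for a in (True, False))

# The two conditional probabilities differ, so E_aa is relevant to F.
print(p_f_given_e(True), p_f_given_e(False))  # 0.73 vs 0.24
```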
Although… thinking about that Bayes net might answer other questions...
Hmm. Remember that A_p screens A off from everything else. I think that means A’s only direct connection is to A_p: everything else has to be connected through A_p.
So the above Bayes net is really
E_aa --> A_p --> F
with another arrow from A_p to A.
Which would mean that A_p screens E_aa off from F, which is exactly what 18.15 says.
That Bayes net encodes the assumption that E_aa’s and F’s only relevance to each other is that they’re both evidence about A, which I think is often true.
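The screening-off claim can also be checked directly on a toy version of that net. Here’s a sketch with A_p discretized to two values of p and entirely made-up numbers, where F is a child of A_p only; computing P(F | A_p, E_aa) from the joint shows that conditioning on E_aa changes nothing once A_p is given:

```python
# Toy net E_aa -> A_p -> F (the A_p -> A arrow doesn't affect this check).
# A_p is discretized to p in {0.2, 0.9}; all probabilities are illustrative.
p_e = {True: 0.4, False: 0.6}
p_ap_given_e = {True:  {0.9: 0.7, 0.2: 0.3},
                False: {0.9: 0.1, 0.2: 0.9}}
p_f_given_ap = {0.9: 0.85, 0.2: 0.15}   # F depends on A_p alone

def joint(e, ap, f):
    """P(E_aa=e, A_p=ap, F=f) under the chain factorization."""
    pf = p_f_given_ap[ap] if f else 1 - p_f_given_ap[ap]
    return p_e[e] * p_ap_given_e[e][ap] * pf

def p_f_given(ap, e=None):
    """P(F=True | A_p=ap) or P(F=True | A_p=ap, E_aa=e), from the joint."""
    if e is None:
        num = sum(joint(e2, ap, True) for e2 in (True, False))
        den = sum(joint(e2, ap, f)
                  for e2 in (True, False) for f in (True, False))
    else:
        num = joint(e, ap, True)
        den = joint(e, ap, True) + joint(e, ap, False)
    return num / den

# P(F | A_p) == P(F | A_p, E_aa) for every value of A_p and E_aa:
for ap in (0.9, 0.2):
    print(p_f_given(ap), p_f_given(ap, True), p_f_given(ap, False))
```

This is just d-separation in the chain: once A_p is observed, the only path from E_aa to F is blocked, which is the graphical version of 18.15.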
Hmm. When I have some time I’m gonna draw Bayes nets to represent all of Jaynes’ assumptions in this chapter, and when something looks unjustified, figure out what Bayes net structure would justify it.
In fact, I skipped over this before, but this approach is actually recommended in the comments of that errata page I posted:
p. 554, eqn. (18.1): This definition cannot hold true for arbitrary propositions $E$; for example, what if $E$ implies $A$? This kind of problem occurs throughout the chapter. I don’t think you can really discuss the $A_p$ distribution properly without explicitly introducing the notion of a sample space and organizing one’s information about the sample space as a graphical model in which $A$ has a single parent variable $\theta$, with $A_p$ defined as the proposition $\theta = p$. For those unfamiliar with graphical models / Bayesian networks, I recommend the following book:
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (J. Pearl).