Has anyone here read up through ch. 18 of Jaynes’ PT:LoS? I just spent two hours trying to derive 18.11 from 18.10. That step is completely opaque to me. Can anybody who’s read it help?
You can explain in a comment, or we can have a conversation. I’ve got gchat and other stuff. If you message me or comment we can work it out. I probably won’t take long to reply, I don’t think I’ll be leaving my computer for long today.
EDIT: I’m also having trouble with 18.15. Jaynes claims that P(F|A_p E_aa) = P(F|A_p) but justifies it with 18.1… I just don’t see how that follows from 18.1.
EDIT 2: It hasn’t answered my question, but there’s an online errata page for this book: http://ksvanhorn.com/bayes/jaynes/ Chapter 18 has a very unfinished feel, and I think the errata will help with other confusions I run into in it.
Yeah, so to add some redundancy for y’all, here’s the text surrounding the equations I’m having trouble with.
The 18.10 to 18.11 jump I’m having trouble with is the one in this part of the text:
But suppose that, for a given E_b, (18.8) holds independently of what E_a might be; call this ‘strong irrelevance’. Then we have
(what I’m calling 18.10)
But if this is to hold for all (A_p|E_a), the integrands must be the same:
(what I’m calling 18.11, and can’t derive)
And equation 18.15, which I can’t justify, is in this part of the text:
But then, by definition (18.1) of A_p, we can see that A_p automatically cancels out E_aa in the numerator: (F|A_pE_aa)=(F|A_p). And so we have (18.13) reduced to
(what I’m calling 18.15, and don’t follow the justification for)
A not-quite-rigorous explanation of the thing in 18.15:
E_aa is, by construction, only relevant to A. A_p was defined (in 18.1) to screen off all previous knowledge about A. So in fact, if we are given evidence E_aa but then given evidence A_p, then E_aa becomes completely irrelevant: it’s no longer telling us anything about A, but it never told us anything about anything else. Therefore P(F|A_p E_aa) can be simplified to P(F|A_p).
That’s not true though. By construction, every part of it is relevant to A.
That doesn’t mean it’s not relevant to anything else. For example, it could be in this Bayes net: E_aa --> A --> F. Then it’d be relevant to F.
Although… thinking about that Bayes net might answer other questions...
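To make the chain E_aa --> A --> F concrete, here is a minimal sketch with made-up numbers (every probability below is an assumption chosen for illustration, not something from Jaynes). It enumerates the joint distribution and checks two things: E is relevant to F on its own, but once A is known E stops mattering.

```python
from itertools import product

# Hypothetical numbers for a three-node chain E -> A -> F (all binary).
p_e = {1: 0.3, 0: 0.7}            # P(E = e)
p_a_given_e = {1: 0.9, 0: 0.2}    # P(A = 1 | E = e)
p_f_given_a = {1: 0.8, 0: 0.1}    # P(F = 1 | A = a)

def bern(p_one, value):
    """Probability of a binary value when P(value = 1) is p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(e, a, f):
    """Joint probability implied by the chain factorization."""
    return p_e[e] * bern(p_a_given_e[e], a) * bern(p_f_given_a[a], f)

def prob(query, given):
    """P(query | given) by brute-force enumeration; both are dicts."""
    num = den = 0.0
    for e, a, f in product((0, 1), repeat=3):
        world = {"E": e, "A": a, "F": f}
        if all(world[k] == v for k, v in given.items()):
            w = joint(e, a, f)
            den += w
            if all(world[k] == v for k, v in query.items()):
                num += w
    return num / den

# E alone shifts the probability of F ...
f_given_e1 = prob({"F": 1}, {"E": 1})
f_given_e0 = prob({"F": 1}, {"E": 0})
# ... but once A is known, E no longer matters.
f_given_a1_e1 = prob({"F": 1}, {"A": 1, "E": 1})
f_given_a1_e0 = prob({"F": 1}, {"A": 1, "E": 0})

print(f_given_e1, f_given_e0)
print(f_given_a1_e1, f_given_a1_e0)
```

With these numbers P(F|E) comes out near 0.73 for E=1 versus 0.24 for E=0, while P(F|A=1,E) is 0.8 either way. So in this net E is relevant to F, but only through A.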
Hmm. Remember that A_p screens A from everything. I think that means that A’s only connection is to A_p; everything else has to be connected through A_p.
So the above Bayes net is really
E_aa --> A_p --> F
With another arrow from A_p to A.
Which would mean that A_p screens E_aa from F, which is what 18.15 says.
The above Bayes net represents an assumption that E_aa and F’s only relevance to each other is that they’re both evidence of A, which is often true I think.
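A quick numerical check of that net, as a sketch only. The three-point support for p, the variable name `theta` (standing in for A_p as the proposition "theta = p"), and all the probabilities are assumptions made up for illustration. The structure is E --> theta --> F with a second arrow theta --> A.

```python
from itertools import product

thetas = (0.2, 0.5, 0.9)                          # hypothetical support for p
p_e = {0: 0.6, 1: 0.4}                            # P(E = e)
p_theta_given_e = {0: (0.5, 0.3, 0.2),            # P(theta | E = 0)
                   1: (0.1, 0.3, 0.6)}            # P(theta | E = 1)
p_f_given_theta = {0.2: 0.3, 0.5: 0.5, 0.9: 0.7}  # P(F = 1 | theta)

def bern(p_one, value):
    """Probability of a binary value when P(value = 1) is p_one."""
    return p_one if value == 1 else 1.0 - p_one

def worlds():
    """Yield (assignment, probability) for every joint state of the net."""
    for e, (i, th), a, f in product((0, 1), enumerate(thetas), (0, 1), (0, 1)):
        w = (p_e[e] * p_theta_given_e[e][i]
             * bern(th, a)                        # P(A | theta = p) = p
             * bern(p_f_given_theta[th], f))
        yield {"E": e, "theta": th, "A": a, "F": f}, w

def prob(query, given):
    """P(query | given) by brute-force enumeration; both are dicts."""
    num = den = 0.0
    for world, w in worlds():
        if all(world[k] == v for k, v in given.items()):
            den += w
            if all(world[k] == v for k, v in query.items()):
                num += w
    return num / den

# Largest deviation from P(F | theta, E) = P(F | theta), the 18.15-style claim.
max_f_gap = max(abs(prob({"F": 1}, {"theta": th, "E": e}) - p_f_given_theta[th])
                for th in thetas for e in (0, 1))
# Largest deviation from P(A | theta = p, E) = p, the 18.1-style property.
max_a_gap = max(abs(prob({"A": 1}, {"theta": th, "E": e}) - th)
                for th in thetas for e in (0, 1))
# Without theta, E is still relevant to F.
f_given_e0 = prob({"F": 1}, {"E": 0})
f_given_e1 = prob({"F": 1}, {"E": 1})

print(max_f_gap, max_a_gap)
print(f_given_e0, f_given_e1)
```

Both gaps come out at floating-point zero, while P(F|E) still differs between E=0 and E=1. This only shows that the screening-off step holds if the net really has this shape; it says nothing about whether Jaynes’ setup justifies the shape.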
Hmm. When I have some time I’m gonna draw Bayes nets to represent all of Jaynes’ assumptions in this chapter, and when something looks unjustified, figure out what Bayes net structure would justify it.
In fact, I skipped over this before but this is actually recommended in the comments of that errata page I posted:
p. 554, eqn. (18.1): This definition cannot hold true for arbitrary propositions $E$; for example, what if $E$ implies $A$? This kind of problem occurs throughout the chapter. I don’t think you can really discuss the $A_p$ distribution properly without explicitly introducing the notion of a sample space and organizing one’s information about the sample space as a graphical model in which $A$ has a single parent variable $\theta$, with $A_p$ defined as the proposition $\theta = p$. For those unfamiliar with graphical models / Bayesian networks, I recommend the following book:
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (J. Pearl).
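The erratum’s objection can be spelled out in one line. This is a sketch that assumes (18.1) has the form P(A|A_p E) ≡ p, which is how it is described in this thread; the parent variable θ is the erratum’s own suggestion.

```latex
% Suppose E implies A, and A_p E is not self-contradictory. Then
\[
  E \Rightarrow A
  \quad\Longrightarrow\quad
  P(A \mid A_p E) = 1 ,
\]
% which conflicts with the assumed form of (18.1),
\[
  P(A \mid A_p E) \equiv p ,
\]
% for every p < 1. In the graphical-model reading, with A_p defined as
% \theta = p and \theta the only parent of A, the property is only
% claimed for evidence E that reaches A through \theta:
\[
  P(A \mid \theta = p,\, E) = P(A \mid \theta = p) = p .
\]
```

So the definition can only work for a restricted class of E, and the Bayes net is one way of saying which class that is.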
I’ve just looked and I have no idea either. If anyone wants to help there’s a copy of the book here.
EDIT: The numbers in that copy are off by 1 from the book. “18.10” = “18-9” and so on.
Yeah, I don’t see how those make sense either.