I like this post a lot. But I wanted to second Morpheus’ point that it’s not actually quite correct.
Instead of this:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF YEP
    NOPE *= CHANCE OF E IF NOPE
It should be:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF {YEP & ALL PREVIOUSLY SEEN E}
    NOPE *= CHANCE OF E IF {NOPE & ALL PREVIOUSLY SEEN E}
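One way to see why the conditioning on earlier evidence is needed is the chain rule for probabilities. The notation here is my own assumption, not the post's: H stands for YEP, ¬H for NOPE, and E_1, ..., E_n are the pieces of evidence in the order they are processed.

```latex
\frac{P(H \mid E_1, \dots, E_n)}{P(\neg H \mid E_1, \dots, E_n)}
  = \frac{P(H)}{P(\neg H)}
    \prod_{i=1}^{n}
    \frac{P(E_i \mid H, E_1, \dots, E_{i-1})}
         {P(E_i \mid \neg H, E_1, \dots, E_{i-1})}
```

Each factor in the product is one pass through the loop. The factors only simplify to P(E_i | H) / P(E_i | ¬H) when the evidence is independent given H and given ¬H.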
The recipe in the post is only correct if every piece of evidence is independent of the others given YEP, and likewise given NOPE. But that’s rarely true. Assuming it anyway is (in my view) the most common mistake people actually make when trying to apply Bayesian reasoning in practice, and it leads to the kinds of crazy over-confident posteriors we see in things like the Rootclaim Covid-19 debate.
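To make the over-confidence concrete, here is a minimal sketch with made-up numbers. It assumes E2 is just a second report copied from the same source as E1, so once E1 has been seen, E2 is certain under either hypothesis. The names `chance_alone`, `chance_given_seen`, and `final_odds` are hypothetical and only exist for this illustration.

```python
from fractions import Fraction

# Made-up numbers. Chance of each E given only the hypothesis,
# ignoring anything seen earlier (the independence assumption).
chance_alone = {
    "E1": {"YEP": Fraction(4, 5), "NOPE": Fraction(1, 5)},
    "E2": {"YEP": Fraction(4, 5), "NOPE": Fraction(1, 5)},
}

# Chance of each E given the hypothesis AND all previously seen evidence.
# E2 is a copy of E1, so after seeing E1 it is certain either way.
chance_given_seen = {
    "E1": {"YEP": Fraction(4, 5), "NOPE": Fraction(1, 5)},
    "E2": {"YEP": Fraction(1), "NOPE": Fraction(1)},
}


def final_odds(table):
    """Run the multiply-as-you-go loop and return YEP/NOPE as one number."""
    yep, nope = Fraction(1), Fraction(1)  # made-up initial odds of 1:1
    for e in ("E1", "E2"):
        yep *= table[e]["YEP"]
        nope *= table[e]["NOPE"]
    return yep / nope


naive_odds = final_odds(chance_alone)         # counts the same report twice
correct_odds = final_odds(chance_given_seen)  # counts it once
print(naive_odds, correct_odds)  # prints "16 4"
```

With these numbers the naive recipe ends at 16:1 while the corrected one ends at 4:1, and the gap grows with every extra copy of the same evidence.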
I reiterate that I like this post a lot! The point of the post isn’t to make some novel mathematical contribution but to explain things in a vivid way. I think it succeeds there, and with higher “production values” it might have the potential to be wildly influential.
I guess that depends on what “ASSUME E” means. It’s correct if interpreted the right way, but I think it’s pretty ambiguous compared to the rest of the code, so personally I don’t love it. I also think it’s a bit confusing to use “ASSUME E” and also use “IF YEP” when those are both conditioning operations. Maybe it would be slightly better if you wrote something like
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF YEP AND ASSUMPTIONS
    NOPE *= CHANCE OF E IF NOPE AND ASSUMPTIONS
    ASSUME E // DO NOT DOUBLE COUNT
Or if you’re OK using a little set notation:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
SEEN = {}
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF {YEP} ∪ SEEN
    NOPE *= CHANCE OF E IF {NOPE} ∪ SEEN
    SEEN = SEEN ∪ {E} // DO NOT DOUBLE COUNT
Like my original proposal of YEP *= CHANCE OF E IF {YEP & ALL PREVIOUSLY SEEN E}, these are both (to me) unambiguous. But they’re kind of clunky. I’m not sure there is a good solution!
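For anyone who wants to check that the SEEN version really gives the right answer, here is a rough Python sketch. Everything in it is an assumption for illustration: the toy table `WORLDS` and its numbers are made up, and `mass`, `chance`, and `update` are hypothetical helper names. It compares the loop's result against the odds read directly off the table.

```python
from fractions import Fraction

# Made-up toy world. Each row is:
# (hypothesis, set of facts that hold in this world, probability of this world).
WORLDS = [
    ("YEP", frozenset({"A", "B"}), Fraction(6, 20)),
    ("YEP", frozenset({"A"}), Fraction(2, 20)),
    ("YEP", frozenset({"B"}), Fraction(1, 20)),
    ("YEP", frozenset(), Fraction(1, 20)),
    ("NOPE", frozenset({"A", "B"}), Fraction(2, 20)),
    ("NOPE", frozenset({"A"}), Fraction(1, 20)),
    ("NOPE", frozenset({"B"}), Fraction(1, 20)),
    ("NOPE", frozenset(), Fraction(6, 20)),
]


def mass(hypothesis, facts):
    """Total probability of worlds where the hypothesis and all facts hold."""
    return sum(p for h, held, p in WORLDS if h == hypothesis and facts <= held)


def chance(e, hypothesis, seen):
    """CHANCE OF e IF {hypothesis} plus everything in SEEN."""
    return mass(hypothesis, seen | {e}) / mass(hypothesis, seen)


def update(evidence, yep=Fraction(1), nope=Fraction(1)):
    """The loop from the pseudocode; the default 1:1 odds match the toy table."""
    seen = frozenset()
    for e in evidence:
        yep *= chance(e, "YEP", seen)
        nope *= chance(e, "NOPE", seen)
        seen = seen | {e}  # do not double count
    return yep / nope


odds = update(["A", "B"])
both = frozenset({"A", "B"})
direct = mass("YEP", both) / mass("NOPE", both)  # odds read straight off the table
print(odds, direct)  # prints "3 3"
```

In this toy table the order of the evidence doesn't matter either: `update(["B", "A"])` also comes out at 3.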
Please no set notation! The arrow brackets are on thin ice I think.
I meant what I said above: I think there’s something really good about having a Bayes explanation that requires no symbols that aren’t on a standard keyboard and no math an on-track fourth grader wouldn’t know.
(And also, thank you both for improving this! I recognize you two are the ones in the arena at the moment and I wish I was able to help refine this more.)
Another proposal! (I think I’m too close to this to really judge what’s best.)
// ODDS = YEP:NOPE
YEP = MAKE SOMETHING UP WHATEVER LOL
ASSUME YEP
FOR EACH E IN EVIDENCE
    YEP *= HOW SURPRISING IS E?
    ASSUME E
THROW ASSUMPTIONS IN TRASH
NOPE = MAKE SOMETHING UP WHATEVER LOL
ASSUME NOPE
FOR EACH E IN EVIDENCE
    NOPE *= HOW SURPRISING IS E?
    ASSUME E
That is a clever way to say it succinctly. However, I worry that I only understood it because I was already aware of the concept. Perhaps I should show this to some smart friends with basic math chops who don’t already know about the whole naive Bayes thing.
I just added the ASSUME E line at the end. Is the post code correct now?
You fixed it!! Thank you!!