It’s not the programming notation that makes it work for me (though that helps a little). It’s not the particular example either, though I do think it’s a bit better than the abstract mammogram example. There are just way fewer numbers.
It’s because the notation on each line contains two numbers, both of which are. . . primitives? atomic pieces? I can do them in one step. (My inner monologue goes something like “3:2 means the first thing happens three times for every two times the second thing happens, over the long run anyway. There’s five balls in the bag, three of the first colour and two of the second. Now more balls than that, but keep the ratio.”)
And then if I want to do an update, I just need four numbers, each of which makes sense on its own, each of which is used in one place. 1:100, 20:50, multiply the left by the left (20) and the right by the right (5000) and now I have two numbers again (20:5000). I can usually simplify that in my head (2:500, okay now 1:250). The line “The colon is a thick & rubbery barrier. Yep with yep and nope with nope” helps a lot; I’m reminded to keep all the yeps on the left and the nopes on the right. Because multiplication is associative and commutative, I can just keep doing that at each new update, never dealing with more than four numbers. If I’d rather (or if I’m using pen and paper) I can just write out a dozen updates and get the products after.
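For anyone who would rather see the arithmetic as code, here is a minimal Python sketch of the left-with-left, right-with-right update described above. The function names `update` and `update_all` are made up for illustration, and the second likelihood (5:1) is a hypothetical extra observation that is not part of the original example.

```python
from math import gcd

def update(prior, likelihood):
    """Multiply yep with yep and nope with nope, then simplify the ratio."""
    yes = prior[0] * likelihood[0]
    no = prior[1] * likelihood[1]
    g = gcd(yes, no)
    return (yes // g, no // g)

def update_all(prior, likelihoods):
    """Apply a whole list of updates, one after another."""
    odds = prior
    for likelihood in likelihoods:
        odds = update(odds, likelihood)
    return odds

# The example from the text: 1:100 prior, 20:50 likelihood.
print(update((1, 100), (20, 50)))  # (1, 250)

# Hypothetical second observation with likelihood 5:1, applied in either order.
print(update_all((1, 100), [(20, 50), (5, 1)]))  # (1, 50)
print(update_all((1, 100), [(5, 1), (20, 50)]))  # (1, 50)
```

The last two lines give the same answer because only the products on each side of the colon matter, not the order in which the updates are multiplied in.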
Compare this sucker:
Four numbers used in six places. I’ll be the village idiot and admit I cannot reliably keep a phone number in my head without mental tricks. I have lost count of the number of times I have swapped P(A|B) and P(B|A) accidentally. The numbers aren’t arranged on the page in a way that helps my intuition, like yeps being on top and nopes on the bottom or something.
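To make the comparison concrete, here is a sketch of the same 1:100 and 20:50 update done the probability way. The formula itself did not survive in the text above, so this assumes it is the usual expanded form of Bayes' rule, P(A|B) = P(B|A)P(A) / (P(B|A)P(A) + P(B|not A)P(not A)), which does use four numbers in six places. Reading 20:50 as "20% if A, 50% if not A" is also an assumption, and `posterior_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def posterior_probability(p_a, p_b_given_a, p_b_given_not_a):
    """Expanded probability form of Bayes' rule (assumed, see lead-in)."""
    p_not_a = 1 - p_a
    numerator = p_b_given_a * p_a
    return numerator / (numerator + p_b_given_not_a * p_not_a)

# Odds of 1:100 correspond to a probability of 1/101.
p_a = Fraction(1, 101)

# Assumption: 20:50 read as P(B|A) = 20% and P(B|not A) = 50%.
post = posterior_probability(p_a, Fraction(20, 100), Fraction(50, 100))

print(post)               # 1/251
print(post / (1 - post))  # 1/250, i.e. odds of 1:250
```

Both routes land on the same answer; the odds version just gets there without converting to a probability and back.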
Or compare the explanation at the first link you shared.
Bayes’ rule in the odds form says that for every pair of hypotheses, their relative prior odds, times the relative likelihood of the evidence, equals the relative posterior odds.
Let $H$ be a vector of hypotheses $H_1, H_2, \ldots$ Because Bayes’ rule holds between every pair of hypotheses in $H$, we can simply multiply an odds vector by a likelihood vector in order to get the correct posterior vector:
$$O(H) \times L_e(H) = O(H \mid e)$$
where $O(H)$ is the vector of relative prior odds between all the $H_i$, $L_e(H)$ is the vector of relative likelihoods with which each $H_i$ predicted $e$, and $O(H \mid e)$ is the relative posterior odds between all the $H_i$.
In fact, we can keep multiplying by likelihood vectors to perform multiple updates at once:
$$O(H) \times L_{e_1}(H) \times L_{e_2}(H \wedge e_1) \times L_{e_3}(H \wedge e_1 \wedge e_2) = O(H \mid e_1 \wedge e_2 \wedge e_3)$$
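As an illustration only, the quoted vector formula amounts to multiplying the odds position by position. The sketch below uses a hypothetical helper `multiply_odds` and made-up numbers for three hypotheses; none of these values come from the linked page. Each likelihood vector is assumed to already be conditioned on the evidence seen before it, as the formula requires.

```python
from math import gcd, prod
from functools import reduce

def multiply_odds(*vectors):
    """Multiply odds vectors position by position, then simplify."""
    product = [prod(column) for column in zip(*vectors)]
    g = reduce(gcd, product)
    return tuple(x // g for x in product)

prior = (3, 2, 4)          # made-up prior odds 3:2:4
likelihood_e1 = (1, 4, 2)  # made-up likelihoods of the first observation
likelihood_e2 = (2, 1, 1)  # made-up likelihoods of the second, given the first

print(multiply_odds(prior, likelihood_e1))                 # (3, 8, 8)
print(multiply_odds(prior, likelihood_e1, likelihood_e2))  # (3, 4, 4)
```

With two hypotheses this collapses to the ordinary left-with-left, right-with-right update on a pair like 1:100.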
I am trying to express that I find that explanation more complicated. I don’t know what L_e means. It took me a bit to remember what O stands for. If you are ever trying to explain something to the general population and you need LaTeX to do it, stop what you are doing and come up with a new plan. Seven paragraphs into that page we get the odds form with the colon, and it’s for three different hypotheses; I’m aware you can write odds like 3:2:4 but that’s less common. Drunk people who flunked high school routinely calculate 3:2 in pubs! Start with the two-hypothesis version, then maybe mention that you can do three hypotheses at once. “Shortest Goddamn Bayes Guide Ever” uses strictly symbols on a standard keyboard and math which is within the limits of an on-track fourth grader. It’s less than two hundred words! The thing would fit in three tweets!
I think that is a masterwork of pedagogy and editing, worthy of praise and a prominent place.
If there’s a way to make this version work for non-naive updates that seems good, and my understanding is it’s mostly about saying for each new line “given that the above has happened, what are the odds of this observation?” instead of “what are the odds of this observation assuming I haven’t seen the above?” It’s not like the P(A|B) formulation prevents people from making that exact mistake. (Citation: I have made that exact mistake.)
If there’s a way to make this version work for non-naive updates that seems good, and my understanding is it’s mostly about saying for each new line “given that the above has happened, what are the odds of this observation?”
Yes, that’s it. Yeah, I am not trying to defend the probability version of Bayes’ rule. When I was trying to explain Bayes’ rule to my wordcel gf, I was also using the odds ratio.
Interesting! Makes sense.