I looooove that coin flip section! Cheers
Shoot! You’re right! I think I was wrong this whole time on the impact of dropping the prior term. Cuz data term * prior term is like multiplying the distributions, and dropping the prior term is like multiplying the data distribution by the uniform one. Thanks for sticking with me :)
Now I’m doubting myself >_> is it pretty different?? Anyone lurking reading this who knows whether uniform prior is very different than just dropping the prior term?
It seems like in practice, when there’s a lot of data, people like Jaynes and Gelman are happy to assign low-information (or “uninformative”) priors, knowing that with a lot of data the prior ends up getting washed out anyway. So just slapping a uniform prior down might be OK in a lot of real-world situations. This is, I think, pretty different from just dropping the prior completely, but it gets the same job done.
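To make the “washed out anyway” point concrete, here’s a minimal sketch (plain Python; the 600-heads-out-of-1000 flip counts are made up for illustration) showing that two quite different Beta priors end up nearly agreeing once there’s a lot of data:

```python
from fractions import Fraction

def posterior_mean(a, b, heads, tails):
    """Posterior mean after conjugate Beta-Binomial updating:
    Beta(a, b) prior + data -> Beta(a + heads, b + tails)."""
    return Fraction(a + heads, a + b + heads + tails)

heads, tails = 600, 400                           # hypothetical data
uniform = posterior_mean(1, 1, heads, tails)      # Beta(1,1): low-information prior
strong = posterior_mean(10, 10, heads, tails)     # a more opinionated prior

print(float(uniform))  # ~0.5998
print(float(strong))   # ~0.5980
```

With only a handful of flips the two priors would disagree noticeably; after a thousand, the difference is in the third decimal place.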
I can’t see anything wrong in what you’ve said there, but I still have to insist, without good argument, that dropping P(A_p|I) is incorrect. In my vague defense, consider the two A_p distributions drawn on p558, for the penny and for Mars. Those distributions are as different as they are because of the different prior information. If it were correct to drop the prior term a priori, I think those distributions would look the same?
Isn’t A_p the distribution over how often the coin will come up heads, or the probability of life on Mars? If so… there’s no way those things could be indifferent to the background information. A core tenet of the philosophy outlined in this book is that when you ignore prior information without good cause, things get wacky and fall apart. This is part of desiderata iii from chapter 2: “The robot always takes into account all of the evidence it has relevant to a question. It does not arbitrarily ignore some of the information, basing its conclusions only on what remains.”
(Then Jaynes ignores information in later chapters because it doesn’t change the result… so this desideratum is easier said than done… but yeah)
Ah, wait, I misunderstood. You’re interested in the mode, huh—that’s why you’re taking the argmax. In my Beta(3,1) example, the mode is also 1. So no problem there. I was focused on the mean in my previous comment. I still think dropping the prior is bad but now I’m not sure how to argue the point…
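For anyone checking the numbers, a tiny numeric sketch (plain Python) of why the mode of Beta(3,1) sits at 1: its density is proportional to x² on [0,1], which is increasing, so the argmax is at the right endpoint, even though the mean is only 3/4.

```python
# Beta(3,1) pdf is proportional to x^2 on [0, 1].
xs = [i / 1000 for i in range(1001)]
density = [x**2 for x in xs]

# Mode = argmax of the density; the mean is (a)/(a+b) = 3/4 by contrast.
mode = xs[max(range(len(xs)), key=density.__getitem__)]
print(mode)  # 1.0
```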
I don’t think so. Like you, I don’t really understand this A_p stuff philosophically. But the step where you drop the prior P(A_p|I) to obtain P(A_p|D,I) ∝ P(D|A_p,I) is, I think, not warranted. Dropping the prior term outright like that… I don’t think there are many cases where that’s acceptable. Doing so does not reflect a state of low knowledge, but instead a state of pretty strong knowledge. To give intuition on what I mean:

Contrast with the prior that reflects the state of knowledge “All I know is that H is possible and T is possible”. This is closer to Jaynes’ example about whether there’s life on Mars. The prior that reflects that state of knowledge is Beta(1,1), which after two heads come up becomes Beta(3,1). The mean of Beta(3,1) is 3/4 = 0.75. This is much less than the 1.0 you arrive at.

A prior that gives 1.0 after the data H,H might be something like: “This coin is very unfair in a well-known, specific way: it either always gives heads, always gives tails, or gives heads and tails alternating: H,T,H,T,…” Under that prior, the data H,H would give you a probability of near 1 that H is next. But that’s a prior that reflects definite, strong knowledge of the coin.

Maybe this argument changes given the nature of A_p, which again I don’t really understand. But whatever it is, I don’t think it’s valid to assume the prior away.
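For concreteness, the Beta(1,1) → Beta(3,1) update described above can be sketched in a few lines of Python (conjugate Beta updating, nothing beyond what’s already stated):

```python
from fractions import Fraction

# "All I know is that H and T are both possible" -> Beta(1, 1) prior.
a, b = 1, 1
heads, tails = 2, 0                    # the data: H, H

# Conjugate update: Beta(a, b) + data -> Beta(a + heads, b + tails).
a_post, b_post = a + heads, b + tails  # Beta(3, 1)

mean = Fraction(a_post, a_post + b_post)
print(mean)  # 3/4 -- not the 1.0 that the bare likelihood argmax gives
```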
Jaynes has a wonderful section in the same book where he discusses coin-flipping in depth. He flips a pickle jar lid in his kitchen in different ways to demonstrate how the method of flipping is critical—I love this whole section—and ends by saying that it’s a “problem of mechanics, highly complicated”. Section 10.3 (p317), How to cheat at coin and die tossing.
I’d thought he talked about this “probability of a probability” kind of thing in the chapter on the A_p distribution, and page 560 does have that phrase (though later on the page he says “The term ‘probability of a probability’ misses the point”…), but reading it again now it seems like I didn’t really understand this section. But give pages 560–563 a shot anyway.
Very interesting! Thanks!
Thanks! I’ve been vaguely thinking I’d like to be able to cycle but I think I have to reduce my dependence first, to not be dead on the off days.
That helps me better see what you mean—thanks.
The focus on caffeine’s effects through their action on adenosine binding is a useful frame that I hadn’t been thinking in terms of—thanks! However, when you say:

“Taking more caffeine in the afternoon is counterproductive, since it is maintaining the same high blood concentrations of active molecules. To reduce resistance you need to give your body time at low levels of stimulator molecules so it gets rid of the excess receptor sites.”

If I take caffeine in the afternoon, like, 4 hours after the morning dose, sure, I get it, that’s bad. But if I take it 10 hours after, wouldn’t I be giving my body time at (relatively) lower levels, thus prompting it to get rid of the excess sites? The instruction to give my body time at lower levels does not (at my level of understanding) tell me whether total daily intake or maximum dose is more important. Does that make sense?
I’m actually not sure I can and it feels rough already to decrease. My “so far so good” was more meant as like… “so far, logically straightforward”, not “so far, good clear progress”. My life is a little bit consistently worse under these decreases and I do want to finish the reductions sooner rather than later, without accidentally burning progress through use of night caffeine.
Love this post! It’s so linear, with so many examples, that it was easy to read! Also, I was vaguely annoyed at the term Outside View but didn’t know why, or whether I was right to be. This expansion of it into parts makes a lot of sense.
Yup yup—I was wondering if there was some weird less-known but persuasive reason it might be dangerous, so thought I’d do a double-check here. Cheers!
Definitely looks relevant! Thanks!