For a quick reminder of the power of many independent trials, estimate and then answer the following question:
I have 2 biased coins in my pocket. The first comes up heads with probability 51%, while the second comes up heads with probability 49%. I take a coin out of my pocket, uniformly at random, and flip it a million times. I observe that it comes up heads 508,634 times. What is the probability that it is the first coin?
Under the usual convention of 95% significance, it’s neither :-D
> binom.test(508634, 1000000, 0.51)

        Exact binomial test

data:  508634 and 1e+06
number of successes = 508634, number of trials = 1e+06, p-value = 0.006304
alternative hypothesis: true probability of success is not equal to 0.51
95 percent confidence interval:
 0.5077 0.5096
sample estimates:
probability of success
                0.5086
Very, very close to 1, if the trials are truly independent. But (as Jaynes mentions in PT:TLoS, Probability Theory: The Logic of Science) there are ways of flipping a coin that systematically favour one side over the other, and you might be unwittingly doing something like that. In other words, inside the argument the probability that you took the second coin is negligible, but outside the argument it isn't.
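A minimal sketch of the "inside the argument" computation, assuming independent flips and the 50/50 prior from the problem statement (the Python here is illustrative, not part of the original exchange):

```python
from math import log10

heads, flips = 508_634, 1_000_000
tails = flips - heads  # 491,366

# Log-10 likelihood ratio for coin 1 (p = 0.51) over coin 2 (p = 0.49).
# Each head multiplies the odds by 0.51/0.49 and each tail by 0.49/0.51,
# so only the excess of heads over tails matters.
log_odds = (heads - tails) * log10(0.51 / 0.49)

print(f"posterior odds for coin 1: about 10^{log_odds:.0f}")
```

The odds in favour of the first coin come out around 10^300, so under the stated model the probability that it is the second coin is negligible.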
Gut: vg’f tbvat gb or fhssvpvragyl pybfr gb bar gung gur cebonovyvgl vf whfg tbvat gb ybbx yvxr n ohapu bs avarf gb zr.
Fermi estimate: VVEP bar qrpvory vf nobhg svsgl fvk creprag, naq crepragf punatr snfgre jura qrpvoryf ner fznyy, fb svsgl bar creprag vf yrff guna unys n qrpvory. Fb yrg’f fnl gung svsgl bar creprag vf nobhg mreb cbvag bar qo. Fb rnpu urnq cebivqrf abhtug cbvag bar qrpvoryf sbe pbva bar, naq rnpu gnvy cebivqrf artngvir gung sbe gur fnzr. Lbh unir nobhg gjragl gubhfnaq zber urnqf guna gnvyf, fb nobhg gjb gubhfnaq qo rivqrapr sbe pbva bar. Rirel gra qrpvoryf pbeerfcbaqf gb vapernfvat gur bqqf ol n snpgbe bs gra, fb cebonovyvgl vf gura fbzrguvat yvxr bar zvahf gra gb gur zvahf gjb uhaqerq.
Calculation: svsgl bar creprag vf abhtug cbvag bar frira qrpvoryf. Gurer ner gjb gvzrf rvtug gubhfnaq, fvk uhaqerq naq guvegl sbhe zber urnqf guna gnvyf. Gung znxrf nyzbfg rknpgyl guerr gubhfnaq qrpvoryf sbe urnqf. Fb gur bqqf ner gra gb gur guerr uhaqerq, tvivat cebonovyvgl nyzbfg rknpgyl bar zvahf gra gb gur zvahf guerr uhaqerq.
Compared to Manfred’s answer: Jr qvssre ol n snpgbe bs gra gb gur gjb uhaqerq naq guerr. V qba’g pheeragyl unir gur gvzr gb jbex bhg jul, naq jura V qb unir gvzr V znl abg pner rabhtu.
Edit: I think I did it wrong. No time to correct it currently, but the true answer should be higher than mine.
Edit 2: Maybe not? I thought I needed to use both P(H|C1)/P(H|C2) and P(H|C1)/P(T|C1), which are confusingly identical. But when I actually put it on paper, it looks correct.
I have us being different by a factor of 10^40, but yeah, that’s a bit surprising. Maybe we’re far enough out in the tails that the normal approximation is breaking down?
I don’t offhand have a model for why we expect your method to work, so I don’t know why it fails. But another approach using the normal approximation gets within a factor of 10, so that shouldn’t be it.
Um, I think you’re just counting standard deviations in the wrong direction? You’re counting standard deviations from 500,000 and doubling them, but the relevant distribution means are 510,000 and 490,000.
But no, those should be equivalent.
Oh! You’re squaring a sum, not summing a square. You’re counting the correct number of standard deviations in total, but you need the correct number for each distribution.
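The difference between the two bookkeepings can be sketched under the normal approximation, with standard deviation sqrt(n·p·(1−p)) ≈ 500 for either coin (variable names here are illustrative):

```python
from math import sqrt, log

n, x = 1_000_000, 508_634
sd = sqrt(n * 0.51 * 0.49)   # about 500 for either coin
LOG10E = 1 / log(10)         # convert nats to decimal orders of magnitude

z1 = (510_000 - x) / sd      # about 2.73 sds below coin 1's mean
z2 = (x - 490_000) / sd      # about 37.28 sds above coin 2's mean

# Summing the squares: each distribution contributes its own squared
# z-score to the Gaussian log-density, so the log odds use z2^2 - z1^2.
correct = (z2**2 - z1**2) / 2 * LOG10E   # about 300

# Squaring the sum: counting sds from the 500,000 midpoint and doubling
# gives z2 - z1 in total, and squaring that total is the bug.
buggy = (z2 - z1)**2 / 2 * LOG10E        # about 259

print(round(correct), round(buggy), round(correct - buggy))
```

The gap between the two, about 41 orders of magnitude, is the size of the discrepancy discussed above.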
Do you take into account the possibility that you miscounted, or are hallucinating, or any of the other events that are far more likely explanations than that it comes up heads with probability 49% and it came up heads that often just by chance?
When I know I’m to be visited by one of my parents and I see someone who looks like my mother, should my first thought be “that person looks so unlike my father that maybe it is him and I’m having a stroke”? Should I damage my eyes to the point where this phenomenon doesn’t occur to spare myself the confusion?
If you're being asked to estimate the probability that you're being visited by your father, then yes, you probably should be considering the possibility that you are seeing him but are having a stroke.
This comment rubbed me the wrong way and I couldn’t figure out why at first, which is why I went for a pithy response.
I think what's going on is that I was reacting to the pragmatics of your exchange with Coscott. Coscott informally specified a model and then asked what we could conclude about a parameter of interest (which coin was chosen) given a sufficient statistic of all the coin toss data (the number of heads observed).
This is implicitly a statement that model checking isn't important in solving the problem, because everything that could be used for model checking is left out of the statistic communicated: statistics on runs to verify independence, the number of tails observed to check against a kind of miscounting where the tosses don't add up to 1,000,000, mental status inventories to detect hallucination, and so on.
Maybe Coscott (the fictional version who flipped all those coins) did model checking or maybe not, but if it was done and the data suggested miscounting or hallucination, then Coscott wouldn’t have stated the problem like this.
So, yeah, the points you raise are valid object-level ones, but bringing them up this way in a problem poser / problem solver context was really unexpected and seemed to violate the norms for this sort of exchange.
I suppose my point was that assuming a normal distribution can give you far more extreme probabilities than could ever realistically be justified. It would probably have been better if I had just said it like that.
Oh, I misbracketed your formula. Yes, 10^40.
Dammit LW, stop nerd sniping me.
Oh yeah, whoops.
And also, muahaha, complete success.
Shortcut: Gurfr ner onfvpnyyl abezny qvfgevohgvbaf jvgu fgnaqneq qrivngvba fdeg(a)/2 = svir uhaqerq (gur snpgbe bs 1⁄2 pbzrf sebz gur c(1-c) grez va gur inevnapr). Gur qvssrerapr orgjrra gur gjb vavgvnyyl-rdhny cbffvovyvgvrf vf gura nobhg 34 fgnaqneq qrivngvbaf. Jung’f bar zvahf r gb gur zvahf 34 fdhnerq bire 2?