You know what really helps me accept a counterintuitive conclusion? Doing the math. I spent an hour reading and rereading this post and the arguments without being fully convinced of Eliezer’s position, and then I spent 15 minutes doing the math (R code attached at the end). And once the math came out in favor of Eliezer, the conclusion suddenly doesn’t seem so counterintuitive :)
Here we go. I’m dividing all the numbers by five to make the code work, but it’s pretty convincing either way.
The setup—Researcher A always does exactly 20 trials; researcher B keeps doing trials until the ratio of cures reaches at least 70% (stopping at 1 cure / 1 trial is also acceptable).
E—The full evidence, namely that 20 patients were tried and 14 were cured.
H0—The hypothesis that the success rate of the cure is 60%.
H1—The hypothesis that the success rate is 70%.
Pa—Researcher A’s probabilities.
Pb—Researcher B’s probabilities.
In this setup, it’s clear that Pa and Pb aren’t equal for everything you want to measure. For example, for any evidence E that doesn’t contain exactly 20 observations, Pa(E) = 0. However, Reverend Bayes reminds us that the strength of our EVIDENCE depends on the odds ratio, and not on the individual probabilities:
P(H1|E) / P(H0|E) = [P(H1) / P(H0)] × [P(E|H1) / P(E|H0)], i.e., posterior odds = prior odds × odds ratio of the evidence. Assuming that the prior odds are the same, let’s calculate the odds ratio for both Pa and Pb and see if they are different.
Pa(E|H0) = 12.4%, as a simple binomial distribution: dbinom(14,20,0.6). Pa(E|H1) = 19.1%.
The odds ratio: Pa(E|H1)/Pa(E|H0) = 1.54. That’s the only measure of how much our posterior should change. If originally we gave each hypothesis an equal chance (1:1), we now favor H1 at a ratio of 1.54:1. In terms of probability, we changed our credence in H1 from 50% to 60.6%.
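As a sanity check, the exact numbers above can be reproduced in a few lines of R (a minimal sketch; the variable names are mine):

```r
pa_h0 <- dbinom(14, 20, 0.6)  # P(E | H0) for researcher A, ~0.124
pa_h1 <- dbinom(14, 20, 0.7)  # P(E | H1) for researcher A, ~0.191
lr    <- pa_h1 / pa_h0        # likelihood ratio, ~1.54
post  <- lr / (1 + lr)        # posterior P(H1 | E) from 1:1 prior odds, ~0.606
```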
What about researcher B? I simulated researcher B a million times in each possible world, the H0 world and the H1 world. In the H0 world, evidence E occurred only 5974 times out of a million, for Pb(E|H0) = 0.597%, which is very far from 12.4%. It makes sense: researcher B usually stops after the first trial, and occasionally goes on for zillions! What about the H1 world? Pb(E|H1) = 0.919%. The odds ratio: Pb(E|H1) / Pb(E|H0) = wait for it = 1.537. The same, up to simulation noise!
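Here is the gist of why the match is exact rather than a simulation coincidence: B’s stopping rule only determines which cure/no-cure sequences count as E. That path count multiplies Pb(E|H1) and Pb(E|H0) by the same factor, so it cancels in the ratio, exactly as the binomial coefficient cancels for researcher A:

```r
# Per-sequence likelihood ratio for any single sequence with 14 cures in 20 trials.
# The path count (however the stopping rule defines it) cancels, so this equals
# the binomial ratio dbinom(14, 20, 0.7) / dbinom(14, 20, 0.6), ~1.54.
(0.7^14 * 0.3^6) / (0.6^14 * 0.4^6)
```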
I think all the other posts explain quite well why this was obviously the case, but if you like to see the numbers back up one side of an argument, you got ’em. I personally am now converted, amen.
R code for simulating a single researcher B:
resb <- function(p = 0.6) {
  cures <- 0
  tries <- 0
  while (tries < 21) { # Since we only care whether B stops at exactly 20 trials, we don’t need to simulate past 21.
    tries <- tries + 1
    if (runif(1) < p) cures <- cures + 1 # one more patient, cured with probability p
    if (cures / tries >= 0.7) break # B stops once the cure ratio reaches 70%
  }
  tries
}
R code for simulating a million researchers B in H1 world:
x <- sapply(1:1000000, function(i) resb(0.7))
length(x[x == 20]) # number of simulated researchers B whose evidence matched E
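For a self-contained rerun, here is a sketch that packages the above and compares both worlds in one go (I restate resb, use a smaller n = 1e5 so it finishes quickly, and set an arbitrary seed for reproducibility):

```r
resb <- function(p = 0.6) {
  cures <- 0
  tries <- 0
  while (tries < 21) {                  # returning 21 means "didn't stop by trial 20"
    tries <- tries + 1
    if (runif(1) < p) cures <- cures + 1
    if (cures / tries >= 0.7) break     # B stops once the cure ratio reaches 70%
  }
  tries
}
set.seed(1)                             # arbitrary seed, for reproducibility
n  <- 1e5                               # smaller than a million, for speed
h0 <- sum(sapply(1:n, function(i) resb(0.6)) == 20)  # E occurrences in the H0 world
h1 <- sum(sapply(1:n, function(i) resb(0.7)) == 20)  # E occurrences in the H1 world
c(h0 / n, h1 / n, h1 / h0)              # Pb(E|H0), Pb(E|H1), and an odds ratio near 1.54
```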