I second conchis’s request. Shouldn’t the second method cut against the assumption of a randomized sample?
I’m also thinking of an analogy to the problem of only reporting studies that demonstrate the effectiveness of a drug, even if each of those studies on its own is fair. It seems to me that stopping when and only when one gets the results one wants is similarly problematic, again even if everything else about the experiment is strictly OK: outcomes that show 60%+ effectiveness are favored under that method, so P(real effectiveness != 60% | experimental effectiveness = 60%) should be increased. Now, I understand why the effect here would be much smaller than in the case of simply leaving out inconvenient data, but I don’t understand why I should think that it is exactly zero.
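To make the worry concrete, here is a minimal simulation sketch of what I mean by "favored". Everything in it is my own assumption rather than anything from the original post: a true cure rate of 50%, a cap of 100 patients, a minimum of 10 patients before the researcher may stop, and the hypothetical helper names `run_fixed` and `run_until_target`. It only compares the average reported cure rate under the two stopping rules. It does not by itself say whether the inference from any one particular data set should change.

```python
import random


def run_fixed(p, n, rng):
    """Treat exactly n patients and report the observed cure rate."""
    cures = sum(rng.random() < p for _ in range(n))
    return cures / n


def run_until_target(p, target, min_n, max_n, rng):
    """Stop as soon as the observed cure rate reaches `target`
    (checked only once min_n patients are in), else stop at max_n."""
    cures = 0
    for n in range(1, max_n + 1):
        cures += rng.random() < p
        if n >= min_n and cures / n >= target:
            return cures / n
    return cures / max_n


def mean_reported_rate(trial, runs, seed=0):
    """Average the reported rate over many simulated experiments."""
    rng = random.Random(seed)
    return sum(trial(rng) for _ in range(runs)) / runs


TRUE_P = 0.5  # assumed true effectiveness, chosen only for illustration

fixed = mean_reported_rate(lambda rng: run_fixed(TRUE_P, 100, rng), 5000)
stopped = mean_reported_rate(
    lambda rng: run_until_target(TRUE_P, 0.6, 10, 100, rng), 5000
)

print(f"fixed-N mean reported rate:       {fixed:.3f}")
print(f"stop-at-60% mean reported rate:   {stopped:.3f}")
```

Under these assumptions the fixed-size experiments should average out close to the true rate, while the stop-at-60% experiments should report a higher rate on average. That is the sense in which I think 60%+ outcomes are favored.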