SilasBarta comments on Case study: abuse of frequentist statistics

SilasBarta 22 Feb 2010 0:30 UTC
4 points
Okay, I think that makes sense. Let me put it into my own words:

The test is guaranteed not to be statistically significant merely by virtue of cutting up the outcome space into pieces, each of which has at least 5% chance of happening. And further, because the null hypothesis has been (arbitrarily) defined to be “the two methods are the same”, statistical insignificance counts as a favorable result.

Does that about cover it? If so, that’s pretty bad.
- Cyan 22 Feb 2010 1:46 UTC
  0 points
  
  “each of which has at least 5% chance of happening”
  
  That part isn’t right, but the rest is.
  - SilasBarta 22 Feb 2010 1:55 UTC
    0 points
    So I should have said “the nine outcomes they considered each had at least a 5% chance of happening”?
    - [deleted] 22 Feb 2010 2:28 UTC
      1 point
      The p-value is the probability of getting a result “at least this extreme” given the null hypothesis, where “extreme” means “deviating from the null hypothesis”, however that’s defined. So, the test cut the outcome space into pieces, the most extreme of which had at least a 5% chance of happening.
      
      I think.
      - Cyan 22 Feb 2010 2:33 UTC
        3 points
        
        “the most extreme of which had at least a 5% chance of happening”
        
        … under the null hypothesis. I actually forgot this detail when replying to komponisto.
- Psy-Kosh 22 Feb 2010 1:01 UTC
  0 points
  Wait… actually it may even be worse than that. I’m not even sure it’s cleanly partitioning the outcome space. ¹⁄₂₀ = .05, so if some outcomes are above .05, then other outcomes would have to be below .05, right?
  
  So the calculation that gets the final result isn’t really doing a proper partitioning of the outcomes, if some of the outcomes can be greater than .05 and none less than .05.
  
  EDIT: so yeah, it’s not just cutting up the outcome space into pieces corresponding to rankings, but mushing some of those pieces together (at best).
- Psy-Kosh 22 Feb 2010 0:38 UTC
  0 points
  That’s more or less my understanding of the situation.
  
  And yes… that is indeed pretty bad. :)
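
For anyone who wants to check the two arithmetic points in this thread, here is a minimal sketch. It is not the study's actual test: the null here is an assumed fair coin, and the function names and the example partition are made up for illustration. It computes the p-value as the probability, under the null, of a result at least as extreme as the observed one, and shows that when the most extreme outcome already has probability above 5% under the null, no result can ever come out significant. The last few lines check the ¹⁄₂₀ point: pieces that sum to 1 cannot all sit at or above .05 with one strictly above it.

```python
from fractions import Fraction
from math import comb


def null_pmf(n):
    """Exact distribution of heads in n fair coin flips (the assumed null)."""
    return [Fraction(comb(n, k), 2**n) for k in range(n + 1)]


def p_value(observed, n):
    """Probability, under the null, of a count at least as far from n/2 as `observed`."""
    center = Fraction(n, 2)
    deviation = abs(observed - center)
    return sum(p for k, p in enumerate(null_pmf(n)) if abs(k - center) >= deviation)


def smallest_p_value(n):
    """p-value of the most extreme outcome: a floor that no result can go below."""
    return min(p_value(k, n) for k in range(n + 1))


# With 4 flips, the most extreme outcomes (0 or 4 heads) together have
# probability 2/16 under the null, so p < 0.05 is unreachable.
floor_4 = smallest_p_value(4)

# With 10 flips, the most extreme outcomes have probability 2/1024,
# so a significant result is at least possible.
floor_10 = smallest_p_value(10)

# Hypothetical 20-piece partition: the pieces sum to 1, so they average
# exactly 1/20, and one piece above 0.05 forces another below it.
pieces = [Fraction(1, 20)] * 18 + [Fraction(3, 50), Fraction(1, 25)]
```

Exact fractions are used instead of floats so the comparison against .05 is not blurred by rounding.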