Yvain, I rechecked the calibration survey results, and encourage someone to recheck my recheck further:
First, these strata overlap… is 5 in 0-5 or 5-15? The N doesn’t match either interpretation when I recheck.
Secondly, I am not sure what program you used to calculate the statistics, but when I checked in Excel, some people answered with percentages that got pulled in as numbers less than one. I tried to clean that up for these figures. (I also removed someone who answered 150.)
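For what it’s worth, the cleaning step described above could be sketched like this (a rough sketch only — the actual survey columns and my exact cleaning choices aren’t reproduced here):

```python
# Sketch of the cleaning described above: answers entered as fractions
# (e.g. 0.85 instead of 85) are rescaled to percentages, and impossible
# values outside 0-100 (like the 150 answer) are dropped.

def clean_confidences(raw):
    """Rescale fraction-style answers and drop out-of-range values."""
    cleaned = []
    for v in raw:
        if 0 < v < 1:        # treated as a fraction, e.g. 0.85 -> 85
            v *= 100
        if 0 <= v <= 100:    # drop answers like 150
            cleaned.append(v)
    return cleaned

print(clean_confidences([0.85, 85, 150, 42]))  # -> [85.0, 85, 42]
```

Note the ambiguity this can’t resolve: someone who genuinely meant “0.85% confident” would be silently rescaled to 85%, so a value-by-value eyeball is still worthwhile.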
Thirdly, there are 20 people in this N. You can be 60% correct (12 correct) or 65% correct (13 correct), but the 60.2% correct in this line seems weird:
85-95: 60.2% [n = 20]
Here was my attempt at recalculating those figures: N after data cleaning was 998.
0-<5: 9.1% [n = 2⁄22]
5-<15: 13.7% [n = 25⁄183]
15-<25: 9.3% [n = 21⁄226]
25-<35: 10.0% [n = 20⁄200]
35-<45: 11.1% [n = 10⁄90]
45-<55: 17.3% [n = 19⁄110]
55-<65: 20.8% [n = 11⁄53]
65-<75: 22.6% [n = 7⁄31]
75-<85: 36.7% [n = 11⁄30]
85-<95: 63.2% [n = 12⁄19]
95-100: 88.2% [n = 30⁄34]
I express low confidence in these remarks because I haven’t rechecked this or gone into detail about data cleaning, but my brief take is:
1: Yes, there were some errors that made it look a bit worse than it was.
2: It still shows overconfidence. (Edit: see possible caveat below)
Question: Do we have enough data to determine whether that hump near 10% confidence is significant?
Edit: I’m not a statistician, but I do notice that substantially more people answered in the lower confidence ranges. Yes, on average, the people who answered in the high 55-<85 ranges were quite far off, but more people answered in the 15-<25 range than in all three of those groups put together.
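One rough way to start on the significance question: an exact one-sided binomial test of the 5-<15 stratum (25 correct out of 183, from the table above) against a 10% midpoint. The 10% midpoint is my assumption, and this single-bin test is only a sanity check, not a proper analysis of the whole curve:

```python
# Rough sanity check: is 25/183 correct surprisingly high if the true
# accuracy in the 5-<15 stratum were 10%? Exact binomial upper tail.

from math import comb

def binom_upper_p(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_value = binom_upper_p(25, 183, 0.10)
print(round(p_value, 3))
```

A proper answer would want the raw per-answer data rather than bin summaries, and should account for testing many bins at once.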
I think the calibration data needs additional cleaning. Eyeballing it, I see % signs, decimals, and English comments in the responses.