I’ll take fifth place in Round 1 (#44), given how little I thought of my execution XD. Debiasing algorithms work. I’m not convinced there’s a real, detectable difference between the top Round 1 participants anyway; we all beat most random strings, and none of the top players thought it was more likely than not that any of them were human.
My Round 2 performance was thoroughly middle of the pack, with a disappointing negative score. I didn’t spend much effort on it and certainly didn’t attempt calibration, so it’s no huge surprise I didn’t win, but I still hoped for a positive score. What surprises me most is that four of the entries I rated 0% turned out to be real (#8, #61, #121, #122). I expected one, maybe two (yes, yes, I already said ‘I didn’t attempt calibration’) to be wrong, but four seems excessive. I can’t entirely blame calibration for the mediocre performance, since my classification rate (60.5%) was also middle of the road, but I think I underestimated how much bang for the buck I would have gotten from calibrating, rather than from working on the details.
Perhaps interestingly, someone who bet the mean % for every option (excluding self-guesses), with no weighting, would have scored 19.5 (tying for fourth place), or 19.8 post-squeeze, with a 64.5% classification rate. Even excluding everyone who scored 10 or more from that average, it would still have scored 14.5, or 15.8 post-squeeze, with a (mere) 59.7% classification rate. So averaging even a bunch of mediocre opinions seems to get you pretty decent, mostly-well-calibrated results.
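For concreteness, here’s a minimal sketch of what “bet the mean % for every option” could look like, assuming each participant submitted a per-entry probability of being human. The data, the 50% classification threshold, and the squeeze formula are all illustrative assumptions on my part, not the contest’s actual definitions:

```python
import numpy as np

# probs[i][j]: participant i's bet (probability) that entry j is human.
# truth[j]: 1 if entry j was real. All values here are made up.
probs = np.array([
    [0.10, 0.80, 0.55, 0.00],   # participant A
    [0.30, 0.60, 0.40, 0.20],   # participant B
    [0.05, 0.90, 0.70, 0.10],   # participant C
])
truth = np.array([0, 1, 1, 1])

# "Bet the mean % for every option": unweighted average across participants.
mean_bet = probs.mean(axis=0)

# One plausible reading of "post-squeeze": pull every bet slightly toward
# 50%, so nothing sits at exactly 0% or 100%.
def squeeze(p, eps=0.05):
    return eps / 2 + (1 - eps) * p

squeezed_bet = squeeze(mean_bet)

# Classification rate: fraction of entries where thresholding the bet at
# 50% matches the ground truth.
classification_rate = ((mean_bet > 0.5).astype(int) == truth).mean()
print(mean_bet, squeezed_bet, classification_rate)
```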
Alternatively, someone who bet the weighted average from the column in the sheet, a strategy that is of course impossible to implement without cheating, would have scored 27.7, or 28.0 post-squeeze, with a 74.2% classification rate. So even that form of cheating wouldn’t beat the Scy & William duo.
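The weighted-average variant changes only the aggregation step. A sketch using the same made-up bets and purely hypothetical per-participant weights (the sheet’s actual weighting column isn’t reproduced here):

```python
import numpy as np

# Same hypothetical bets as above; weights are invented per-participant
# values summing to 1, standing in for the sheet's weighting column.
probs = np.array([
    [0.10, 0.80, 0.55, 0.00],
    [0.30, 0.60, 0.40, 0.20],
    [0.05, 0.90, 0.70, 0.10],
])
weights = np.array([0.5, 0.2, 0.3])

weighted_bet = weights @ probs   # weighted average probability per entry
print(weighted_bet)
```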