So I was trying to adjust for longer-term questions being easier by doing the following:
For each question, calculate the average Brier score for available predictions
For each prediction, calculate the accuracy score as its Brier score minus the average Brier score of its question.
Correlate accuracy score with range.
While doing that, I thought I might as well also run the correlation between accuracy score and log range. But some of the ranges are negative, which shouldn't be the case, and the log is undefined for those.
Anyway, once I adjust for question difficulty, the results are as you would expect: accuracy is worse the further removed the forecast is from the resolution.
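For concreteness, here is a minimal sketch of the adjustment described above. This is not the actual analysis code. The tuple layout `(question_id, probability, outcome, range_days)`, the function names, and the toy data are all hypothetical. It also assumes binary questions with the Brier score taken as `(probability - outcome) ** 2`, and it simply drops predictions with a non-positive range before computing anything, which is one possible way to handle the negative ranges, not necessarily the right one.

```python
from collections import defaultdict
from math import log, sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def adjusted_scores(predictions):
    """Return (range, accuracy score) pairs, where the accuracy score is the
    prediction's Brier score minus the mean Brier score of its question."""
    # Assumption: predictions with a non-positive range are data errors; drop them.
    valid = [p for p in predictions if p[3] > 0]

    # Step 1: average Brier score per question.
    briers = defaultdict(list)
    for question_id, prob, outcome, _ in valid:
        briers[question_id].append((prob - outcome) ** 2)
    question_mean = {q: sum(b) / len(b) for q, b in briers.items()}

    # Step 2: per-prediction score relative to the question's average.
    return [
        (range_days, (prob - outcome) ** 2 - question_mean[question_id])
        for question_id, prob, outcome, range_days in valid
    ]


# Toy data (made up): (question_id, probability, outcome, range_days)
predictions = [
    ("a", 0.9, 1, 1),
    ("a", 0.6, 1, 30),
    ("b", 0.2, 0, 5),
    ("b", 0.5, 0, 100),
    ("b", 0.3, 0, -2),  # negative range, dropped
]

rows = adjusted_scores(predictions)
ranges = [r for r, _ in rows]
scores = [s for _, s in rows]

# Step 3: correlate the adjusted score with range and with log range.
r_linear = pearson(ranges, scores)
r_log = pearson([log(r) for r in ranges], scores)
print(r_linear, r_log)
```

Since a higher Brier score means a worse forecast, a positive correlation here corresponds to accuracy getting worse as the range grows. Subtracting the per-question mean makes the scores sum to zero within each question, so differences in difficulty between questions no longer drive the correlation.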