A Proper Scoring Rule for Confidence Intervals
You probably already know that you can incentivize honest reporting of probabilities using a proper scoring rule like the log score, but did you know that you can also incentivize honest reporting of confidence intervals?
To incentivize reporting of a 90% confidence interval, take the score −S − 20⋅D, where S is the size of your confidence interval and D is the distance between the true value and the interval. D is 0 whenever the true value is in the interval.
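For concreteness, here's a minimal Python sketch of the rule (the function name and signature are mine, just for illustration):

```python
def interval_score(lower, upper, true_value):
    """Score a reported 90% confidence interval (lower, upper).
    Less negative is better: S penalizes wide intervals,
    and D penalizes intervals that miss the true value."""
    S = upper - lower              # size of the interval
    if true_value > upper:         # underestimate: truth lies above the interval
        D = true_value - upper
    elif true_value < lower:       # overestimate: truth lies below the interval
        D = lower - true_value
    else:                          # truth falls inside the interval
        D = 0.0
    return -S - 20 * D

interval_score(10, 100, 150)  # S = 90, D = 50, score = -1090
```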
This incentivizes not only giving an interval that contains the true value 90% of the time, but also distributing the remaining 10% of misses equally between overestimates and underestimates.
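A quick check of why that works (my derivation, not from the original): holding the lower bound fixed, the expected score as a function of the upper bound U is −(U − L) − 20⋅E[max(T − U, 0)] plus terms that don't involve U, so

\[ \frac{\partial}{\partial U}\,\mathbb{E}[\text{score}] = -1 + 20\,\Pr(T > U), \]

which is zero exactly when Pr(T > U) = 1/20 = 5%. The symmetric argument for L puts another 5% below the interval, so the honest report runs from the 5th to the 95th percentile of your distribution.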
To keep the lower bound of the interval important, I recommend measuring S and D in log space. So if the true value is T and the interval is (L, U), then S is log(U/L), and D is log(T/U) for underestimates and log(L/T) for overestimates. Of course, you need questions with positive answers to do this.
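Here is the same sketch in log space (again the names are mine; note the positivity requirement):

```python
import math

def log_interval_score(lower, upper, true_value):
    """Log-space variant of the 90% interval score.
    Measuring S and D as log-ratios makes the score unit-free,
    so a lower bound near zero is no longer nearly free.
    Assumes 0 < lower <= upper and true_value > 0."""
    S = math.log(upper / lower)           # S = log(U/L)
    if true_value > upper:                # underestimate
        D = math.log(true_value / upper)  # D = log(T/U)
    elif true_value < lower:              # overestimate
        D = math.log(lower / true_value)  # D = log(L/T)
    else:
        D = 0.0
    return -S - 20 * D
```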
To do a P% confidence interval, take the score −S − (200/(100−P))⋅D.
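The general rule drops straight into the same code (a hypothetical helper, following the formula above):

```python
def general_interval_score(lower, upper, true_value, p=90):
    """Score for a reported p% confidence interval: -S - (200 / (100 - p)) * D.
    With p = 90 the coefficient is 200 / 10 = 20, recovering the rule above."""
    S = upper - lower
    # Distance from the true value to the interval; 0 if it falls inside.
    D = max(lower - true_value, true_value - upper, 0.0)
    return -S - (200 / (100 - p)) * D
```

The coefficient 200/(100−P) is what splits the misses evenly: the same derivative argument as before gives Pr(T > U) = (100−P)/200, i.e. half of the remaining (100−P)% in each tail.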
This can be used to make calibration training, using something like Wits and Wagers cards, more fun. I also think it could be turned into an app, if one could get a large list of questions with numerical values.