A Proper Scoring Rule for Confidence Intervals

You probably already know that you can incentivize honest reporting of probabilities using a proper scoring rule like log score, but did you know that you can also incentivize honest reporting of confidence intervals?

To incentivize reporting of a 90% confidence interval, take the score $-S - 20 \cdot D$, where $S$ is the size of your confidence interval, and $D$ is the distance between the true value and the interval. $D$ is $0$ whenever the true value is in the interval.

This not only incentivizes giving an interval that contains the true value 90% of the time, but also splits the remaining 10% equally between overestimates and underestimates.
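
A short derivation sketch shows why, assuming your beliefs about the true value $T$ have a density $f$ with cumulative distribution $F$. The symbols here are illustrative choices: $(L, U)$ is the reported interval and $c$ is the multiplier on the distance term ($c = 20$ for a 90% interval). Writing out the expected score and setting its derivatives to zero gives:

```latex
\begin{align*}
\mathbb{E}[\text{score}] &= -(U - L) - c\int_{-\infty}^{L} (L - t)\, f(t)\, dt - c\int_{U}^{\infty} (t - U)\, f(t)\, dt \\
\frac{\partial}{\partial L}\mathbb{E}[\text{score}] &= 1 - c\, F(L) = 0 \;\Longrightarrow\; \Pr(T < L) = \tfrac{1}{c} \\
\frac{\partial}{\partial U}\mathbb{E}[\text{score}] &= -1 + c\,\bigl(1 - F(U)\bigr) = 0 \;\Longrightarrow\; \Pr(T > U) = \tfrac{1}{c}
\end{align*}
```

With $c = 20$ each tail gets probability 5%, so the best interval to report runs from your 5th to your 95th percentile.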

To keep the lower bound of the interval important, I recommend measuring $S$ and $D$ in log space. So if the true value is $T$ and the interval is $(L, U)$, then $S$ is $\log(U/L)$ and $D$ is $\log(L/T)$ for underestimates (true value below the interval, $T < L$) and $\log(T/U)$ for overestimates (true value above the interval, $T > U$). Of course, you need questions with positive answers to do this.
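
As a rough illustration, here is a minimal Python sketch of the log-space score for a 90% interval. It assumes the score is minus the log-width of the interval minus 20 times the log-distance from the interval to the true value; the function name `log_interval_score` and the `penalty` parameter are made up for this example.

```python
import math

def log_interval_score(true_value, lower, upper, penalty=20.0):
    """Score an interval in log space; scores are <= 0 and higher is better.

    penalty=20 is the assumed multiplier for a 90% interval.
    """
    if true_value <= 0 or not (0 < lower <= upper):
        raise ValueError("log-space scoring needs positive values with lower <= upper")
    size = math.log(upper / lower)
    if true_value < lower:
        distance = math.log(lower / true_value)   # true value below the interval
    elif true_value > upper:
        distance = math.log(true_value / upper)   # true value above the interval
    else:
        distance = 0.0                            # true value inside the interval
    return -size - penalty * distance

# Inside the interval: only the width costs points.
print(log_interval_score(50, 10, 100))
# Missing by a factor of 2 costs the same on either side.
print(log_interval_score(200, 10, 100))
print(log_interval_score(5, 10, 100))
```

Because everything is a ratio, an interval of (10, 100) is exactly as wide as (1000, 10000), which is what keeps the lower bound from being an afterthought on questions that span orders of magnitude.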

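To do a $P$ confidence interval, take the score $-S - \frac{2}{1-P} \cdot D$, with $P$ written as a fraction, so that $P = 0.9$ gives a multiplier of $20$.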
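
One way to sanity-check the general rule is a brute-force search. The sketch below assumes the score is minus the interval size minus $2/(1-P)$ times the distance, measured in ordinary (not log) space, and looks for the interval with the best average score over a fixed set of equally likely outcomes. The names `interval_score` and `best_interval` are illustrative, not from any library.

```python
def interval_score(true_value, lower, upper, confidence):
    """Linear-space score for a `confidence` interval (e.g. 0.9); higher is better."""
    penalty = 2.0 / (1.0 - confidence)
    size = upper - lower
    # Distance from the true value to the interval, 0 if it lies inside.
    distance = max(lower - true_value, 0.0, true_value - upper)
    return -size - penalty * distance

def best_interval(samples, candidates, confidence):
    """Grid search for the (lower, upper) pair with the highest average score."""
    best = None
    for i, lower in enumerate(candidates):
        for upper in candidates[i + 1:]:
            avg = sum(interval_score(t, lower, upper, confidence)
                      for t in samples) / len(samples)
            if best is None or avg > best[0]:
                best = (avg, lower, upper)
    return best[1], best[2]

# 100 equally likely outcomes spread evenly over (0, 100).
samples = [i + 0.5 for i in range(100)]
candidates = list(range(101))
print(best_interval(samples, candidates, 0.8))  # (10, 90)
```

For an 80% interval the search lands on (10, 90), which leaves 10 of the 100 outcomes below the interval and 10 above it, matching the claim that the leftover probability is split evenly between the two tails.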

This can be used to make calibration training, using something like Wits and Wagers cards, more fun. I also think it could be turned into an app, if one could get a large list of questions with numerical values.