Calibration

Tag · Last edit: 24 Dec 2023 22:32 UTC by Saul Munn

Someone is well-calibrated if the things they predict with X% chance of happening in fact occur X% of the time. Importantly, calibration is not the same as accuracy: it is about correctly assessing how reliable your predictions are, not about making good predictions. Person A, whose predictions are only marginally better than chance (60% of them come true when choosing between two options) and who is precisely 60% confident in their choices, is perfectly calibrated. In contrast, Person B, who is 99% confident in their predictions and right 90% of the time, is more accurate than Person A but less well-calibrated.
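As a rough illustration of the Person A / Person B comparison, the sketch below groups predictions by stated confidence and compares each group with how often its predictions came true. The function name `calibration_table` and the 100-prediction samples are assumptions made up for this example; only the 60%, 99% and 90% figures come from the text above.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Map each stated probability to the observed frequency of success.

    `predictions` is an iterable of (stated_probability, came_true) pairs.
    A perfectly calibrated forecaster has observed == stated in every row.
    """
    buckets = defaultdict(list)
    for stated, came_true in predictions:
        buckets[stated].append(came_true)
    return {
        stated: sum(results) / len(results)
        for stated, results in sorted(buckets.items())
    }


# Hypothetical samples of 100 predictions each, built to match the text:
# Person A says 60% and is right 60 times; Person B says 99% and is right 90 times.
person_a = [(0.60, True)] * 60 + [(0.60, False)] * 40
person_b = [(0.99, True)] * 90 + [(0.99, False)] * 10

for name, preds in (("A", person_a), ("B", person_b)):
    for stated, observed in calibration_table(preds).items():
        gap = abs(stated - observed)
        print(f"Person {name}: stated {stated:.2f}, observed {observed:.2f}, gap {gap:.2f}")
```

Person B's observed hit rate is higher (more accurate), but the gap between stated and observed frequency is larger (less well-calibrated), which is the distinction the paragraph draws.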

See also: Betting, Epistemic Modesty, Forecasting & Prediction

Being well-calibrated has value for rationalists separately from accuracy. Among other things, it lets you make good bets and good decisions, lets you communicate information helpfully to others who know you to be well-calibrated (see Group Rationality), and helps you prioritize which information is worth acquiring.

Note that any expression of quantified confidence in a belief can be well or poorly calibrated. For example, calibration applies to whether a person’s 95% confidence intervals capture the true outcome 95% of the time.
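The confidence-interval case can be checked the same way: count how often the true value falls inside the stated interval. This is a minimal sketch; `interval_coverage` and the toy data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def interval_coverage(intervals, true_values):
    """Return the fraction of true values that fall inside their interval.

    `intervals` is a list of (low, high) pairs and `true_values` the
    corresponding outcomes. For well-calibrated 95% intervals this
    fraction should be close to 0.95 over many estimates.
    """
    if len(intervals) != len(true_values):
        raise ValueError("need exactly one true value per interval")
    hits = sum(
        low <= truth <= high
        for (low, high), truth in zip(intervals, true_values)
    )
    return hits / len(true_values)


# Made-up data: 20 stated 95% intervals, of which 19 contain the truth.
intervals = [(0.0, 10.0)] * 20
true_values = [5.0] * 19 + [11.0]

coverage = interval_coverage(intervals, true_values)
print(f"coverage: {coverage:.2f} (target for 95% intervals: 0.95)")
```

Coverage well below the stated level suggests intervals that are too narrow (overconfidence); coverage well above it suggests intervals that are too wide.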

List of Calibration Exercises

(From: https://www.lesswrong.com/posts/LdFbx9oqtKAAwtKF3/list-of-probability-calibration-exercises)

Exercises that are dead/unmaintained:

We Change Our Minds Less Often Than We Think

Eliezer Yudkowsky · 3 Oct 2007 18:14 UTC
99 points
119 comments · 1 min read · LW link

Illusion of Transparency: Why No One Understands You

Eliezer Yudkowsky · 20 Oct 2007 23:49 UTC
160 points
52 comments · 3 min read · LW link

Qualitatively Confused

Eliezer Yudkowsky · 14 Mar 2008 17:01 UTC
59 points
83 comments · 4 min read · LW link

Say It Loud

Eliezer Yudkowsky · 19 Sep 2008 17:34 UTC
61 points
20 comments · 2 min read · LW link

Horrible LHC Inconsistency

Eliezer Yudkowsky · 22 Sep 2008 3:12 UTC
34 points
33 comments · 1 min read · LW link

Lawful Uncertainty

Eliezer Yudkowsky · 10 Nov 2008 21:06 UTC
101 points
57 comments · 4 min read · LW link

The Sin of Underconfidence

Eliezer Yudkowsky · 20 Apr 2009 6:30 UTC
99 points
187 comments · 6 min read · LW link

Test Your Calibration!

alyssavance · 11 Nov 2009 22:03 UTC
25 points
33 comments · 2 min read · LW link

Calibration for continuous quantities

Cyan · 21 Nov 2009 4:53 UTC
30 points
13 comments · 3 min read · LW link

Advancing Certainty

komponisto · 18 Jan 2010 9:51 UTC
44 points
110 comments · 4 min read · LW link

Calibrate your self-assessments

Scott Alexander · 9 Oct 2011 23:26 UTC
101 points
122 comments · 6 min read · LW link

Kurzweil’s predictions: good accuracy, poor self-calibration

Stuart_Armstrong · 11 Jul 2012 9:55 UTC
50 points
39 comments · 9 min read · LW link

Raising the forecasting waterline (part 1)

Morendil · 9 Oct 2012 15:49 UTC
51 points
107 comments · 6 min read · LW link

Overconfident Pessimism

lukeprog · 24 Nov 2012 0:47 UTC
37 points
38 comments · 4 min read · LW link

Credence Calibration Icebreaker Game

Ruby · 14 Aug 2014 21:01 UTC
42 points
1 comment · 2 min read · LW link

Calibration Test with database of 150,000+ questions

Nanashi · 14 Mar 2015 11:22 UTC
54 points
31 comments · 1 min read · LW link

Simultaneous Overconfidence and Underconfidence

abramdemski · 3 Jun 2015 21:04 UTC
37 points
6 comments · 5 min read · LW link

Aumann Agreement Game

abramdemski · 9 Oct 2015 17:14 UTC
32 points
17 comments · 1 min read · LW link

Placing Yourself as an Instance of a Class

abramdemski · 3 Oct 2017 19:10 UTC
35 points
5 comments · 3 min read · LW link

RFC on an open problem: how to determine probabilities in the face of social distortion

ialdabaoth · 7 Oct 2017 22:04 UTC
6 points
3 comments · 2 min read · LW link

Hammertime Day 9: Time Calibration

alkjash · 7 Feb 2018 1:40 UTC
15 points
8 comments · 2 min read · LW link
(radimentary.wordpress.com)

A Motorcycle (and Calibration?) Accident

boggler · 18 Mar 2018 22:21 UTC
25 points
11 comments · 2 min read · LW link

Prediction Contest 2018

jbeshir · 30 Apr 2018 18:26 UTC
9 points
4 comments · 3 min read · LW link

[Question] What are good ML/AI related prediction / calibration questions for 2019?

james_t · 4 Jan 2019 2:40 UTC
19 points
4 comments · 2 min read · LW link

[Question] Is there a.. more exact.. way of scoring a predictor’s calibration?

mako yass · 16 Jan 2019 8:19 UTC
22 points
6 comments · 1 min read · LW link

Prediction Contest 2018: Scores and Retrospective

jbeshir · 27 Jan 2019 17:20 UTC
28 points
5 comments · 1 min read · LW link

Bayes-Up: An App for Sharing Bayesian-MCQ

Louis Faucon · 6 Feb 2020 19:01 UTC
53 points
9 comments · 1 min read · LW link

Suspiciously balanced evidence

gjm · 12 Feb 2020 17:04 UTC
50 points
24 comments · 4 min read · LW link

The Bayesian Tyrant

abramdemski · 20 Aug 2020 0:08 UTC
140 points
21 comments · 6 min read · LW link · 1 review

How to reach 80% of your goals. Exactly 80%.

Bart Bussmann · 10 Oct 2020 17:33 UTC
36 points
11 comments · 1 min read · LW link

Information Charts

Rafael Harth · 13 Nov 2020 16:12 UTC
29 points
6 comments · 13 min read · LW link

Prediction and Calibration - Part 1

Jan Christian Refsgaard · 8 May 2021 19:48 UTC
6 points
10 comments · 4 min read · LW link
(www.badprior.com)

Social Calibration

SimulatedCrow · 20 May 2021 23:22 UTC
3 points
4 comments · 4 min read · LW link

Behavior Cloning is Miscalibrated

leogao · 5 Dec 2021 1:36 UTC
77 points
3 comments · 3 min read · LW link

How the Equivalent Bet Test Actually Works

Erich_Grunewald · 18 Dec 2021 11:17 UTC
4 points
1 comment · 4 min read · LW link
(www.erichgrunewald.com)

Use Normal Predictions

Jan Christian Refsgaard · 9 Jan 2022 15:01 UTC
145 points
67 comments · 6 min read · LW link

List of Probability Calibration Exercises

Isaac King · 23 Jan 2022 2:12 UTC
66 points
12 comments · 1 min read · LW link

Giving calibrated time estimates can have social costs

Alex_Altair · 3 Apr 2022 21:23 UTC
99 points
16 comments · 5 min read · LW link

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans · 31 May 2022 13:27 UTC
97 points
7 comments · 4 min read · LW link

Paper: Forecasting world events with neural nets

1 Jul 2022 19:40 UTC
39 points
3 comments · 4 min read · LW link

Calibration Trivia

Screwtape · 4 Aug 2022 22:31 UTC
11 points
9 comments · 3 min read · LW link

Introducing Pastcasting: A tool for forecasting practice

Sage Future · 11 Aug 2022 17:38 UTC
95 points
10 comments · 2 min read · LW link · 2 reviews

Quantified Intuitions: An epistemics training website including a new EA-themed calibration app

20 Sep 2022 22:25 UTC
28 points
2 comments · 2 min read · LW link

Climate-contingent Finance, and A Generalized Mechanism for X-Risk Reduction Financing

John Nay · 26 Sep 2022 13:23 UTC
0 points
2 comments · 1 min read · LW link

Calibrate - New Chrome Extension for hiding numbers so you can guess

chanamessinger · 7 Oct 2022 11:21 UTC
59 points
16 comments · 1 min read · LW link
(chrome.google.com)

Fair Collective Efficient Altruism

Jobst Heitzig · 25 Nov 2022 9:38 UTC
2 points
1 comment · 5 min read · LW link

Takeaways from calibration training

Olli Järviniemi · 29 Jan 2023 19:09 UTC
38 points
1 comment · 3 min read · LW link

Outrangeous (Calibration Game)

jenn · 7 Mar 2023 15:29 UTC
32 points
3 comments · 9 min read · LW link

Breaking Rank (Calibration Game)

jenn · 7 Mar 2023 15:40 UTC
11 points
0 comments · 2 min read · LW link

What is calibration?

AlexMennen · 13 Mar 2023 6:30 UTC
27 points
1 comment · 4 min read · LW link

Anki with Uncertainty: Turn any flashcard deck into a calibration training tool

Sage Future · 22 Mar 2023 17:26 UTC
14 points
2 comments · 1 min read · LW link

Proposal: Tune LLMs to Use Calibrated Language

OneManyNone · 7 Jun 2023 21:05 UTC
9 points
0 comments · 5 min read · LW link

The Case for Overconfidence is Overstated

Kevin Dorst · 28 Jun 2023 17:21 UTC
50 points
13 comments · 8 min read · LW link
(kevindorst.substack.com)

Anthropically Blind: the anthropic shadow is reflectively inconsistent

Christopher King · 29 Jun 2023 2:36 UTC
40 points
38 comments · 10 min read · LW link

A Subtle Selection Effect in Overconfidence Studies

Kevin Dorst · 3 Jul 2023 14:43 UTC
24 points
0 comments · 6 min read · LW link
(kevindorst.substack.com)

Introducing Fatebook: the fastest way to make and track predictions

11 Jul 2023 15:28 UTC
127 points
34 comments · 1 min read · LW link
(fatebook.io)

ChatGPT challenges the case for human irrationality

Kevin Dorst · 22 Aug 2023 12:46 UTC
4 points
10 comments · 7 min read · LW link
(kevindorst.substack.com)

[Question] Are (Motor)sports like F1 a good thing to calibrate estimates against?

CstineSublime · 24 Mar 2024 9:07 UTC
4 points
2 comments · 1 min read · LW link

[Question] How to best measure if and to what degree you’re too pessimistic or too optimistic?

CstineSublime · 31 Mar 2024 0:57 UTC
4 points
3 comments · 1 min read · LW link