Calibration

Tag. Last edit: Apr 2, 2025, 9:53 AM by gustaf

Someone is well-calibrated if the things they predict with X% chance of happening in fact occur X% of the time. Importantly, calibration is not the same as accuracy: calibration is about accurately assessing how good your predictions are, not about making good predictions. Person A, whose predictions are only marginally better than chance (60% of them come true when choosing between two options) and who is precisely 60% confident in each choice, is perfectly calibrated. In contrast, Person B, who is 99% confident in their predictions and right 90% of the time, is more accurate than Person A but less well-calibrated.
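
The Person A / Person B comparison can be checked numerically. The following is a minimal sketch, not a standard scoring rule: the function names and the count-weighted "calibration gap" are assumptions made for illustration, and the prediction records are constructed to match the percentages quoted above.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Group (confidence, came_true) pairs by stated confidence.

    Returns {confidence: (observed_frequency, count)}.
    """
    buckets = defaultdict(list)
    for confidence, came_true in predictions:
        buckets[confidence].append(came_true)
    return {c: (sum(o) / len(o), len(o)) for c, o in sorted(buckets.items())}

def calibration_gap(predictions):
    """Count-weighted mean |stated confidence - observed frequency| (illustrative metric)."""
    table = calibration_table(predictions)
    total = sum(n for _, n in table.values())
    return sum(abs(c - freq) * n for c, (freq, n) in table.items()) / total

def accuracy(predictions):
    """Fraction of predictions that came true."""
    return sum(came_true for _, came_true in predictions) / len(predictions)

# Hypothetical records matching the example: 100 predictions each.
person_a = [(0.60, True)] * 60 + [(0.60, False)] * 40  # 60% confident, 60% right
person_b = [(0.99, True)] * 90 + [(0.99, False)] * 10  # 99% confident, 90% right

for name, preds in [("A", person_a), ("B", person_b)]:
    print(name, "accuracy:", accuracy(preds), "calibration gap:", calibration_gap(preds))
```

Under these constructed records, Person A has a gap of 0 with accuracy 0.6, while Person B has a gap of about 0.09 with accuracy 0.9: more accurate, less well-calibrated.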

Being well-calibrated has value for rationalists separately from accuracy. Among other things, it lets you make good bets and good decisions, lets you communicate information helpfully to others who know you to be well-calibrated (see Group Rationality), and helps you prioritize which information is worth acquiring.

Note that any expression of quantified confidence in a belief can be well or poorly calibrated. For example, calibration applies to whether a person’s 95% confidence intervals capture the true outcome 95% of the time.
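
For interval estimates, the corresponding check is coverage: the fraction of true values that land inside the stated intervals. A minimal sketch, assuming intervals are given as (low, high) pairs; the function name and the numbers are made up for illustration.

```python
def interval_coverage(intervals, outcomes):
    """Fraction of true outcomes that fall inside their stated (low, high) interval."""
    hits = sum(low <= truth <= high for (low, high), truth in zip(intervals, outcomes))
    return hits / len(outcomes)

# Hypothetical data: twenty 95% intervals, nineteen of which contain the truth.
intervals = [(0, 10)] * 20
outcomes = [5] * 19 + [15]

coverage = interval_coverage(intervals, outcomes)
print("coverage:", coverage)  # compare against the stated 95% level
```

A coverage well below the stated level suggests the intervals are too narrow (overconfidence); well above suggests they are too wide (underconfidence). With only a handful of intervals the estimate is noisy.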

List of Calibration Exercises
Based on this post. Todo: find more, sort them, and write a new post for visibility in search engines?

Exercises that are dead/unmaintained

https://www.metaculus.com/tutorials (dead link)
http://web.archive.org/web/20100529074053/http://www.acceleratingfuture.com/tom/?p=129
http://credencecalibration.com (dead link)
https://calibration.lazdini.lv (dead link)
http://web.archive.org/web/20161020032514/http://calibratedprobabilityassessment.org/
https://predictionbook.com/credence_games/try (deprecated; see also this Github Issue)
https://calibration-training.netlify.app (dead link)

List of Probability Calibration Exercises

Isaac King Jan 23, 2022, 2:12 AM

75 points

12 comments 1 min read LW link

Information Charts

Rafael Harth Nov 13, 2020, 4:12 PM

29 points

6 comments 13 min read LW link

Anki with Uncertainty: Turn any flashcard deck into a calibration training tool

Sage Future Mar 22, 2023, 5:26 PM

14 points

2 comments 1 min read LW link

(www.quantifiedintuitions.org)

Use Normal Predictions

Jan Christian Refsgaard Jan 9, 2022, 3:01 PM

149 points

67 comments 6 min read LW link

Concrete benefits of making predictions

Jonny Spicer and Sage Future

Oct 17, 2024, 2:23 PM

36 points

5 comments 6 min read LW link

(fatebook.io)

Calibrate your self-assessments

Scott Alexander Oct 9, 2011, 11:26 PM

102 points

122 comments 6 min read LW link

The Sin of Underconfidence

Eliezer Yudkowsky Apr 20, 2009, 6:30 AM

109 points

187 comments 6 min read LW link

Calibration Trivia

Screwtape Aug 4, 2022, 10:31 PM

12 points

9 comments 4 min read LW link

I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!

mako yass Aug 2, 2024, 10:35 PM

24 points

2 comments 5 min read LW link

Hammertime Day 9: Time Calibration

alkjash Feb 7, 2018, 1:40 AM

16 points

11 comments 2 min read LW link

(radimentary.wordpress.com)

Giving calibrated time estimates can have social costs

Alex_Altair Apr 3, 2022, 9:23 PM

99 points

16 comments 5 min read LW link

Introducing Pastcasting: A tool for forecasting practice

Sage Future Aug 11, 2022, 5:38 PM

95 points

10 comments 2 min read LW link 2 reviews

Paper: Forecasting world events with neural nets

Owain_Evans, Dan H and Joe Kwon

Jul 1, 2022, 7:40 PM

39 points

3 comments 4 min read LW link

Aumann Agreement Game

abramdemski Oct 9, 2015, 5:14 PM

34 points

17 comments 1 min read LW link

Credence Calibration Icebreaker Game

Ruby Aug 14, 2014, 9:01 PM

42 points

1 comment 2 min read LW link

Cambridge Prediction Game

NoSignalNoNoise Jan 25, 2020, 3:57 AM

13 points

3 comments 2 min read LW link

Simultaneous Overconfidence and Underconfidence

abramdemski Jun 3, 2015, 9:04 PM

37 points

6 comments 5 min read LW link

Takeaways from calibration training

Olli Järviniemi Jan 29, 2023, 7:09 PM

45 points

2 comments 3 min read LW link 1 review

The Bayesian Tyrant

abramdemski Aug 20, 2020, 12:08 AM

143 points

21 comments 6 min read LW link 1 review

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans May 31, 2022, 1:27 PM

97 points

7 comments 4 min read LW link

Suspiciously balanced evidence

gjm Feb 12, 2020, 5:04 PM

50 points

24 comments 4 min read LW link

Qualitatively Confused

Eliezer Yudkowsky Mar 14, 2008, 5:01 PM

71 points

85 comments 4 min read LW link

What is calibration?

AlexMennen Mar 13, 2023, 6:30 AM

27 points

1 comment 4 min read LW link

Calibration Test with database of 150,000+ questions

Nanashi Mar 14, 2015, 11:22 AM

54 points

32 comments 1 min read LW link

How the Equivalent Bet Test Actually Works

Erich_Grunewald Dec 18, 2021, 11:17 AM

4 points

1 comment 4 min read LW link

(www.erichgrunewald.com)

A Subtle Selection Effect in Overconfidence Studies

Kevin Dorst Jul 3, 2023, 2:43 PM

24 points

0 comments 6 min read LW link

(kevindorst.substack.com)

We Change Our Minds Less Often Than We Think

Eliezer Yudkowsky Oct 3, 2007, 6:14 PM

114 points

120 comments 1 min read LW link

Do LLMs know what they’re capable of? Why this matters for AI safety, and initial findings

Casey Barkan, Sid Black and Oliver Sourbut

Jul 13, 2025, 7:54 PM

50 points

4 comments 18 min read LW link

The Case for Overconfidence is Overstated

Kevin Dorst Jun 28, 2023, 5:21 PM

50 points

13 comments 8 min read LW link

(kevindorst.substack.com)

Prediction Contest 2018

jbeshir Apr 30, 2018, 6:26 PM

9 points

4 comments 3 min read LW link

Kurzweil’s predictions: good accuracy, poor self-calibration

Stuart_Armstrong Jul 11, 2012, 9:55 AM

50 points

39 comments 9 min read LW link

Fair Collective Efficient Altruism

Jobst Heitzig Nov 25, 2022, 9:38 AM

2 points

1 comment 5 min read LW link

Overconfident Pessimism

lukeprog Nov 24, 2012, 12:47 AM

37 points

38 comments 4 min read LW link

Placing Yourself as an Instance of a Class

abramdemski Oct 3, 2017, 7:10 PM

36 points

5 comments 3 min read LW link

Test Your Calibration!

alyssavance Nov 11, 2009, 10:03 PM

25 points

34 comments 2 min read LW link

Advancing Certainty

komponisto Jan 18, 2010, 9:51 AM

44 points

110 comments 4 min read LW link

ChatGPT challenges the case for human irrationality

Kevin Dorst Aug 22, 2023, 12:46 PM

3 points

10 comments 7 min read LW link

(kevindorst.substack.com)

Picking favourites is hard

dkl9 Dec 4, 2024, 8:46 PM

11 points

3 comments 1 min read LW link

(dkl9.net)

Prediction Contest 2018: Scores and Retrospective

jbeshir Jan 27, 2019, 5:20 PM

28 points

5 comments 1 min read LW link

Markets Are Information—Beating the Sportsbooks at Their Own Game

JJXW Nov 7, 2024, 8:58 PM

9 points

1 comment 2 min read LW link

(thehobbyist.substack.com)

Horrible LHC Inconsistency

Eliezer Yudkowsky Sep 22, 2008, 3:12 AM

34 points

33 comments 1 min read LW link

Proposal: Tune LLMs to Use Calibrated Language

OneManyNone Jun 7, 2023, 9:05 PM

9 points

0 comments 5 min read LW link

Breaking Rank (Calibration Game)

jenn Mar 7, 2023, 3:40 PM

11 points

0 comments 2 min read LW link

Climate-contingent Finance, and A Generalized Mechanism for X-Risk Reduction Financing

John Nay Sep 26, 2022, 1:23 PM

0 points

2 comments 26 min read LW link

Lawful Uncertainty

Eliezer Yudkowsky Nov 10, 2008, 9:06 PM

143 points

57 comments 4 min read LW link

Say It Loud

Eliezer Yudkowsky Sep 19, 2008, 5:34 PM

62 points

20 comments 2 min read LW link

Behavior Cloning is Miscalibrated

leogao Dec 5, 2021, 1:36 AM

77 points

3 comments 3 min read LW link

[Question] How to best measure if and to what degree you’re too pessimistic or too optimistic?

CstineSublime Mar 31, 2024, 12:57 AM

4 points

3 comments 1 min read LW link

Bayes-Up: An App for Sharing Bayesian-MCQ

Louis Faucon Feb 6, 2020, 7:01 PM

53 points

9 comments 1 min read LW link

[Question] Is there a.. more exact.. way of scoring a predictor’s calibration?

mako yass Jan 16, 2019, 8:19 AM

22 points

6 comments 1 min read LW link

Illusion of Transparency: Why No One Understands You

Eliezer Yudkowsky Oct 20, 2007, 11:49 PM

180 points

52 comments 3 min read LW link

RFC on an open problem: how to determine probabilities in the face of social distortion

ialdabaoth Oct 7, 2017, 10:04 PM

6 points

3 comments 2 min read LW link

Social Calibration

SimulatedCrow May 20, 2021, 11:22 PM

3 points

4 comments 4 min read LW link

[Question] Calibration training for ‘percentile rankings’?

david reinstein Sep 14, 2024, 9:51 PM

3 points

0 comments 2 min read LW link

[Question] Are (Motor)sports like F1 a good thing to calibrate estimates against?

CstineSublime Mar 24, 2024, 9:07 AM

4 points

2 comments 1 min read LW link

Introducing Fatebook: the fastest way to make and track predictions

Adam B and Sage Future

Jul 11, 2023, 3:28 PM

132 points

41 comments 1 min read LW link 2 reviews

(fatebook.io)

Calibrate—New Chrome Extension for hiding numbers so you can guess

chanamessinger Oct 7, 2022, 11:21 AM

59 points

16 comments 1 min read LW link

(chrome.google.com)

A Motorcycle (and Calibration?) Accident

boggler Mar 18, 2018, 10:21 PM

25 points

11 comments 2 min read LW link

[Question] Has Someone Checked The Cold-Water-In-Left-Ear Thing?

Maloew Dec 28, 2024, 8:15 PM

11 points

0 comments 1 min read LW link

Calibration for continuous quantities

Cyan Nov 21, 2009, 4:53 AM

30 points

13 comments 3 min read LW link

Anthropically Blind: the anthropic shadow is reflectively inconsistent

Christopher King Jun 29, 2023, 2:36 AM

43 points

40 comments 10 min read LW link

Raising the forecasting waterline (part 1)

Morendil Oct 9, 2012, 3:49 PM

51 points

107 comments 6 min read LW link

Outrangeous (Calibration Game)

jenn Mar 7, 2023, 3:29 PM

38 points

3 comments 9 min read LW link

Prediction and Calibration—Part 1

Jan Christian Refsgaard May 8, 2021, 7:48 PM

6 points

10 comments 4 min read LW link

(www.badprior.com)

[Question] What are good ML/AI related prediction / calibration questions for 2019?

james_t Jan 4, 2019, 2:40 AM

19 points

4 comments 2 min read LW link

Quantified Intuitions: An epistemics training website including a new EA-themed calibration app

Sage Future and elifland

Sep 20, 2022, 10:25 PM

28 points

2 comments 2 min read LW link

Why I’m Pouring Cold Water in My Left Ear, and You Should Too

Maloew Jan 24, 2025, 11:13 PM

12 points

0 comments 2 min read LW link

How to reach 80% of your goals. Exactly 80%.

Bart Bussmann Oct 10, 2020, 5:33 PM

36 points

11 comments 1 min read LW link

the gears to ascension Sep 12, 2023, 10:50 AM
2 points
0
@Jim Fisher what’s your reasoning for removing the archive.org links?
- Jim Fisher Sep 25, 2023, 6:44 PM
  1 point
  0
  I don’t think I removed any archive.org links. My intention was to remove dead links, and to make it clearer which links are worth visiting. Please revert if I’ve made a mistake or you disagree with my intention.