Some thoughts on meta-probabilities

I often like to think of my epistemic probability assignments in terms of probabilities-of-probabilities, or meta-probabilities. In other words, what probability would I assign that my probability estimate is accurate? Am I very confident, am I only mildly confident, or do I only have a vague clue?

I often think of it as a sort of bell curve, with the x-axis being possible probability estimates and the y-axis being my confidence in each of them. So if I have very low confidence in my estimate, the bell is wide and flat with a low peak; if I have high confidence, it's narrow with a tall peak.
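
If a concrete picture helps: below is a tiny Python sketch (my own toy model, not any standard formalism) that treats such a bell curve as a Beta distribution over possible probability estimates. The specific parameter choices, and the use of scipy, are just illustrative.

```python
# A toy sketch: treating the "bell curve over probability estimates" as a
# Beta distribution over [0, 1]. The parameter choices are illustrative only.
from scipy.stats import beta

vague_clue     = beta(2, 2)      # wide and flat: "only a vague clue"
very_confident = beta(200, 200)  # narrow and tall: "very confident"

for label, dist in [("vague clue", vague_clue), ("very confident", very_confident)]:
    lo, hi = dist.ppf([0.025, 0.975])  # central 95% of my credence
    print(f"{label}: estimate ~{dist.mean():.2f}, "
          f"95% of my credence between {lo:.2f} and {hi:.2f}")
```

Both curves are centered on 0.50, but the wide one says "the true estimate could be almost anywhere", while the narrow one says "I'm pretty sure it's close to 0.50".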

Here are a few issues and insights that have come up when discussing or thinking about this:

What would a meta-probability actually mean?

There are two ways I think about it:

1) The meta-probability is my prediction for how likely I am to change my mind (and to what extent) as I learn more information about the topic (there's a small sketch of this just after this list).

2) I know that I’m not even close to being an ideal Bayesian agent, and that my best shot at a probability estimate is fuzzy, imprecise, and probably mistaken anyway. The meta-probability is my prediction of what an ideal Bayesian agent would assign as the probability for the question at hand.
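
To make interpretation 1) a little more concrete, here's a toy sketch. The Beta-Binomial updating and the particular numbers are my own illustrative assumptions; the point is just that a wide, low-confidence curve predicts much bigger swings from new evidence than a narrow, high-confidence one.

```python
# Toy illustration of interpretation 1): a wide meta-distribution predicts that
# new evidence could move my estimate a lot; a narrow one predicts small moves.
# Beta-Binomial updating and the numbers below are illustrative assumptions.

def updated_estimate(a, b, hits, trials):
    """Mean of a Beta(a, b) belief after seeing `hits` successes in `trials`."""
    return (a + hits) / (a + b + trials)

new_evidence = (8, 10)  # hypothetical: 8 "yes" outcomes out of 10 new observations

for label, (a, b) in [("vague clue, Beta(2, 2)", (2, 2)),
                      ("very confident, Beta(200, 200)", (200, 200))]:
    before = a / (a + b)
    after = updated_estimate(a, b, *new_evidence)
    print(f"{label}: {before:.2f} -> {after:.2f}")
```

The vague-clue belief jumps from 0.50 to about 0.71 after ten observations, while the confident one barely budges.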

What’s the point?

Primarily it’s just useful for conveying how sure I am of the probability estimate I’m assigning. Saying that a coin flip is 50% heads means something very different from me saying “I have not the slightest clue whether it’ll rain tomorrow on the other side of the world, and if I had to bet on it I’d give it ~50% odds”, even though the number is the same. I’ve seen other people convey related sentiments by saying things like, “well, 90% is probably too low an estimate, and 99% is probably too high, so it’s somewhere between those”. I’d just view the 90% and 99% figures as something like the 95% confidence bounds of a bell curve.
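
For what it's worth, here's a rough sketch of that last reading: a brute-force hunt for a Beta-shaped bell curve whose central 95% lies roughly between 0.90 and 0.99. Modeling the curve as a Beta distribution, and the particular search ranges, are just assumptions I'm making for the example.

```python
# A rough sketch: find a Beta-shaped bell curve whose central 95% interval
# roughly spans 0.90 to 0.99, by brute force over a small grid of shapes.
from scipy.stats import beta

target_lo, target_hi = 0.90, 0.99
best = None

for a in range(5, 200):
    for b in range(1, 30):
        lo, hi = beta.ppf([0.025, 0.975], a, b)
        err = abs(lo - target_lo) + abs(hi - target_hi)
        if best is None or err < best[0]:
            best = (err, a, b, lo, hi)

_, a, b, lo, hi = best
print(f"Beta({a}, {b}) puts ~95% of its mass between {lo:.3f} and {hi:.3f}")
```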

Why not keep going and say how confident you are about your confidence estimates?

True, I could do this, and I sometimes do when needed by visualizing a bit of fuzziness in my bell curve. But one level of meta is usually enough for my purposes.

Is there any use for such a view in terms of instrumental or utilitarian calculations?

Not sure. I’ve seen some relevant discussion by Scott Alexander and Holden Karnofsky, but I’m not sure I followed everything there. I do suspect that if you view it as a prediction of how your views might change as you learn more about a subject, it could help with deciding how much time to invest in further research.

Thoughts?

[Note 1: I discussed this topic about a year ago on LessWrong, and got some insightful responses then. Some commenters disagreed with me then, and I predict they’ll do so again here—I’d give it, oh, say an 80% chance, moderate confidence ;).]

[Note 2: If you could try to avoid complicated math in your responses that would be appreciated. I’m still on the precalculus level here.]

[Note 3: As I finished writing this I dug up some interesting LessWrong posts on the subject, with links to yet more relevant posts.]