Agree. Having done some calibration exercises, I found that it’s actually pretty hard to give a likelihood of less than 1% or greater than 99%, because that’s within the rate at which I make stupid mistakes, like clicking the wrong button, writing down the wrong thing, misreading the question, etc. It’s tempting to think you can be more certain than that by coming up with silly examples that are clearly impossible, but you’re more likely to accidentally pick a silly example that turns out to disprove your point than for the silly example itself to be that improbable, so our subjective probabilities end up dominated by our own practical limitations at the limits.
I think there is (sometimes) value in distinguishing two separate probabilities for any given thing. There’s the “naïve” probability that you estimate while ignoring the possibility that you’ve blundered, that you misread something critical, that some underlying assumption of yours is wrong in a way that never crossed your mind, etc. And then there’s the “pessimistic” probability that tries to account for those things.
You want these to be separate because if you’re doing a calculation using the various probabilities, sometimes it’s better to do all the calculations using “naïve” probabilities and then do a final correction at the end for blunders, wrong fundamental assumptions, etc.
… Maybe. It depends on what the calculation is, what sort of out-of-model errors there might be, etc.
Of course this is a rough heuristic. I think what it’s an approximation to is a more careful tracking of lots of conditional probabilities (people around here sometimes talk as if being a Bayesian means assigning probabilities to things, but it would be more precise to say that being a Bayesian means assigning conditional probabilities to things, and a lot of the information is in that extra structure). E.g., suppose there are 100 things, each of which you give naïve probability 10^-9 to, but there’s a 10^-3 chance that some fundamental error in your model makes them actually happen 1/10 of the time. Then your “adjusted” probability for each one is about 10^-4, and if you use those to estimate the probability that at least one happens you get about 10^-2; but in this situation (assuming that the “fundamental error in your model” is actually the only substantial cause of out-of-model errors) that probability should actually be more like 10^-3, since the error, if present, is shared across all 100 things, so the chance that at least one of them happens is roughly the chance of the error itself. Of course, if you make a calculation like that then sometimes there’s a fundamental error in your model of where the possible errors come from :-).
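To make that arithmetic concrete, here’s a minimal sketch (a few lines of Python, using exactly the numbers from the example above) of the gap between multiplying the “adjusted” probabilities as if they were independent and conditioning on the shared model error:

```python
n = 100
p_naive = 1e-9        # in-model probability of each of the 100 things
p_error = 1e-3        # chance of the shared fundamental error in the model
p_given_error = 0.1   # per-thing probability if that error is present

# Per-thing "adjusted" probability, dominated by the error term (~1e-4).
p_adjusted = (1 - p_error) * p_naive + p_error * p_given_error

# Treating the adjusted probabilities as independent overstates the risk (~1e-2).
p_any_if_independent = 1 - (1 - p_adjusted) ** n

# Conditioning on the shared error: at least one thing happening is roughly
# as likely as the error itself (~1e-3).
p_any_actual = (p_error * (1 - (1 - p_given_error) ** n)
                + (1 - p_error) * (1 - (1 - p_naive) ** n))

print(p_adjusted, p_any_if_independent, p_any_actual)
```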
Hmm, my point though is that you’re mistaken if you think you can separate these two, because you’re the embedded agent making both predictions, so your naive prediction isn’t actually independent of you, the fallible being making the predictions.
I’d compare this to the concept of significant digits in science. Like, yeah, you can get highly accurate measurements, but as soon as you stick them in calculations they get eaten up by the error in other measurements. I’m claiming the same thing happens here for humans: beyond a certain point our predictions are dominated by our own errors. Maybe my particular numbers are not representative of all scenarios, but I think the point stands regardless; you just have to dial in the numbers to match reality.
I completely agree that beyond a certain point our predictions are dominated by our own errors, but I’m not sure that that’s always well modelled by just moving all probability estimates that are close to 0 or 1 away by (say) 10^-3.
Example: Pascal’s mugging. (This is an example where just moving everything away from 0 or 1 is probably a bad idea, but to be clear I think it isn’t an example where it would help much to separate out your “in-model” and “out-of-model” errors.) Someone comes to you and says: I am a god/wizard/simulation-operator and can do such-and-such things which produce/destroy incredibly large amounts of utility; pay me $1000 and I’ll do that in your favour rather than against you. You say: haha, no, my estimate of the probability that you can swing 3^^^3 utils is less than 1/3^^^3, so go jump in a lake.
In this situation, if instead you say “gosh, I could be wrong in all sorts of ways, so I’d better revise that probability estimate to, say, 10^-6” and then go ahead and do your expected-utility calculation, then you pay the mugger every time. Even after they say “behold, now I shall create you a mountain of gold just to prove I can” and nothing happens and they say “ah, well, I’m just testing your faith; would you like to give me another $1000 now?”.
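A rough sketch of that failure mode, using an arbitrary astronomically large number as a stand-in for 3^^^3 (which can’t be represented directly) and treating the $1000 as a comparable utility cost purely for illustration:

```python
p_inflated = 1e-6       # the "corrected" probability after allowing for possible blunders
utils_at_stake = 1e100  # arbitrary stand-in for 3^^^3 utils
cost_of_paying = 1000   # the $1000, treated as a (comparatively tiny) utility cost

expected_gain_from_paying = p_inflated * utils_at_stake - cost_of_paying
print(expected_gain_from_paying > 0)  # True: naive expected utility says pay, every time
```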
Perhaps the right way to handle this is to say that utility 1/epsilon is no better than probability epsilon, embrace scope insensitivity, and pretend that they were only offering/threatening 10^6 utils and your probability is only 10^-6, or something like that. And maybe that says that when someone makes such a claim you should give them a dollar if that’s what they ask for, and see whether they deliver.
I am not at all confident that that’s really a good approach, but if you do handle it that way then you need to be able to reason that after you give them a dollar and they fail, you shouldn’t give them another dollar because however improbable their claim was before, it’s 100x more improbable now. You can’t do that if you just mechanically turn all very small probabilities into 10^-6 or whatever.
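A minimal sketch of that update; the 1% figure for how often a genuine wizard would fail the demonstration is just an illustrative assumption chosen to match the “100x more improbable” above:

```python
def posterior_probability(prior, p_fail_if_true, p_fail_if_false=1.0):
    """Bayes update on observing the failed mountain-of-gold demonstration."""
    numerator = prior * p_fail_if_true
    return numerator / (numerator + (1 - prior) * p_fail_if_false)

prior = 1e-6                 # the already-"corrected" probability of the claim
posterior = posterior_probability(prior, p_fail_if_true=0.01)
print(posterior)             # ~1e-8: about 100x more improbable than before
```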
I don’t have a clearly-satisfactory approach to offer instead. But I think this sort of example demonstrates that sometimes you need to do something more sophisticated than pushing all tiny probabilities away from zero.
I guess an instrumental approach I’ve been advocating on this site for a long time is to estimate the noise level, call it “practically zero”, and treat anything at that level as such. For example, in the Pascal’s mugger case, there are so many disjunctive possibilities with higher odds of producing the story you hear than of the story as told being true, that there is no reason to privilege believing what you hear over all the higher-probability options, including dreaming, hallucinating, a con, a psych experiment, candid camera… It’s not about accurately estimating EV and so becoming susceptible to blackmail; it’s about rejecting anything at the noise level. Which, I guess, is another way to say “epsilon”: not technically zero, but as good as.
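A minimal sketch of that rule, with an arbitrary 10^-6 standing in for the noise floor (where the floor actually sits is the whole judgment call):

```python
NOISE_FLOOR = 1e-6  # illustrative only; the real work is in estimating this level

def effective_probability(p):
    """Treat anything at or below the noise level as practically zero."""
    return 0.0 if p <= NOISE_FLOOR else p

print(effective_probability(1e-20))  # 0.0: rejected outright, so no blackmail leverage
print(effective_probability(0.03))   # 0.03: ordinary probabilities pass through unchanged
```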
You can at least estimate some lower bounds on self-error, even if you can’t necessarily be certain of upper ones. That’s better than nothing, which is what you get if you don’t separate the probabilities.
For example, my performance on test questions where I know the subject backwards and forwards isn’t 100%, because sometimes I misread the question, or have a brain fart while working out answers, and so on. On the other hand, most of these are localized errors. Given extra time, opportunity to check references, consult with other people, and so on, I can reduce these sorts of errors a great deal.
There is value in knowing this.
Weird, I’m totally in the other boat. I think we can use sub-1% or super-99% probabilities easily, all the time.
I just went on a long road trip. What probability should I have used that my car springs a brake fluid leak slow enough that it’s going to be useful for me to have a can of brake fluid in the car? I’d guess it happens once every 250k miles or so, and I just drove about 1k, so that’s about 1 in 250 (or let’s say 1 in 500 to guesstimate at the effect of doing highway driving). Bam, sub-1% probability. Now, did I need to consciously evaluate the probabilities to decide that I should definitely bring engine oil, might as well bring brake fluid but it’s not super important, and not need to bring a bicycle pump? No. But if you ask me, I don’t see what’s stopping me from giving totally reasonable probabilities for needing these things.
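Spelled out, with the same guesses as above:

```python
leak_rate_per_mile = 1 / 250_000  # one slow brake-fluid leak per ~250k miles (a guess)
trip_miles = 1_000
highway_adjustment = 0.5          # rough discount for easy highway driving

p_brake_fluid_useful = leak_rate_per_mile * trip_miles * highway_adjustment
print(p_brake_fluid_useful)       # 0.002, i.e. about 1 in 500: a usable sub-1% number
```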
I think there’s a perspective that can synthesize both of these observations.
I could easily write a list of predictions of which less than 1 in 10,000 would be false:
The Sun will be shining somewhere on Earth at 2022-02-19 18:34:25.00001 UTC
The Sun will be shining somewhere on Earth at 2022-02-19 18:34:25.00002 UTC
The Sun will be shining somewhere on Earth at 2022-02-19 18:34:25.00003 UTC
etc...
Of course, I’m “cheating”. There seem to be less than 100 consciously distinct plausibility values for me (or probably anyone). What I actually believe in this situation are several facts about how the Sun, Earth, time, and shining work which I believe at the highest plausibility value I can distinguish/track (something like >99.5%). I’m able to logically synthesize these into the above class of statements, from which I can deduce that the implied probability of those statements is quite high (much more than 99.5% likely to hold). This is an important part of what makes abstraction so powerful.
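As a rough sketch of why that structure matters (the 0.9999 premise probability is just an illustrative stand-in for “well above my 99.5% resolution limit”):

```python
n = 10_000
p_each = 0.995       # highest plausibility value I can consciously distinguish
p_premises = 0.9999  # assumed joint probability of the underlying facts (illustrative)

# 10,000 unrelated statements, each held at 99.5%, are jointly almost certainly not all true...
p_all_true_if_independent = p_each ** n   # ~1.5e-22

# ...but 10,000 statements deduced from the same premises stand or fall with those premises.
p_all_true_if_deduced = p_premises

print(p_all_true_if_independent, p_all_true_if_deduced)
```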
If you asked me for 10,000 true statements, none of which I could explicitly logically connect to each other, I would be surprised if more than 99.5% of them were actually true, even putting my highest possible level of care and effort into it. I think this is an inherent limitation of how my mind works: there just isn’t a finer plausibility value that I can use to distinguish among these (which is an inherent limitation of being a bounded agent).
The key, I think, is that there is an important sense in which we can be more certain of logical deductions than intuitive beliefs, notwithstanding the fact that we are prone to making logical errors (for example, I used redundant lines of reasoning and large margins for error to generate the above example). It’s easy to be overconfident, but it’s almost as easy to be too pessimistic about what we can know.
Yeah, being inaccurate about personal fallibility levels is something I was trying to gesture at in https://www.lesswrong.com/posts/3duptyaLKKJxcnRKA/you-are-way-more-fallible-than-you-think and I think your comment summarizes what I wanted to express.
That depends on the subject. People are not as fallible on life-or-death subjects, or more people would accidentally walk in front of cars or trains or fall off high places. Anybody have an idea what that probability would be?
People are pretty fallible in these cases too, literally. Look at the number of fatalities in canyons with well-marked paths and fences.
For each encounter with a precipice, I would strongly guess the success probability is > 99.999%.