Some further remarks.

The above analysis is what immediately occurred to me on reading the OP. Yet the supposed paradox, dating from 2017 (the first version of the paper cited in the OP), seems to have genuinely perplexed the aerospace community.
A control process cannot control a variable to better than the accuracy with which it can measure it. It is pointless to try to avoid two satellites coming within 10 metres of each other, if your tracking process cannot measure their positions better than to 100 metres (the green trace in my figure). If your tracking process cannot be improved, then you must content yourself with avoiding approaches within around 100 metres, and you will be on the equivalent of the yellow line in that figure. The great majority of the evasive actions that your system will employ will be unnecessary to avert actual collisions, which only actually happen for much closer approaches. That is just the price of having poor data.
I went looking on Google Scholar for the origins and descendants of this “false confidence” concept, and it’s part of a whole non-Bayesian paradigm of belief as something not to be quantified by probability. This is a subject that has received little attention on LessWrong, I guess because Eliezer thinks it’s a wrong turning, like e.g. religion, and wrote it off long ago as not worth taking further notice of. The most substantial allusion to it here that I’ve found is in footnote 1 to this posting.
Are there any members of the “belief function community” here, or “imprecise probabilists”, who believe that “precise probability theory is not the only mode of uncertainty quantification”? Not scare quotes, but taken from Ryan Martin, “Which statistical hypotheses are afflicted with false confidence?”. How would they respond to my suggestion that “false confidence” is not a problem and that belief-as-probability is enough to deal with satellite collision avoidance?
It is pointless to try to avoid two satellites coming within 10 metres of each other, if your tracking process cannot measure their positions better than to 100 metres (the green trace in my figure).
This seems straightforwardly false. If you can keep them from approaching within 100 metres of each other, then that necessarily also keeps them from approaching within 10 metres of each other.
It does, but very wastefully. Almost all of the avoidance manoeuvres you make will be unnecessary, and some will even cause a collision, but you will not know which ones. Further modelling (that I think would be belabouring the point to do) would allow a plot of how a decision rule for manoeuvring reduces the probability of collisions.
Let f_m be the frequency of collisions given the tracking precision and some rule for manoeuvres. Let f_0 be the frequency of collisions without manoeuvres. Define effectiveness to be 1 − f_m / f_0.
I would expect effectiveness to approach 1 for perfect tracking (and a sensible decision rule) and decline towards 0 as the precision gets worse.
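Here is a minimal Monte Carlo sketch of that, under toy assumptions of my own: one-dimensional geometry, a Gaussian tracking error of width sigma, a wide spread of true miss distances, and a fixed-size burn away from the estimated position of the other satellite whenever the estimated miss distance falls below 100 m. None of the numbers or function names come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def effectiveness(sigma, n=500_000, collision_radius=10.0, dodge_threshold=100.0, burn=150.0):
    """Monte Carlo estimate of effectiveness = 1 - f_m / f_0 for a toy 1-D conjunction model."""
    x = rng.normal(0.0, 300.0, n)            # signed true miss distance at closest approach
    x_hat = x + rng.normal(0.0, sigma, n)    # tracking estimate with error sigma
    dodge = np.abs(x_hat) < dodge_threshold  # decision rule: manoeuvre if the estimated miss is small
    # A burn pushes the trajectories apart in the direction indicated by the estimate;
    # with bad data the sign can be wrong, so a manoeuvre can occasionally cause a collision.
    x_post = np.where(dodge, x + burn * np.sign(x_hat), x)
    f0 = np.mean(np.abs(x) < collision_radius)       # collision frequency with no manoeuvres
    fm = np.mean(np.abs(x_post) < collision_radius)  # collision frequency with the rule applied
    return 1.0 - fm / f0

for sigma in (1.0, 30.0, 100.0, 300.0, 1000.0):
    print(f"tracking error {sigma:6.1f} m -> effectiveness {effectiveness(sigma):.2f}")
```

With good tracking essentially every genuine close approach is caught and pushed well clear, so effectiveness comes out near 1; as sigma grows past the dodge threshold the rule both misses real conjunctions and occasionally burns in the wrong direction, and effectiveness falls towards 0.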
Are you envisaging a system with 100m tracking resolution that aims to make satellites miss by exactly 10m if they appear to be on a collision course? Sure, some of those maneuvers will cause collisions. Which is why you make them all miss by 100m (or more as a safety margin) instead. This ensures, as a side effect, that they also avoid coming within 10m of each other.

I think this is essentially the solution mentioned in the paper.
The “paradox” here is that when one person says there’s a 70% chance that the satellites are safe, and another says there’s a 99.9% chance that they’re safe, it sounds like the second person must be much more certain about what’s going on up there. But in this case, the opposite is true.
When someone says “there’s a 99.9% chance that the satellites won’t collide,” we naturally imagine that this statement is being generated by a process that looks like “I performed a high-precision measurement of the closest approach distance, my central estimate is that there won’t be a collision, and the case where there is a collision is off in the wings of my measurement error such that it has a lingering 0.1% chance.” But the same probability estimate can be generated by a very low-precision measurement with a central estimate that there will be a collision. The former case is cause to relax; the latter is not. Yeah, in a sense this is obvious. But it’s a reminder that seeing a probability estimate isn’t a substitute for real diligence.

I imagine the conversation in the control room where they’re tracking the satellites and deciding whether to have one of them make a burn:

“What’s the problem? 99.9% chance they’re safe!”
“We’re looking at 70%.” [Gestures at all the equipment receiving data and plotting projected paths.] “Where did you pull 99.9% from?”
“Well, how often does a given pair of satellites collide? Pretty much never, right? Outside view, man, outside view!”
“You’re fired. Get out of the room and leave this to the people who have a clue.”
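To put made-up numbers on that contrast (a toy Gaussian sketch; the 10 m combined radius, the two error scales, and the helper names are values I have invented for illustration, not anything from the paper):

```python
import math

R = 10.0  # combined hard-body radius of the two satellites, in metres (made-up)

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_safe(mean_miss_m, sigma_m):
    """P(|miss distance| > R) when the miss-distance estimate is Gaussian."""
    p_collide = phi((R - mean_miss_m) / sigma_m) - phi((-R - mean_miss_m) / sigma_m)
    return 1.0 - p_collide

# High-precision tracking, central estimate comfortably clear of a collision:
print(p_safe(mean_miss_m=200.0, sigma_m=60.0))   # ~0.9995
# Terrible tracking, central estimate dead on a collision course:
print(p_safe(mean_miss_m=0.0, sigma_m=8000.0))   # ~0.999
```

Both processes report roughly “99.9% safe”, but only the first one is grounds for relaxing.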
Right, exactly. But this isn’t only about satellite tracking. A lot of the time you don’t have the luxury of comparing the high-precision estimate to the low-precision estimate. You’re only talking to the second guy, and it’s important not to take his apparent confidence at face value. Maybe this is obvious to you, but a lot of the content on this site is about explicating common errors of logic and statistics that people might fall for. I think it’s valuable.
In the satellite tracking example, the thing to do is exactly as you say: whatever the error bars on your measurements, treat that as the effective size of the satellite. If you can only resolve positions to within 100 meters, then any approach within 100 meters counts as a “collision.”
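A sketch of that screening rule, with a hypothetical k-sigma safety factor (the function name, parameters, and the factor of 3 are my own choices, not from the paper):

```python
def flag_conjunction(est_miss_m, sigma_m, hard_radius_m=10.0, k=3.0):
    """Toy screening rule: treat k-sigma of tracking error as extra effective radius."""
    return est_miss_m < hard_radius_m + k * sigma_m

# With ~100 m tracking error, anything estimated to pass within a few hundred metres
# is treated as a potential collision and triggers an avoidance manoeuvre.
print(flag_conjunction(est_miss_m=250.0, sigma_m=100.0))  # True  -> manoeuvre
print(flag_conjunction(est_miss_m=250.0, sigma_m=5.0))    # False -> leave it alone
```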
I’m also curious about the “likelihood-based sampling distribution framework” mentioned in the cited arXiv paper. The paper claims that “this alternative interpretation is not problematic,” but it seems like its interpretation of the satellite example is substantially identical to the Bayesian interpretation. The lesson to draw from the false confidence theorem is “be careful,” not “abandon all the laws of ordinary statistics in favor of an alternative conception of uncertainty.”
Maybe this is obvious to you, but a lot of the content on this site is about explicating common errors of logic and statistics that people might fall for. I think it’s valuable.
Thank you. Maybe I over-indexed on using the satellite example, but I thought it made for a better didactic example in part because it was so obvious. I provided the other examples to point to cases where I thought the error was less clear.
The lesson to draw from the false confidence theorem is “be careful,” not “abandon all the laws of ordinary statistics in favor of an alternative conception of uncertainty.”
This is also true. Like I said (maybe not very clearly), there are more or less two solutions: use non-epistemic belief to represent uncertainty, or avoid using epistemic uncertainty in probability calculations. (And you might even be able to sort of squeeze the former solution into a Bayesian representation by always reserving some of your probability mass for “something I haven’t thought of”, which I think is something Eliezer has even suggested. I haven’t thought about this part in detail.)
I didn’t look for, and so was not aware of, any larger community. I found the two linked papers and, once I realized what was going on, recognized the apparent error in a few places. I agree that “decreasing the quality of your data should not make you more confident” is obvious when stated that way, but as with many “obvious” insights, the problem often comes in recognizing it when it comes up. I attempted to point this out to Michael Weissman in one of the ACX threads (he did a Bayesian analysis of the lab leak question, similar to Rootclaim’s) and he repeatedly defended arguments of this form even after I pointed out that he was getting reasonably large Bayes factors based entirely on epistemic uncertainty.
Did you read section 2c of the paper? It seems to be saying something very similar to the point you made about the tracking uncertainty:
for a fixed S/R [relative uncertainty of the closest approach distance] ratio, there is a maximum computable epistemic probability of collision. Whether or not the two satellites are on a collision course, no matter what the data indicate, the analyst will have a minimum confidence that the two satellites will not collide. That minimum confidence is determined purely by the data quality… For example, if the uncertainty in the distance between two satellites at closest approach is ten times the combined size of the two satellites, the analyst will always compute at least a 99.5% confidence that the satellites are safe, even if, in reality, they are not…
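For what it’s worth, that 99.5% floor drops out of the standard circular-Gaussian encounter-plane model, which I take to be what the S/R ratio refers to: the epistemic collision probability is largest when the estimated closest approach is dead centre, giving P(collision) at most 1 − exp(−R²/(2S²)). A quick sketch of that check (my own function, not the paper’s code):

```python
import math

def min_confidence_safe(S_over_R):
    """Worst-case epistemic P(no collision) for a circular-Gaussian closest-approach
    estimate with standard deviation S and combined hard-body radius R."""
    max_p_collision = 1.0 - math.exp(-1.0 / (2.0 * S_over_R ** 2))
    return 1.0 - max_p_collision

print(min_confidence_safe(10.0))   # ~0.995   -- the paper's example
print(min_confidence_safe(100.0))  # ~0.99995 -- worse data, an even higher floor on "confidence"
```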
So when you say
then you must content yourself with avoiding approaches within around 100 metres, and you will be on the equivalent of the yellow line in that figure.
Is this not essentially what the confidence region approach is doing?