Ok. I think part of the issue [ETA: with our mutual understanding of each other, not with you] is that you’re focused on the “You’re lying” part of the conversation.
I’m considering it in the context of this: “My observations are always fallible, and if you make an event improbable enough, why shouldn’t I be skeptical even if I think I observed it?”
Granted, his observations have N bits of information (at least), the same as the situation with cheating, and it’s at least as improbable that he’d observe a given sequence of length N when something else entirely happened as it is that the given sequence of length N itself happened, so in practice, it’s still -certainly- more likely that he actually observed the observation he observed.
The paradox isn’t there. The paradox is that we would, in fact, find some sequences unbelievable, even though they’re exactly as likely as every other sequence. If the sequence was all heads 100 times in a row, for instance, that would be unbelievable, even though a sequence of pure heads is exactly as likely as any other sequence.
The paradox is in the fact that the sequence is left unspecified, and for some sequences, we’d be inclined to side with Alice, and for other sequences, we’d be inclined to side with Bob, even though all possible sequences of the same length are equally likely.
ETA:
This is what I was getting at with the difference between the reference classes of “distinguished” and “undistinguished”.
You should. You should be aware that you might e.g. have made a mistake and slightly misremembered (or miscopied, etc.) the results of the coin flips, for instance.
We might say that. We might even think it. But what we ought to mean is that we find other explanations more plausible than chance in those cases. If you flip a coin 100 times and get random-looking results: sure, those particular results are very improbable, but very improbable things happen all the time (as in fact you can demonstrate by flipping a coin 100 times). What you should generally be looking at is not probabilities but odds. That random-looking sequence is neither much more nor much less likely than any other random-looking sequence of 100 coin-flips, so the fact that it’s improbable doesn’t give you reason to disbelieve it—you don’t have a better rival hypothesis.

But if you flip all heads, suddenly there are higher-probability alternatives. Not because all-heads is especially unlikely by chance, but because it’s especially likely by not-chance. Maybe the coin is double-headed. Maybe it’s weighted in some clever way[1]. Maybe you’re hallucinating or dreaming. Maybe some god is having a laugh. All these things are (so at least it seems) much more likely to produce all-heads than a random-looking sequence.
[1] I think I recall seeing an analysis somewhere that found that actually weighting a coin can’t bias its results much.
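gjm's odds-vs-probabilities point can be sketched numerically. The one-in-a-million prior for a double-headed coin below is an invented number, chosen purely for illustration:

```python
from fractions import Fraction

N = 100  # number of coin flips

# Likelihoods of the observed sequence under each hypothesis.
p_seq_given_fair = Fraction(1, 2**N)       # any specific sequence: 2^-100
p_allheads_given_twoheaded = Fraction(1)   # a double-headed coin always gives heads
p_random_given_twoheaded = Fraction(0)     # ...and never gives a mixed sequence

# Hypothetical prior: one coin in a million is double-headed (made-up number).
prior_twoheaded = Fraction(1, 10**6)
prior_fair = 1 - prior_twoheaded

# Posterior odds of "double-headed" vs "fair chance" after seeing all heads:
odds_allheads = (prior_twoheaded * p_allheads_given_twoheaded) / (prior_fair * p_seq_given_fair)

# After seeing a random-looking sequence, the double-headed hypothesis is dead:
odds_random = (prior_twoheaded * p_random_given_twoheaded) / (prior_fair * p_seq_given_fair)

print(odds_allheads > 10**20)  # True: overwhelmingly favors the double-headed coin
print(odds_random == 0)        # True: chance remains the only live hypothesis
```

The improbability of the sequence itself never changes; only the likelihood under the rival hypothesis does, and that is what moves the odds.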
Which is, I think, what is interesting about this: All-heads is no more improbable than any other random sequence, but in the case of an all-heads sequence, suddenly we start looking for laughing gods, hallucinations, or dreams as an explanation.
Which is to say, the interesting thing here is that we’d start looking for explanations of an all-heads sequence, even though it’s no more improbable than any other sequence.
No—not “suddenly we start looking for”. Suddenly those are better explanations than if the sequence of coin flips had been random-looking.
Like gods having a laugh?
You didn’t, and wouldn’t, leap into the better explanations. You leapt fully into any explanation except chance, without regard for whether or not it was a better explanation.
Gods having a laugh aren’t something you even think of if you aren’t looking for an explanation.
Gods having a laugh are a pretty terrible explanation for anything, and their inclusion here was mostly gjm having a laugh.
The borderline between “suddenly we start looking for a better explanation” and “suddenly better explanations start occurring to us” is an extremely fuzzy one. My reason for preferring the latter framing is that what’s changed isn’t that randomness has become worse at explaining our observations, but that some non-random explanation has got better.
One is a very good mathematical explanation.
The other is why “Gods having a laugh” would actually cross your mind. You include that as a joke because it rings true.
My apologies for being dim: what are “one” and “the other” here?
How do you know why I did it? (I say: you don’t know why I did it, you’re just pretending to. That’s as rude as it is foolish.)
Suddenly looking for explanations, versus explanations suddenly begin occurring to us.
Because of how humor works. It depends upon a shared/common experience. You not only expect to think of gods laughing at you, in that situation—because you’ve thought of exactly that in similar weird circumstances in your life—you expect me to think of gods laughing at me, in that situation. (And gods laughing at me would, in fact, be something I considered given a long-enough sequence of all-Heads, so the joke didn’t fall flat. I’ve thought of some equivalent of gods laughing at me for far less unusual coincidences, after all.)
I didn’t need you to tell me it was a joke, however. I knew that explanation would occur to you in the real world before you ever mentioned it—because 100 heads in a row would be, quite simply, unbelievable, and any sane person would be questioning -everything- in lieu of believing it happened by chance—even though any other random sequence is just as unlikely. It’s just how our brains work.
OK, that’s what I first thought. But then I can’t make sense of what you say about these: “One is a very good mathematical explanation” and “the other is why ‘Gods having a laugh’ would actually cross your mind”. From the “actually” in the second, it seems as if you’re endorsing that one, in which case presumably “a very good mathematical explanation” is intended as a criticism. Which doesn’t make any sense to me.
But your analysis on the basis of “how humor works” doesn’t give any reason at all for any preference between “suddenly start looking for explanations” and “explanations start occurring to us”. It hinges only on the fact that, one way or another, many people in such a weird situation would start considering hypotheses like “gods messing with us” even if they have previously been very sure that no gods exist.
That may very well be correct. But in so far as we frame that as “I don’t believe X happened because its probability is very low”, all that indicates is that we intuitively think about probabilities (or at least express our thinking about probabilities) wrongly. The thing that triggers such thoughts is the occurrence of a low-probability event that feels like it should have a better explanation, even if the thought we’re then inclined to think doesn’t explicitly have that last bit in it.
(It’s not necessarily that concrete better explanations occur to us. It’s that we have a heuristic that tells us there should be one. What I wrote before kinda equivocates between those, for which I apologize; I am not sure which I had in mind, but what I endorse after further thought is the latter, together with the observation that what makes this heuristic useful is the fact that its telling us “there should be a better explanation” correlates with there actually being one.)
I was implying that it is a rationalization. Perhaps a fair one—I have no ready counterargument available—but not the real reason for the behavior.
Yes! Exactly. And moreover—that heuristic is, as you say, useful. What is the heuristic measuring, and why?
Skipping ahead a bit: The ability to notice which improbable things require explanations is, perhaps, the heart of scientific progress (think of data mining—why can’t we just run a data mining rig and discover the fundamental equations of the universe? I’d bet all the necessary data already exists to improve our understanding of reality by as much again as the difference between Newtonian and Relativistic understandings of reality). Why does it work, and how can we make it work better?
It’s no more probable under the null hypothesis, but much more probable under more probable than average alternative hypotheses.
Such as gods interfering with our lives?
Imagine, for a moment, you’ve ruled out all of the probable explanations. Are you still going to be looking for an alternative explanation, or will you accept that it’s chance?
Or the coin being rigged, or some other cheating or “non-random” effect in the situation. Or delusional recollection of events.
How did I “rule out” the alternatives? When I imagine myself doing that, I imagine myself reasoning poorly. I go by Jaynes’ policy of having a catch-all “something I don’t understand” hypothesis for multiple hypothesis testing. In this case, it would be “some agent action I can’t detect or don’t understand the mechanism of”. How did I rule that out?
Suppose it’s 1,000,000 coin flips, all heads. The probability of that is pretty damn low, and much much lower than my estimates for the alternatives, including the “something else” hypothesis. You can make some of that up with a sampling argument about all the “coin flip alternatives” one sees in a day, but that only takes you so far.
I don’t see how I would ever be confident that 1,000,000 flips came up all heads with “fair” coin flipping.
It’s a fair coin. It just has two heads on it.
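The arithmetic behind the 1,000,000-flip intuition can be sketched in log-odds. The 10^-50 prior for “something else” is a deliberately tiny, invented stand-in for trickery, misperception, or an undetected agent:

```python
import math

N = 1_000_000

# log10 probability of any one specific sequence of N fair flips:
log10_p_chance = N * math.log10(0.5)   # about -301030

# A deliberately tiny, made-up prior for the "something else" hypothesis:
log10_p_something_else = -50.0

# Odds in favor of "something else" over fair chance, in orders of magnitude:
log10_odds = log10_p_something_else - log10_p_chance
print(f"{log10_odds:.0f}")  # roughly 300980
```

No sampling argument about how many “coin flip alternatives” one sees in a day can close a gap of hundreds of thousands of orders of magnitude, which is the point being made above.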
The probability of any specific sequence of 1M coin flips is “pretty damn low” in the same sense. The relevant thing here is not that that probability is low when they’re all heads, but that the probability of some varieties of “something else” is very large, relative to that low probability. Or, more precisely, what sets us thinking of “something else” hypotheses is some (unknown) heuristic that tells us that it looks like the probability of “something else” should be much bigger than the probability of chance.
(I guess the heuristic looks for excessive predictability. As a special case it will tend to notice things like regular repetition and copies of other sequences you’re familiar with.)
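One crude way to cash out “excessive predictability” as a computable heuristic (an illustration, not a claim about how the brain does it) is compressibility: structured sequences compress well, random-looking ones give the compressor nothing to exploit.

```python
import random
import zlib

def compressed_size(seq: str) -> int:
    """Size in bytes of the zlib-compressed sequence."""
    return len(zlib.compress(seq.encode()))

random.seed(0)
all_heads = "H" * 1000
random_seq = "".join(random.choice("HT") for _ in range(1000))

# All-heads is maximally predictable, so it compresses to almost nothing;
# the random-looking sequence retains most of its length.
print(compressed_size(all_heads) < compressed_size(random_seq))  # True
```

Regular repetition and copies of familiar sequences are exactly the cases a compressor (or the conjectured heuristic) flags as “too predictable for chance”.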
It is not true that overall all sequences are equally likely. The probability of a certain sequence is the probability that it would happen by chance added to the probability that it would happen by not-chance. As gjm said in his comment, the chance part is equal, but the non-chance part is not. So there is no reason why the total probability of all sequences would be equal. The total probability of a sequence of 100 heads is higher than that of most other sequences. For example, there is the non-chance method of just talking about a sequence without actually getting it. We’re doing that now, and note that we’re talking about the sequence of all heads. That was far more likely, given this method of choosing a sequence, than any individual random-looking sequence.
(But you are right that it is no more improbable than other sequences. It is less improbable overall, and that is precisely why we start looking for another explanation.)
No, that’s a very good reason to start looking for another explanation, but somebody with no understanding of Bayes’ Rule at all would do exactly the same thing. If somebody else would engage in exactly the same behavior with a radically different explanation for that behavior, given a particular stimulus—consider the possibility that your explanation for your behavior is not the real reason for your behavior.
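The “total probability” argument a few comments up can be made concrete with a toy mixture model. All the numbers here are invented; the point is only that a not-chance process concentrates its mass on distinguished sequences, so the overall probability P(chance)·P(seq|chance) + P(not-chance)·P(seq|not-chance) comes out higher for all-heads:

```python
from fractions import Fraction

N = 10  # short sequences, so the numbers stay readable
p_chance = Fraction(99, 100)      # made-up prior: the flips are fair
p_not_chance = Fraction(1, 100)   # made-up prior: some trick is in play

def total_probability(seq: str) -> Fraction:
    by_chance = Fraction(1, 2**N)  # every specific sequence equally likely by chance
    # Toy not-chance model: tricks (two-headed coins, pranks) produce
    # all-heads half the time and spread the rest over everything else.
    if seq == "H" * N:
        by_trick = Fraction(1, 2)
    else:
        by_trick = Fraction(1, 2) / (2**N - 1)
    return p_chance * by_chance + p_not_chance * by_trick

print(total_probability("H" * N) > total_probability("HTHHTHTTHT"))  # True
```

The chance term is identical for both sequences; the entire difference in total probability comes from the not-chance term, which is the claim being made.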