From a Bayesian point of view, drawing a random sample from all humans who have ever or will ever exist is just not a well-defined operation until after humanity is extinct. Trying it before then violates causality: performing it requires reliable access to information about events that have not yet happened. So that’s an invalid choice of prior.
I think this makes too many operations ill-defined, given that probability is an important tool for reasoning about events that have not yet happened. Consider, for example, the question “what is the probability that one of my grandchildren, selected uniformly at random, is female, conditional on my having at least one grandchild?”. From the perspective of this quote, a random sample from all grandchildren that will ever exist is not a well-defined operation until I and all of my children die. That seems wrong.
In that particular case, you can freely drop the “selected uniformly at random” part – the answer is also the same if we specify the first grandchild, the last, or even the closest-to-middle-in-birth-order. (Note that this would be different if there were, say, a well-documented tendency for fourth-born grandchildren to get gender changes to male.) So while you’re technically introducing a need for precognition into the problem by selecting a random grandchild, this particular problem has a property that makes the result not depend upon this precognitive data: the information from the future you introduced has no actual effect, so the answer remains well-defined.
Note that selecting randomly from hypothetical future events predicated on a current model of the future is also fine: the problem happens only when you introduce an actual dependency on unknown and unpredictable future events. Like a dependency on whether we do in fact get to colonize the stars or not that shifts the distribution by a very large factor. For causality to get violated, you need an actual information path from the future into your result.
Fun fact: younger parents tend to produce more males, so the first grandchild is more likely to be male, because its parents are more likely to be younger. Unclear whether the effect is due to birth order, maternal age, paternal age, or some combination. From Wikipedia (via Claude):
These studies suggest that the human sex ratio, both at birth and as a population matures, can vary significantly according to a large number of factors, such as paternal age, maternal age, multiple births, birth order, gestation weeks, race, parent’s health history, and parent’s psychological stress.
If that’s too subtle, we could look at a question like “what is the probability that one of my grandchildren, selected uniformly at random, is a firstborn, conditional on my having at least one grandchild?” where the answer is clearly different if we specify the first grandchild or the last. Or we could ask a question that parallels the Doomsday Argument, while being different: “what is the probability that one of my descendants, selected uniformly at random, is in the earliest 0.1% of all my descendants?”
If we have a statistical model for how many grandchildren there will be then we can consistently make a prediction in advance that includes a random sampling process across all predicted future grandchildren. Doing that doesn’t violate causality—that’s just making a prediction.
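As a sketch of what such a prediction looks like, here is a minimal Monte Carlo version (the model and its numbers are toy assumptions, not anything from the discussion): sample a predicted number of grandchildren from the model, then sample one of them uniformly, all inside the prediction.

```python
import random

def predicted_selection_prob(n_model, trials=100_000, seed=0):
    """Monte Carlo estimate of P(a uniformly selected grandchild is the
    firstborn), under a model n_model(rng) for how many grandchildren
    there will be. The sampling happens over *predicted* grandchildren,
    so no information from the future is needed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        n = rng_count = n_model(rng)   # predicted total number of grandchildren
        pick = rng.randrange(n)        # uniform sample over the predicted set
        hits += (pick == 0)            # index 0 = the firstborn
    return hits / trials

# Toy model: 1 to 4 grandchildren, equally likely.
est = predicted_selection_prob(lambda rng: rng.randint(1, 4))
```

Everything here is conditioned on the model; the answer is an ordinary prediction, and updates as the model does.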
We can consistently sample over the actual grandchildren in retrospect, once we know how many of them there were. That also doesn’t violate causality.
Anywhere between those two possibilities, we’re dealing with uncertainty and incomplete evidence, so we need to use Bayesianism (not precognition).
Suppose we can confidently predict that there will be either O(1) grandchild (because we don’t go to the stars), or O(10^6) grandchildren (because we do, and for convenience assume the latter will take a while to all be born, it’s not rapid-fire one every few hours), but we don’t have any idea of the relative chance of these two plausible outcomes (because we don’t know if the stars are owned by Grabby aliens or not). Then at this point we actually have two hypotheses:
a. There will be O(1) grandchild, so the current grandchild is very likely to turn out to be the one randomly selected (if we for some reason chose to randomly select one, maybe it was stipulated in a will or something, which we can only legitimately do once they’ve definitely all been born and we actually know what size of dice to roll)
b. There will be O(10^6) grandchildren, so the current grandchild has around a 0.0001% chance of eventually being the randomly selected one
At this point, one grandchild in, we currently don’t have enough evidence to distinguish these two hypotheses (we’ve seen one grandchild so far, and both hypotheses make the same prediction that seeing at least one grandchild is likely). So our current priors remain unchanged from whatever arbitrary priors we originally started with. If we had, say, picked the uniform prior that both of these hypotheses seem, in the absence of any evidence, about equally likely, then our current estimate of the chance of the first grandchild also being the randomly selected grandchild is ~50% (50% chance of a sure thing because we’re not going to the stars, plus 50% chance of very unlikely because we are). So, 50%: not particularly implausible. Pretty good currently estimated odds they’ll win the dice roll (though obviously it depends mostly on how Grabby those aliens may be). Nothing clearly implausibly unlikely or atypical has occurred so far. Our best guess is that the current grandchild has a reasonable chance of being pretty typical of all the grandchildren, according to our current understanding of the world.

But this still tells us absolutely nothing about the probability of b. being correct: saying “50% is a long way from 0.0001%, so we magically know in advance that b. must be wrong, so we don’t get to go to the stars” is just fallacious. That’s not how Bayesianism works. You do the calculation separately under each hypothesis, then you combine the answers weighted according to the currently estimated probability of the corresponding hypothesis being true. It would be equally fallacious to claim that “the expected number of grandchildren is currently 0.5 x 10^6 + 0.5 x 1 ~= 500,000, so the probability that the current grandchild is going to be the randomly selected one is only 0.0002%, so there are almost certainly going to be many more grandchildren, so b. must be true” (what one might call the anti-Doomsday Argument, which one hears less often, but makes just as little sense as the original). Both of those mix Frequentism and Bayesianism in an invalid way to get a nonsensical answer.
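The arithmetic in the two-hypothesis setup above can be written out directly, using the 50/50 prior stipulated in the text (all numbers are from the discussion itself):

```python
# Prior over the two hypotheses (uniform, as stipulated above).
p_a, p_b = 0.5, 0.5

# P(current grandchild ends up being the randomly selected one | hypothesis)
p_selected_given_a = 1.0   # a: O(1) grandchild, so essentially certain
p_selected_given_b = 1e-6  # b: O(10^6) grandchildren, so one in a million

# Correct Bayesian combination: compute under each hypothesis separately,
# then weight by the current probability of each hypothesis.
p_selected = p_a * p_selected_given_a + p_b * p_selected_given_b  # ~0.5

# The fallacious "anti-Doomsday" move instead averages the populations first,
# then takes one over the expected count. This is the invalid mixing of
# Frequentism and Bayesianism described above.
expected_n = p_a * 1 + p_b * 1e6     # ~500,000 expected grandchildren
p_fallacious = 1 / expected_n        # ~2e-6, i.e. ~0.0002%
```

Note that `p_selected` tells us nothing about which hypothesis is true; it is just a prediction under our current, unchanged prior.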
Seriously, just use Bayesianism. You are so much less likely to get confused by weird paradoxes if you know how it works and just use it. It’s not that complicated: you can learn it in about a day, it requires only basic arithmetic, and it’s simply a mathematical formulation of the intuitions of the Scientific Method, which most people learnt in school. This really ought to be taught as high school logic, but it isn’t.
I was already asking from a Bayesian perspective. I was asking about this quote:
From a Bayesian point of view, drawing a random sample from all humans who have ever or will ever exist is just not a well-defined operation until after humanity is extinct. Trying it before then violates causality: performing it requires reliable access to information about events that have not yet happened. So that’s an invalid choice of prior.
Based on your latest comment, I think you’re saying that it’s okay to have a Bayesian prediction of possible futures, and to use that to make predictions about the properties of a random sample from all humans who have ever or will ever exist. But then I don’t know what you’re saying in the quoted sentences.
Edited to add: which is fine, it’s not key to your overall argument.
Yes, performing a predicted random sample over predicted future humans, according to some model or some Bayesian distribution of models, is fine; though in the Bayesian case, if you have large uncertainty within your hypothesis distribution about how many there will be, that uncertainty will dominate the results. What breaks causality is attempting to perform an actual random sample over the actual eventual number of future humans before that information is available, and then using frequentist typicality arguments based on that hypothetical invalid sampling process to try to smuggle information from the future into updating your hypothesis distribution.