Yes, I get it, I’m very ignorant. (If you needed to get that off your chest, you could perhaps have said it directly in one sentence, rather than spending 10000 words patiently implying it.) But you’re still handwaving the interesting parts.
Obviously “I am in the first 10% of people” is a prediction; I already agreed to rephrase it as “I will eventually turn out to have been in the first 10% of people”. I’m not trying to deduce anything from the fact that it ‘sounds implausible’, and I’m not trying to bring any information back in time from the moment it turns out to be true or false in my case. I’m noting that it will definitely turn out to be false from the perspective of 90% of people who ever live, and asking why *this* fact is obviously irrelevant to the credence I should give it.
The answer is not “bayesianism, obviously”. Bostrom, even back when he was writing about this stuff, was not a heathen frequentist, and he wasn’t as stupid as me. (I’m pretty sure he’d even heard of causality.)
I am very sorry. I have clearly upset you, which was not my intention. I apologize.
Having carefully reread our entire thread, with some help from Claude: I’m afraid I was interleaving talking to you with multiple other people who were mostly asking Bayesianism-101 questions, and I reverted to lecturing mode. You were asking something more complex, and I was puzzled by what, given that the conversation had started with me agreeing with you; so I resorted to ever longer and more basic lecturing explanations in the hope they would cover whatever you were asking about, since I couldn’t identify the point of disagreement. I’m now going to go back and reread it again, and see if I can figure out what you were actually asking and whether I in fact have an answer.
Bostrom, even back when he was writing about this stuff, was not a heathen frequentist, and he wasn’t as stupid as me.
Until you mentioned this and I went and did some research, I was unaware that Nick Bostrom had written about anthropic reasoning and the Doomsday Argument; I’ve only read his later book, Superintelligence. If what Claude is now telling me is correct, Bostrom analyzed the Doomsday Argument and raised some possible objections to it, but not, Claude tells me, the causality-based one I’ve made here. However, since all I know of Bostrom’s writing on the subject is a short summary from an LLM, I’m really not in a position to comment on whether, or why, he didn’t reach the conclusion that seems rather obvious to me: that if you translate the Doomsday Argument into a Bayesian framework, it clearly violates causality and is thus a fallacy.
Claude summarized Bostrom for me roughly like this:
Bostrom suggests two possible viewpoints:
Self-Sampling Assumption — Bostrom’s term. It’s the principle that “you should reason as if you’re a random sample from the set of all observers in your reference class.” It’s one of the two competing frameworks Bostrom laid out, the other being SIA (Self-Indication Assumption).
Self-Indication Assumption: “You should reason as if your existence is more likely under hypotheses that predict more observers.” In other words, the mere fact that you exist is evidence favoring hypotheses with larger populations.
The first, he suggests, implies the Doomsday Argument; the second implies its inverse, that Doom is very unlikely. (I don’t know what people call this, so “the No-Doomsday Argument” will have to do.)
For the sake of argument, I’m going to assume Claude has this summary roughly right, rather than buying Bostrom’s book and reading it to double-check. (So yes, I am choosing not to go read a book on the subject, and I am aware of the irony in that choice.)
Of those, I agree with the first one, EXCEPT that I think the definition of the reference class has to respect causality and everything we actually know (and nothing we don’t know), because Bayesianism is always about P(X | everything I know) — which was the entire point of my joke. So I cannot validly define a reference class to reason probabilistically over, as if someone were sampling from it, that includes observers in the future (or indeed ones in parallel universes, or on alien worlds, or whatever) whose existence I am unable to predict with any accuracy, because my prediction of whether they exist varies significantly across different hypotheses to each of which I still assign a prior significantly greater than zero. To give another example, “all sapient observers in the Milky Way galaxy during the first 13.8 billion years or so, specifically in the backward light-cone of Earth now” is also an invalid reference class, even though by construction it carefully lies in our past and so doesn’t breach causality: it assumes information we don’t have about how often life arises and evolves to sapience, i.e. some of the terms in the Drake Equation that we’re wildly uncertain about, because we have a sample size of 1, and that sample has to be discarded due to selection effects, since we’re here to observe it. (The lack of visible Dyson swarms, obvious signals, or alien delegations or invaders suggests some upper bounds on the Drake Equation’s terms, but they don’t constrain it very tightly, and they only impose a maximum, not a minimum.)
In fact, for the Doomsday Argument at this particular point in history, with the current unclear existential-risk level, my current prior is pretty much still my initial prior, i.e. I don’t have a clue; meanwhile, the sizes of the reference classes proposed by the different hypotheses involved in the Doomsday Argument differ by many orders of magnitude between hypotheses to all of which I assign significant current priors, including the ones I earlier called “Doom” and “Stars” (and others, such as “stays on Earth for another few million years”). So I’m currently about as unsure of the size of the reference class as it’s possible to be, and thus the probability of getting something like my birth-order number, were you to sample over the reference class, is just wildly uncertain. I therefore have to do that reasoning process separately, conditioned on each of the two-or-more hypotheses, and combine the results in the standard Bayesian way. Which means that the results can’t affect the priors, since each calculation had to assume the corresponding hypothesis. There is an X% chance that something unlikely-sounding (given the size of the reference class under one hypothesis) has happened, and a (100−X)% chance that it hasn’t (given the vastly different size of the reference class under the other hypothesis), but that doesn’t provide any evidence I can update X on: fundamentally, I still don’t know the size of the reference class, and if I tried to reason as if I did, I would be smuggling precognition into my argument.
So basically, in this case, where we don’t have a clue, I accept “all humans who have lived up until this point, or will be born in the next year or two” as a valid reference class to reason probabilistically over (i.e. “reason just like a frequentist would”), because I already have reasonably firm evidence that they have in fact existed, or are very likely to exist, so assuming that in the probabilistic reasoning isn’t going to mess things up. However, given just how uncertain I currently am about what’s going to happen more than a few years from now, since there appears likely to be a Singularity in our near future, I consider “all humans who have lived up until this point, or will ever live” an invalid reference class: it smuggles the results of precognition into the argument if I reason probabilistically over it (without conditioning that reasoning separately, in Bayesian fashion, on the different hypotheses that control the size of the reference class — and if I do that, then there’s no update to the prior). So I can’t use that as a reference class in SSA.
So that’s what I think about what Claude tells me Bostrom said: I disagree with both the positions I’m told he outlined as alternatives, but I’m closer to SSA with a causal modification. My position is the one under which the Doomsday Argument and the “no-Doomsday Argument” are both fallacies. Because they both obviously have to be fallacies.
Note that my version of the SSA isn’t actually that useful: it’s basically a way of doing an internal consistency check on the hypothesis you believe. If you do it and it says “by your hypothesis, a huge coincidence has occurred”, that suggests it might be a good idea to go looking for a new hypothesis that fits all your observed facts equally well without making some aspect of them a huge coincidence. But if, say, 30 bits of coincidence have occurred under your current hypothesis (so a one-in-a-billion fluke), that only supports 30 bits of additional hypothesis complexity to eliminate the coincidence — which isn’t much: basically a few words, or a smallish equation. Otherwise Occam’s Razor (minimizing Kolmogorov complexity) still wins.
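The bit-counting here is easy to make concrete. Here is a minimal sketch; the one-in-a-billion figure comes from the text, while the 50-bit complexity cost of the rival hypothesis is a made-up number purely for illustration:

```python
import math

def bits_of_surprise(p):
    """Surprisal of an event of probability p, in bits: -log2(p)."""
    return -math.log2(p)

# A one-in-a-billion fluke under your current hypothesis:
coincidence_bits = bits_of_surprise(1e-9)

# A rival hypothesis that removes the coincidence only wins if its extra
# description length (roughly: Kolmogorov complexity of the added words
# or equation terms) is smaller than the coincidence it explains away.
extra_complexity_bits = 50  # hypothetical: a fairly elaborate epicycle

print(round(coincidence_bits, 1))                 # ~29.9 bits
print(extra_complexity_bits < coincidence_bits)   # False: Occam still wins
```

The point of the comparison: a billion-to-one fluke licenses only about 30 bits of extra machinery in a replacement hypothesis, which is a very small budget.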
I find that I am still explaining things in painful detail — fundamentally, because I’m not sure what question you’re asking by raising Bostrom.
You said:
Let’s suppose we’re not dualists. Then, if a million bazillion people exist(/have existed/will exist), and one of them is ‘me’, and I happen to have a bunch of unusual properties, there’s literally no coincidence to be explained: there aren’t two separate facts here, ‘person X is highly atypical’ and ‘I am person X’, that are surprising in conjunction. There’s just the fact that all of these people exist (or have existed, or will exist), and they’re all seeing the world from their own perspectives, and inevitably one of them is seeing the world from person X’s perspective, and from that perspective ‘I’ refers to person X, and (given the existence of all those people, including person X) there’s no way things could have been otherwise.
To which I agreed:
You’re making the same point that I’m attempting to make in the second paragraph of my footnote: that doing a random drawing over all people ever is an invalid prior (until we’re extinct). As you say, the line of thinking makes no sense in the first place: it’s an invalid assumption, because it breaks causality: it’s assuming we know what will happen in the future when we actually have no more than a clue.
To put this yet another way:
P(there exists a 100-billionth person born | there will only ever be 101 billion people born) = 1

and

P(there exists a 100-billionth person born | there will eventually be quintillions of people born) = 1
so we can deduce exactly nothing about how many people will be born after us from the simple observation that we were born roughly 100-billionth, and thus that people #1 through #100,000,000,000 have all existed. Under the quintillions hypothesis, it will later turn out, in retrospect, that all of us born so far were in some sense very atypical: yes, someone had to be 100-billionth, but that’s still vastly closer to the start of the birth order than to the end, if the end is in the quintillions. But we don’t currently know that, and if it later turns out to be the case, so what? Someone had to be 100-billionth; whether that’s rather near the end or astoundingly near the beginning, it occurs with probability 1 under both hypotheses, so there is no Bayesian update between the hypotheses when it happens.
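The no-update point can be checked mechanically: if an observation has probability 1 under every competing hypothesis, Bayes’ rule returns the prior unchanged. A minimal sketch (the 50/50 priors and the hypothesis names are hypothetical placeholders, not a claim about the real probabilities):

```python
def posterior(priors, likelihoods):
    """Bayes' rule over competing hypotheses: P(H|E) ∝ P(H) * P(E|H)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

priors = {"doom_soon": 0.5, "quintillions": 0.5}  # hypothetical priors

# Observation: "a 100-billionth person exists". Probability 1 under both
# hypotheses, so the posterior equals the prior: no update.
likelihoods = {"doom_soon": 1.0, "quintillions": 1.0}
print(posterior(priors, likelihoods))
# → {'doom_soon': 0.5, 'quintillions': 0.5}

# The Doomsday Argument instead conditions on "MY rank is 100-billionth
# out of N total people", with likelihood 1/N under each hypothesis --
# but that presupposes sampling from a reference class whose size N is
# exactly the unknown in dispute.
```

The contrast in the final comment is the crux: the update only appears once you assume you were drawn uniformly from a completed reference class, which is the step being rejected.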
Similarly, if a very shortsighted ant is walking along a ruler and reaches the 1mm line, that tells it nothing about whether this is a 2mm ruler or a 10km ruler: it can’t see that far, so it has no evidence yet. It only shows the ruler is at least 1mm long, plus the small distance the ant can see ahead (in chronological terms: as far as we can predict with any accuracy). You can’t make deductions about how much bigger the reference class might be beyond the members you already know about.
I believe I am simply restating your argument here, since I agree with you.
Claude tells me that the essence of what you’ve been trying to ask me is:
I’m noting that “I will turn out to be in the first 10% of people who ever lived” will definitely turn out to be false from the perspective of 90% of people who ever live, and asking why this fact is obviously irrelevant to the credence I should give it.
You’re not allowed to use that fact because you don’t actually know if you’re in the 10% or the 90%. In the absence of that information, you don’t get to make a Bayesian update. To do Bayesian arguments correctly, you need to respect causality, and only use information you actually currently have access to.
Claude interpreted your question as being about base-rate reasoning, though when asked it admitted that you didn’t use the phrase; I’m going to assume it was correct. Base rates are normally a valid way to set your Bayesian prior: if you know “the base rate for people having disease X is Y%”, then a reasonable initial prior that a particular patient has that disease, in the absence of patient-specific evidence, is Y%. But for Doomsday, we have no information about what the base rate of species inventing AI and surviving is. We have performed the experiment zero times so far, so we currently have no good way to set a prior. So any argument that suggests we do has to be a fallacy.
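Here is what legitimate base-rate reasoning looks like when it works; all the disease and test numbers below are hypothetical, chosen only to illustrate the mechanics. The Doomsday case fails at the very first line, because there is no observed base rate to plug in:

```python
def update_on_positive_test(prior, p_pos_given_disease, p_pos_given_healthy):
    """One Bayesian update of P(disease) on a positive test result."""
    p_pos = prior * p_pos_given_disease + (1 - prior) * p_pos_given_healthy
    return prior * p_pos_given_disease / p_pos

base_rate = 0.01             # hypothetical: 1% of patients have disease X
sensitivity = 0.90           # hypothetical test characteristics
false_positive_rate = 0.05

print(round(update_on_positive_test(base_rate, sensitivity,
                                    false_positive_rate), 3))
# → 0.154: the base rate supplies the prior that the evidence then updates.

# For Doomsday, the analogous first line would be
#   base_rate = <fraction of species that survive inventing AI>
# and we have zero observed trials, so there is nothing to put there.
```

The base rate earns its place as a prior because it summarizes many observed trials; with zero trials, there is no analogous number to start from.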
In particular, using what Bostrom calls the Self-Sampling Assumption with a reference class whose size we don’t know, in order to make a deduction about its size, is an invalid circular argument: it assumes you know the answer to the very question you’re trying to answer, in breach of causality. Yes, there will eventually be a last human alive, or an AI, or an alien archeologist, who will know the answer, and will be able to tell, for any individual human, including you and me, whether the statement “I will turn out to be in the first 10% of people who ever lived” was true or false for them. It’s a statement that will eventually have a truth value knowable at reasonable computational cost, but that doesn’t yet have one (short of running a quantum simulation of the entire Earth and everything causally connected to it faster than real time, which as far as we know is physically impossible, and which we’re certainly in no position to do). In particular, it’s a statement that will turn out to have been true for everybody born before some point in time, and false for everybody born after it. So the “base rate” is initially 100%, dropping to 0% at some time we don’t yet know. But since I don’t yet have that information, I can’t, in Bayesianism, update my priors now on information that will only exist in the future (at less than a vastly unreasonable computational cost that I haven’t paid) and that I therefore have no way to access yet. Bayesianism is about how to update your priors when you learn new information, and the Doomsday Argument is the Bayesian equivalent of trying to lift yourself up by your own bootstraps in the absence of any new information.
(I remain puzzled why this wasn’t obvious to Bostrom, assuming he’s as familiar with Bayesianism as Claude makes it sound. But then, I’m puzzled why anyone falls for the Doomsday Argument: as soon as you notice it breaks causality, it seems obvious to me that it has to be a fallacy, and the question then is where the flaw is; the answer is an invalid choice of reference class. Or maybe I’m just a physicist who has had causal thinking drummed into me, though to be fair I’ve seen plenty of physicists abuse anthropic reasoning too, sometimes in acausal ways. I even wrote a joke about this.)
Was any of that a successful answer to the question you’ve been trying to ask me? Even after rereading our conversation carefully, I have to admit I’m still unclear what you’re actually asking, or where we actually differ, if anywhere — so much so that I’m resorting to asking an LLM to tell me. If none of that is an answer, then please accept my apologies for being confused. Please assume that I haven’t read Bostrom’s book on the subject; that as far as I can tell we agree with each other; that I have now (very belatedly, for which I again apologize) figured out that you are familiar with Bayesianism, and that you simply don’t find obvious something that seems obvious to me about how to apply it correctly; and that, if I still haven’t managed to answer your question despite multiple attempts, I still have no clue what that specific thing is. Then please try to explain again, more slowly, what exactly you are asking me to clarify about my position. It’s entirely possible that I’m the ignorant one here: I’m certainly puzzled as to what, if anything, we disagree about, or why.
Alternatively, if you simply want to drop this rather long conversation here, then please feel entirely free. I’ve already upset you once, and I most certainly don’t want to do so again.