cupholder comments on Avoiding doomsday: a “proof” of the self-indication assumption

cupholder 7 Apr 2010 4:40 UTC
0 points
Saw this come up in Recent Comments, taking the opportunity to simultaneously test the image markup and confirm Academian’s Bayesian answer using boring old frequentist probability. Hope this isn’t too wide… (Edit: yup, too wide. Here’s a smaller-albeit-busier-looking version.)
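
The frequentist count can be sketched as a short simulation. This is a minimal illustration, assuming the setup of situation B in the top-level post (100 rooms, 99 blue doors and 1 red; heads kills the red-door occupant, tails kills the blue-door occupants). The function name, seed, and trial count are illustrative and not taken from the original comment.

```python
import random

def blue_fraction_among_survivals(trials, seed=0):
    """Repeat the experiment; among runs where 'you' survive, count blue doors."""
    rng = random.Random(seed)
    survived = survived_blue = 0
    for _ in range(trials):
        blue = rng.randrange(100) != 0       # 'you' get the red door 1 time in 100
        heads = rng.random() < 0.5           # fair coin
        alive = blue if heads else not blue  # heads kills red, tails kills blue
        if alive:
            survived += 1
            survived_blue += blue
    return survived_blue / survived

estimate = blue_fraction_among_survivals(200_000)
print(estimate)  # close to 0.99
```

Here 'you' are fixed before the coin flip and only the runs in which 'you' survive are counted.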
What links here?
- Academian's comment on Avoiding doomsday: a “proof” of the self-indication assumption by Stuart_Armstrong (16 Apr 2010 10:16 UTC; 2 points)
- Academian's comment on Avoiding doomsday: a “proof” of the self-indication assumption by Stuart_Armstrong (7 Apr 2010 11:49 UTC; 0 points)
- Mallah 7 Apr 2010 13:35 UTC
  −3 points
  Parent
  Cupholder:
  
  That is an excellent illustration … of the many-worlds (or many-trials) case. Frequentist counting works fine for repeated situations.
  
  The one-shot case requires Bayesian thinking, not frequentist. The answer I gave is the correct one, because observers do not gain any information about whether the coin was heads or tails. The number of observers that see each result is not the same, but the only observers that actually see any result afterwards are the ones in either heads-world or tails-world; you can’t count them all as if they all exist.
  
  It would probably be easier for you to understand an equivalent situation: instead of a coin flip, we will use the 1 millionth digit of pi in binary notation. There is only one actual answer, but assume we don’t have the math skills and resources to calculate it, so we use Bayesian subjective probability.
  - JGWeissman 8 Apr 2010 4:57 UTC
    1 point
    Parent
    The one-shot case requires Bayesian thinking, not frequentist.
    
    Cupholder managed to find an analogous problem in which the Bayesian subjective probabilities mapped to the same values as frequentist probabilities, so that the frequentist approach really gives the same answer. Yes, it would be nice to just accept subjective probabilities so you don’t have to do that, but the answer Cupholder gave is correct.
    
    The analysis you label “Bayesian”, on the other hand, is incorrect. After you notice that you have survived the killing, you should update your probability that the coin showed tails to
    
    p(tails|survival) = p(tails) * p(survival|tails) / p(survival) = .5 * .01 / .5 = .01
    so you can then calculate
    
    "P(red|after)" = p(heads|survival) * "p(red|heads)" + p(tails|survival) * "p(red|tails)" = .99 * 0 + .01 * 1 = .01
    Or, as Academian suggested, you could have just updated to directly find
    
    p(red|survival) = p(red) * p(survival|red) / p(survival)
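
Both routes in this comment can be checked with exact arithmetic. This is a minimal sketch, assuming the 100-door setup of the top-level post (99 blue doors, 1 red; heads kills the red-door occupant, tails kills the blue-door occupants); the variable names are illustrative.

```python
from fractions import Fraction

p_heads = p_tails = Fraction(1, 2)
p_red, p_blue = Fraction(1, 100), Fraction(99, 100)

# Tails kills the blue-door occupants, so you survive tails only behind the red door.
p_surv_given_tails = p_red
p_surv_given_heads = p_blue
p_surv = p_heads * p_surv_given_heads + p_tails * p_surv_given_tails

# Route 1: update on the coin first.
p_tails_given_surv = p_tails * p_surv_given_tails / p_surv

# Route 2: update on the door colour directly; the red-door occupant survives only on tails.
p_surv_given_red = p_tails
p_red_given_surv = p_red * p_surv_given_red / p_surv

print(p_surv, p_tails_given_surv, p_red_given_surv)  # 1/2 1/100 1/100
```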
  - cupholder 8 Apr 2010 3:48 UTC
    0 points
    Parent
    
    The one-shot case requires Bayesian thinking, not frequentist.
    
    I disagree, but I am inclined to disagree by default: one of the themes that motivates me to post here is the idea that frequentist calculations are typically able to give precisely the same answer as Bayesian calculations.
    
    I also see no trouble with wearing my frequentist hat when thinking about single coin flips: I can still reason that if I flipped a fair coin arbitrarily many times, the relative frequency of a head converges almost surely to one half, and that relative frequency represents my chance of getting a head on a single flip.
    
    The answer I gave is the correct one, because observers do not gain any information about whether the coin was heads or tails.
    
    I believe that the observers who survive would. To clarify my thinking on this, I considered doing this experiment with a trillion doors, where one of the doors is again red, and all of the others blue. Let’s say I survive this huge version of the experiment.
    
    As a survivor, I know I was almost certainly behind a blue door to start with. Hence a tail would have implied my death with near certainty. Yet I’m not dead, so it is extremely unlikely that I got tails. That means I almost certainly got heads. I have gained information about the coin flip.
    
    The number of observers that see each result is not the same, but the only observers that actually see any result afterwards are the ones in either heads-world or tails-world; you can’t count them all as if they all exist.
    
    I think talking about ‘observers’ might be muddling the issue here. We could talk instead about creatures that don’t understand the experiment, and the result would be the same. Say we have two Petri dishes, one dish containing a single bacterium, and the other containing a trillion. We randomly select one of the bacteria (representing me in the original door experiment) to stain with a dye. We flip a coin: if it’s heads, we kill the lone bacterium, otherwise we put the trillion-bacteria dish into an autoclave and kill all of those bacteria. Given that the stained bacterium survives the process, it is far more likely that it was in the trillion-bacteria dish, so it is far more likely that the coin came up heads.
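
The Petri dish version can be worked out exactly for any dish size. This is a sketch under the assumptions stated in the paragraph above (the stained bacterium is drawn uniformly from all bacteria before the flip; heads kills the lone bacterium, tails kills the large dish); the function name is made up for illustration.

```python
from fractions import Fraction

def p_heads_given_stained_survives(n_big):
    """Posterior probability of heads, given that the pre-stained bacterium survives."""
    p_in_big = Fraction(n_big, n_big + 1)  # chance the stained one is in the large dish
    p_survive_heads = p_in_big             # heads spares the large dish
    p_survive_tails = 1 - p_in_big         # tails spares the lone bacterium
    p_survive = (p_survive_heads + p_survive_tails) / 2
    return (p_survive_heads / 2) / p_survive

print(p_heads_given_stained_survives(10**12))  # 1000000000000/1000000000001
```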
    
    It would probably be easier for you to understand an equivalent situation: instead of a coin flip, we will use the 1 millionth digit of pi in binary notation.
    
    I don’t think of the pi digit process as equivalent. Say I interpret ‘pi’s millionth bit is 0’ as heads, and ‘pi’s millionth bit is 1’ as tails. If I repeat the door experiment many times using pi’s millionth bit, whoever is behind the red door must die, and whoever’s behind the blue doors must survive. And that is going to be the case whether I ‘have the math skills and resources to calculate’ the bit or not. But it’s not going to be the case if I flip fair coins, at least as flipping a fair coin is generally understood in this kind of context.
    - JGWeissman 8 Apr 2010 4:32 UTC
      1 point
      Parent
      
      If I repeat the door experiment many times using pi’s millionth bit, whoever is behind the red door must die, and whoever’s behind the blue doors must survive.
      
      That would be like repeating the coin version of the experiment many times, using the exact same coin (in the exact same condition), flipping it in the exact same way, in the exact same environment. Even though you don’t know all these factors of the initial conditions, or have the computational power to draw conclusions from them, the coin still lands the same way each time.
      
      Since you are willing to suppose that these initial conditions are different in each trial, why not analogously suppose that in each trial of the digit of pi version of the experiment, you compute a different digit of pi? Or, more generally, that in each trial you compute a different logical fact that you were initially completely ignorant about?
      - cupholder 8 Apr 2010 5:41 UTC
        0 points
        Parent
        
        Since you are willing to suppose that these initial conditions are different in each trial, why not analogously suppose that in each trial of the digit of pi version of the experiment, that you compute a different digit of pi.
        
        Yes, I think that would work—if I remember right, zeroes and ones are equally likely in pi’s binary expansion, so it would successfully mimic flipping a coin with random initial conditions. (ETA: this is interesting. Apparently pi’s not yet been shown to have this property. Still, it’s plausible.)
        
        or, more generally, that in each trial you compute a different logical fact that you were initially completely ignorant about?
        
        This would also work, so long as your bag of facts is equally distributed between true facts and false facts.
    - Mallah 13 Apr 2010 4:04 UTC
      −1 points
      Parent
      
      I think talking about ‘observers’ might be muddling the issue here.
      
      That’s probably why you don’t understand the result; it is an anthropic selection effect. See my reply to Academian above.
      
      We could talk instead about creatures that don’t understand the experiment, and the result would be the same. Say we have two Petri dishes, one dish containing a single bacterium, and the other containing a trillion. We randomly select one of the bacteria (representing me in the original door experiment) to stain with a dye. We flip a coin: if it’s heads, we kill the lone bacterium, otherwise we put the trillion-bacteria dish into an autoclave and kill all of those bacteria. Given that the stained bacterium survives the process, it is far more likely that it was in the trillion-bacteria dish, so it is far more likely that the coin came up heads.
      
      That is not an analogous experiment. Typical survivors are not pre-selected individuals; they are post-selected, from the pool of survivors only. The analogous experiment would be to choose one of the surviving bacteria after the killing and then stain it. To stain it before the killing risks it not being a survivor, and that can’t happen in the case of anthropic selection among survivors.
      
      I don’t think of the pi digit process as equivalent.
      
      That’s because you erroneously believe that your frequency interpretation works. The math problem has only one answer, which makes it a perfect analogy for the 1-shot case.
      - cupholder 14 Apr 2010 5:56 UTC
        0 points
        Parent
        
        See my reply to Academian above.
        
        Okay.
        
        That is not an analogous experiment. Typical survivors are not pre-selected individuals; they are post-selected, from the pool of survivors only. The analogous experiment would be to choose one of the surviving bacteria after the killing and then stain it. To stain it before the killing risks it not being a survivor, and that can’t happen in the case of anthropic selection among survivors.
        
        I believe that situations A and B which you quote from Stuart_Armstrong’s post involve pre-selection, not post-selection, so maybe that is why we disagree. I believe that because the descriptions of the two situations refer to ‘you’ - that is, me—which makes me construct a mental model of me being put into one of the 100 rooms at random. In that model my pre-selected consciousness is at issue, not that of a post-selected survivor.
        
        That’s because you erroneously believe that your frequency interpretation works. The math problem has only one answer, which makes it a perfect analogy for the 1-shot case.
        
        By ‘math problem’ do you mean the question of whether pi’s millionth bit is 0? If so, I disagree. The 1-shot case (which I think you are using to refer to situation B in Stuart_Armstrong’s top-level post...?) describes a situation defined to have multiple possible outcomes, but there’s only one outcome to the question ‘what is pi’s millionth bit?’
        Mallah 14 Apr 2010 15:28 UTC
        1 point
        Parent
        
        A few minutes later, it is announced that whoever was to be killed has been killed. What are your odds of being blue-doored now?
        
        Presumably you heard the announcement.
        
        This is post-selection, because pre-selection would have been “Either you are dead, or you hear that whoever was to be killed has been killed. What are your odds of being blue-doored now?”
        
        The 1-shot case (which I think you are using to refer to situation B in Stuart_Armstrong’s top-level post...?) describes a situation defined to have multiple possible outcomes, but there’s only one outcome to the question ‘what is pi’s millionth bit?’
        
        There’s only one outcome in the 1-shot case.
        
        The fact that there are multiple “possible” outcomes is irrelevant—all that means is that, like in the math case, you don’t have knowledge of which outcome it is.
        cupholder 14 Apr 2010 20:28 UTC
        0 points
        Parent
        
        Presumably you heard the announcement.
        
        This is post-selection, because pre-selection would have been “Either you are dead, or you hear that whoever was to be killed has been killed. What are your odds of being blue-doored now?”
        
        The ‘selection’ I have in mind is the selection, at the beginning of the scenario, of the person designated by ‘you’ and ‘your’ in the scenario’s description. The announcement, as I understand it, doesn’t alter the selection in the sense that I think of it, nor does it generate a new selection: it just indicates that ‘you’ happened to survive.
        
        The fact that there are multiple “possible” outcomes is irrelevant—all that means is that, like in the math case, you don’t have knowledge of which outcome it is.
        
        I continue to have difficulty accepting that the millionth bit of pi is just as good a random bit source as a coin flip. I am picturing a mathematically inexperienced programmer writing a (pseudo)random bit-generating routine that calculated the millionth digit of pi and returned it. Could they justify their code by pointing out that they don’t know what the millionth digit of pi is, and so they can treat it as a random bit?
        thomblake 14 Apr 2010 20:51 UTC
        1 point
        Parent
        
        I continue to have difficulty accepting that the millionth bit of pi is just as good a random bit source as a coin flip. I am picturing a mathematically inexperienced programmer writing a (pseudo)random bit-generating routine that calculated the millionth digit of pi and returned it. Could they justify their code by pointing out that they don’t know what the millionth digit of pi is, and so they can treat it as a random bit?
        
        Not seriously: http://www.xkcd.com/221/
        
        Seriously: You have no reason to believe that the millionth bit of pi goes one way or the other, so you should assign equal probability to each.
        
        However, just like the xkcd example would work better if the computer actually rolled the die for you every time rather than just returning ‘4’, the ‘millionth bit of pi’ algorithm doesn’t work well because it only generates a random bit once (amongst other practical problems).
        
        In most pseudorandom generators, you can specify a ‘seed’ which will get you a fixed set of outputs; thus, you could every time restart the generator with the seed that will output ‘4’ and get ‘4’ out of it deterministically. This does not undermine its ability to be a random number generator. One common way to seed a random number generator is to simply feed it the current time, since that’s as good as random.
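
The seeding behaviour described above can be illustrated with Python's standard `random` module. This is a minimal sketch; the helper name `roll` and the seed value are arbitrary choices for illustration.

```python
import random
import time

def roll(seed):
    """Return one die roll from a generator started with the given seed."""
    return random.Random(seed).randint(1, 6)

# Restarting the generator with the same seed reproduces the same output.
first, second = roll(2010), roll(2010)

# Seeding from the clock, as the comment suggests, varies the starting state per run.
clock_roll = roll(time.time_ns())

print(first, second, clock_roll)
```

The first two values always agree; the third depends on when the script runs.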
        
        Looking back, I’m not certain if I’ve answered the question.
        cupholder 14 Apr 2010 22:53 UTC
        0 points
        Parent
        
        Looking back, I’m not certain if I’ve answered the question.
        
        I think so: I’m inferring from your comment that the principle of indifference is a rationale for treating a deterministic-but-unknown quantity as a random variable. Which I can’t argue with, but it still clashes with my intuition that any casino using the millionth bit of pi as its PRNG should expect to lose a lot of money.
        
        I agree with your point on arbitrary seeding, for whatever it’s worth. Selecting an arbitrary bit of pi at random to use as a random bit amounts to a coin flip.
        wedrifid 14 Apr 2010 21:27 UTC
        0 points
        Parent
        
        I am picturing a mathematically inexperienced programmer writing a (pseudo)random bit-generating routine that calculated the millionth digit of pi and returned it.
        
        I’d be extremely impressed if a mathematically inexperienced programmer could pull off a program that calculated the millionth digit of pi!
        
        Could they justify their code by pointing out that they don’t know what the millionth digit of pi is, and so they can treat it as a random bit?
        
        I say yes (assuming they only plan on treating it as a random bit once!)
        Mallah 15 Apr 2010 18:07 UTC
        0 points
        Parent
        
        The ‘selection’ I have in mind is the selection, at the beginning of the scenario, of the person designated by ‘you’ and ‘your’ in the scenario’s description.
        
        If ‘you’ were selected at the beginning, then you might not have survived.
        cupholder 15 Apr 2010 19:14 UTC
        0 points
        Parent
        
        If ‘you’ were selected at the beginning, then you might not have survived.
        
        Yeah, but the description of the situation asserts that ‘you’ happened to survive.
        Mallah 15 Apr 2010 20:38 UTC
        −1 points
        Parent
        Adding that condition is post-selection.
        
        Note that “If you (being asked before the killing) will survive, what color is your door likely to be?” is very different from “Given that you did already survive, …?”. A member of the population to which the first of these applies might not survive. This changes the result. It’s the difference between pre-selection and post-selection.
        cupholder 15 Apr 2010 23:14 UTC
        0 points
        Parent
        I’ll try to clarify what I’m thinking of as the relevant kind of selection in this exercise. It is true that the condition effectively picks out—that is, selects—the probability branches in which ‘you’ don’t die, but I don’t see that kind of selection as relevant here, because (by my calculations, if not your own) it has no impact on the probability of being behind a blue door.
        
        What sets your probability of being behind a blue door is the problem specifying that ‘you’ are the experimental subject concerned: that gives me the mental image of a film camera, representing my mind’s eye, following ‘you’ from start to finish - ‘you’ are the specific person who has been selected. I don’t visualize a camera following a survivor randomly selected post-killing. That is what leads me to think of the relevant selection as happening pre-killing (hence ‘pre-selection’).
        Mallah 16 Apr 2010 15:46 UTC
        0 points
        Parent
        If that were the case, the camera might show the person being killed; indeed, that is 50% likely.
        
        Pre-selection is not the same as our case of post-selection. My calculation shows the difference it makes.
        
        Now, if the fraction of observers of each type that are killed is the same, the difference between the two selections cancels out. That is what tends to happen in the many-shot case, and we can then replace probabilities with relative frequencies. One-shot probability is not relative frequency.
        cupholder 16 Apr 2010 21:09 UTC
        0 points
        Parent
        
        If that were the case, the camera might show the person being killed; indeed, that is 50% likely.
        
        Yep. But Stuart_Armstrong’s description is asking us to condition on the camera showing ‘you’ surviving.
        
        Pre-selection is not the same as our case of post-selection. My calculation shows the difference it makes.
        
        It looks to me like we agree that pre-selecting someone who happens to survive gives a different result (99%) to post-selecting someone from the pool of survivors (50%) - we just disagree on which case SA had in mind. Really, I guess it doesn’t matter much if we agree on what the probabilities are for the pre-selection v. the post-selection case.
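
The two figures can be reproduced by exact enumeration. This is a sketch, assuming the 100-door setup from the top-level post and reading "pre-selection" as fixing 'you' before the flip and then conditioning on survival, and "post-selection" as drawing uniformly from whoever is left after the killing; the names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)
doors = ["red"] + ["blue"] * 99

def survives(door, coin):
    # Heads kills the red-door occupant; tails kills the blue-door occupants.
    return door == "blue" if coin == "heads" else door == "red"

# Pre-selection: fix 'you' first, then keep only the branches where 'you' live.
pre = {"blue": Fraction(0), "red": Fraction(0)}
for door in doors:
    for coin in ("heads", "tails"):
        if survives(door, coin):
            pre[door] += Fraction(1, 100) * half
pre_blue = pre["blue"] / (pre["blue"] + pre["red"])

# Post-selection: flip first, then draw uniformly from the survivors.
post_blue = Fraction(0)
for coin in ("heads", "tails"):
    alive = [d for d in doors if survives(d, coin)]
    post_blue += half * Fraction(alive.count("blue"), len(alive))

print(pre_blue, post_blue)  # 99/100 1/2
```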
        
        Now, if the fraction of observers of each type that are killed is the same, the difference between the two selections cancels out. That is what tends to happen in the many-shot case, and we can then replace probabilities with relative frequencies.
        
        I am unsure how to interpret this...
        
        One-shot probability is not relative frequency.
        
        ...but I’m fairly sure I disagree with this. If we do Bernoulli trials with success probability p (like coin flips, which are equivalent to Bernoulli trials with p = 0.5), I believe the strong law of large numbers implies that the relative frequency converges almost surely to p as the number of Bernoulli trials becomes arbitrarily large. As p represents the ‘one-shot probability,’ this justifies interpreting the relative frequency in the infinite limit as the ‘one-shot probability.’
        Mallah 18 Apr 2010 16:35 UTC
        0 points
        Parent
        
        But Stuart_Armstrong’s description is asking us to condition on the camera showing ‘you’ surviving.
        
        That condition imposes post-selection.
        
        I guess it doesn’t matter much if we agree on what the probabilities are for the pre-selection v. the post-selection case.
        
        Wrong—it matters a lot because you are using the wrong probabilities for the survivor (in practice this affects things like belief in the Doomsday argument).
        
        I believe the strong law of large numbers implies that the relative frequency converges almost surely to p as the number of Bernoulli trials becomes arbitrarily large. As p represents the ‘one-shot probability,’ this justifies interpreting the relative frequency in the infinite limit as the ‘one-shot probability.’
        
        You have things backwards. The “relative frequency in the infinite limit” can be defined that way (sort of, as the infinite limit is not actually doable) and is then equal to the pre-defined probability p for each shot if they are independent trials. You can’t go the other way; we don’t have any infinite sequences to examine, so we can’t get p from them, we have to start out with it. It’s true that if we have a large but finite sequence, we can guess that p is “probably” close to our ratio of finite outcomes, but that’s just Bayesian updating given our prior distribution on likely values of p. Also, in the 1-shot case at hand, it is crucial that there is only the 1 shot.
        cupholder 20 Apr 2010 22:12 UTC
        0 points
        Parent
        
        That condition imposes post-selection.
        
        But not post-selection of the kind that influences the probability (at least, according to my own calculations).
        
        Wrong—it matters a lot because you are using the wrong probabilities for the survivor (in practice this affects things like belief in the Doomsday argument).
        
        Which of my estimates is incorrect: the 99% estimate for what I call ‘pre-selecting someone who happens to survive,’ the 50% estimate for what I call ‘post-selecting someone from the pool of survivors,’ or both?
        
        You can’t go the other way; we don’t have any infinite sequences to examine, so we can’t get p from them, we have to start out with it.
        
        Correct. p, strictly, isn’t defined by the relative frequency—the strong law of large numbers simply justifies interpreting it as a relative frequency. That’s a philosophical solution, though. It doesn’t help for practical cases like the one you mention next...
        
        It’s true that if we have a large but finite sequence, we can guess that p is “probably” close to our ratio of finite outcomes, but that’s just Bayesian updating given our prior distribution on likely values of p.
        
        ...for practical scenarios like this we can instead use the central limit theorem to say that p’s likely to be close to the relative frequency. I’d expect it to give the same results as Bayesian updating—it’s just that the rationale differs.
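
One common way to make the central-limit-theorem argument concrete is the normal-approximation (Wald) interval for a Bernoulli success probability. This is a sketch only: the choice of a 95% level (z = 1.96), the function name, and the simulated coin are assumptions added for illustration.

```python
import math
import random

def wald_interval(successes, trials, z=1.96):
    """Normal-approximation interval for p, centred on the relative frequency."""
    p_hat = successes / trials
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width

rng = random.Random(0)
n = 10_000
heads = sum(rng.random() < 0.5 for _ in range(n))  # simulated fair-coin flips
low, high = wald_interval(heads, n)
print(heads / n, (low, high))
```

The interval narrows in proportion to one over the square root of the number of trials.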
        
        Also, in the 1-shot case at hand, it is crucial that there is only the 1 shot.
        
        It certainly is in the sense that if ‘you’ die after 1 shot, ‘you’ might not live to take another!
  - wnoise 7 Apr 2010 21:36 UTC
    0 points
    Parent
    FWIW, it’s not that hard to calculate binary digits of pi:
    
    http://oldweb.cecm.sfu.ca/projects/pihex/index.html
    
    The Quadrillionth Bit of Pi is ‘0’! The Forty Trillionth Bit of Pi is ‘0’! The Five Trillionth Bit of Pi is ‘0’!
    
    I think I’ll go calculate the millionth, and get back to you.
    
    EDIT: also turns out to be 0.
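
For the curious, individual digits of pi can be extracted without computing the earlier ones. The sketch below uses the Bailey-Borwein-Plouffe formula in double precision. It is an illustration, not the code or method of the linked project, and the function names are made up here. Floating-point error limits it to modest positions, and counting bits from 1 after the binary point is an assumed convention.

```python
def _frac_series(j, n):
    """Fractional part of the sum over k of 16**(n - k) / (8*k + j)."""
    total = 0.0
    for k in range(n + 1):                 # terms with a non-negative exponent
        denom = 8 * k + j
        total = (total + pow(16, n - k, denom) / denom) % 1.0
    k = n + 1
    while True:                            # rapidly shrinking tail
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        total += term
        k += 1
    return total % 1.0

def pi_hex_digit(n):
    """Hex digit of pi at 0-indexed position n after the point."""
    x = (4 * _frac_series(1, n) - 2 * _frac_series(4, n)
         - _frac_series(5, n) - _frac_series(6, n)) % 1.0
    return int(16 * x)

def pi_bit(b):
    """Bit of pi at 1-indexed position b after the binary point."""
    digit = pi_hex_digit((b - 1) // 4)
    return (digit >> (3 - (b - 1) % 4)) & 1

print("".join(format(pi_hex_digit(i), "X") for i in range(8)))  # 243F6A88
```

Under this convention, `pi_bit(1_000_000)` would read one bit out of the hex digit at position 249999.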