# Forcing Anthropics: Boltzmann Brains

Followup to: Anthropic Reasoning in UDT by Wei Dai

Suppose that I flip a logical coin—e.g. look at some binary digit of pi unknown to either of us—and depending on the result, either create a billion of you in green rooms and one of you in a red room if the coin came up 1; or, if the coin came up 0, create one of you in a green room and a billion of you in red rooms. You go to sleep at the start of the experiment, and wake up in a red room.

Do you reason that the coin very probably came up 0? Thinking, perhaps: “If the coin came up 1, there’d be a billion of me in green rooms and only one of me in a red room, and in that case, it’d be very surprising that I found myself in a red room.”

What is your degree of subjective credence—your posterior probability—that the logical coin came up 1?

There are only two answers I can see that might in principle be coherent, and they are “50%” and “a billion to one against”.
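For concreteness, here is a minimal sketch of where the two candidate answers come from. It assumes, purely for illustration, that "updating" means treating yourself as a uniformly random sample of the copies that exist under each outcome; the function name and that sampling assumption are mine, not part of the problem statement.

```python
from fractions import Fraction

COPIES = 10**9  # size of the majority group under either outcome

def posterior_coin_is_one(prior_one=Fraction(1, 2)):
    """Posterior that the coin came up 1 after waking in a red room,
    ASSUMING the observer is a uniform random sample of all copies."""
    total = COPIES + 1
    p_red_given_one = Fraction(1, total)        # one red room if the coin is 1
    p_red_given_zero = Fraction(COPIES, total)  # a billion red rooms if the coin is 0
    numerator = prior_one * p_red_given_one
    evidence = numerator + (1 - prior_one) * p_red_given_zero
    return numerator / evidence

updating = posterior_coin_is_one()
not_updating = Fraction(1, 2)  # ignoring room colour leaves the prior untouched

print(updating)      # 1/1000000001
print(not_updating)  # 1/2
```

The first value corresponds to odds of a billion to one against the coin having come up 1; the second is what you get by declining to update on the colour of the room at all.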

Tomorrow I’ll talk about what sort of trouble you run into if you reply “a billion to one”.

But for today, suppose you reply “50%”. Thinking, perhaps: “I don’t understand this whole consciousness rigamarole, I wouldn’t try to program a computer to update on it, and I’m not going to update on it myself.”

In that case, why don’t you believe you’re a Boltzmann brain?

Back when the laws of thermodynamics were being worked out, there was first asked the question: “Why did the universe seem to start from a condition of low entropy?” Boltzmann suggested that the larger universe was in a state of high entropy, but that, given a long enough time, regions of low entropy would spontaneously occur—wait long enough, and the egg will unscramble itself—and that our own universe was such a region.

The problem with this explanation is now known as the “Boltzmann brain” problem; namely, while Hubble-region-sized low-entropy fluctuations will occasionally occur, it would be far more likely—though still not likely in any absolute sense—for a handful of particles to come together in a configuration performing a computation that lasted just long enough to think a single conscious thought (whatever that means) before dissolving back into chaos. A random reverse-entropy fluctuation is exponentially vastly more likely to take place in a small region than a large one.

So on Boltzmann’s attempt to explain the low-entropy initial condition of the universe as a random statistical fluctuation, it’s far more likely that we are a little blob of chaos temporarily hallucinating the rest of the universe, than that a multi-billion-light-year region spontaneously ordered itself. And most such little blobs of chaos will dissolve in the next moment.

“Well,” you say, “that may be an unpleasant prediction, but that’s no license to reject it.” But wait, it gets worse: The vast majority of Boltzmann brains have experiences much less ordered than what you’re seeing right now. Even if a blob of chaos coughs up a visual cortex (or equivalent), that visual cortex is unlikely to see a highly ordered visual field—the vast majority of possible visual fields more closely resemble “static on a television screen” than “words on a computer screen”. So on the Boltzmann hypothesis, highly ordered experiences like the ones we are having now, constitute an exponentially infinitesimal fraction of all experiences.

In contrast, suppose one more simple law of physics not presently understood, which forces the initial condition of the universe to be low-entropy. Then the exponentially vast majority of brains occur as the result of ordered processes in ordered regions, and it’s not at all surprising that we find ourselves having ordered experiences.

But wait! This is just the same sort of logic (is it?) that one would use to say, “Well, if the logical coin came up heads, then it’s very surprising to find myself in a red room, since the vast majority of people-like-me are in green rooms; but if the logical coin came up tails, then most of me are in red rooms, and it’s not surprising that I’m in a red room.”

If you reject that reasoning, saying, “There’s only one me, and that person seeing a red room does exist, even if the logical coin came up heads” then you should have no trouble saying, “There’s only one me, having a highly ordered experience, and that person exists even if all experiences are generated at random by a Boltzmann-brain process or something similar to it.” And furthermore, the Boltzmann-brain process is a much simpler process—it could occur with only the barest sort of causal structure, no need to postulate the full complexity of our own hallucinated universe. So if you’re not updating on the apparent conditional rarity of having a highly ordered experience of gravity, then you should just believe the very simple hypothesis of a high-volume random experience generator, which would necessarily create your current experiences—albeit with extreme relative infrequency, but you don’t care about that.

Now, doesn’t the Boltzmann-brain hypothesis also predict that reality will dissolve into chaos in the next moment? Well, it predicts that the vast majority of blobs who experience this moment, cease to exist after; and that among the few who don’t dissolve, the vast majority of those experience chaotic successors. But there would be an infinitesimal fraction of a fraction of successors, who experience ordered successor-states as well. And you’re not alarmed by the rarity of those successors, just as you’re not alarmed by the rarity of waking up in a red room if the logical coin came up 1, right?

So even though your friend is standing right next to you, saying, “I predict the sky will not turn into green pumpkins and explode—oh, look, I was successful again!”, you are not disturbed by their unbroken string of successes. You just keep on saying, “Well, it was necessarily true that someone would have an ordered successor experience, on the Boltzmann-brain hypothesis, and that just happens to be us, but in the next instant I will sprout wings and fly away.”

Now this is not quite a logical contradiction. But the total rejection of all science, induction, and inference in favor of an unrelinquishable faith that the next moment will dissolve into pure chaos, is sufficiently unpalatable that even I decline to bite that bullet.

And so I still can’t seem to dispense with anthropic reasoning—I can’t seem to dispense with trying to think about how many of me or how much of me there are, which in turn requires that I think about what sort of process constitutes a me. Even though I confess myself to be sorely confused, about what could possibly make a certain computation “real” or “not real”, or how some universes and experiences could be quantitatively realer than others (possess more reality-fluid, as ’twere), and I still don’t know what exactly makes a causal process count as something I might have been for purposes of being surprised to find myself as me, or for that matter, what exactly is a causal process.

Indeed this is all greatly and terribly confusing unto me, and I would be less confused if I could go through life while only answering questions like “Given the Peano axioms, what is SS0 + SS0?”

But then I have no defense against the one who says to me, “Why don’t you think you’re a Boltzmann brain? Why don’t you think you’re the result of an all-possible-experiences generator? Why don’t you think that gravity is a matter of branching worlds in which all objects accelerate in all directions and in some worlds all the observed objects happen to be accelerating downward? It explains all your observations, in the sense of logically necessitating them.”

I want to reply, “But then most people don’t have experiences this ordered, so finding myself with an ordered experience is, on your hypothesis, very surprising. Even if there are some versions of me that exist in regions or universes where they arose by chaotic chance, I anticipate, for purposes of predicting my future experiences, that most of my existence is encoded in regions and universes where I am the product of ordered processes.”

And I currently know of no way to reply thusly, that does not make use of poorly defined concepts like “number of real processes” or “amount of real processes”; and “people”, and “me”, and “anticipate” and “future experience”.

Of course confusion exists in the mind, not in reality, and it would not be the least bit surprising if a resolution of this problem were to dispense with such notions as “real” and “people” and “my future”. But I do not presently have that resolution.

(Tomorrow I will argue that anthropic updates must be illegal and that the correct answer to the original problem must be “50%”.)

• Necromancy, but: easy. Boltzmann brains obey little or no causality, and thus cannot possibly benefit from rationality. As such, rationality is wasted on them. Optimize for the signal, not for the noise.

• What is your degree of subjective credence—your posterior probability—that the logical coin came up 1?

. . .

(Tomorrow I will argue that anthropic updates must be illegal and that the correct answer to the original problem must be “50%”.)

If the question was, “What odds should you bet at?”, it could be answered using your values. Suppose each copy of you has \$1000, and copies of you in a red room are offered a bet that costs \$1000 and pays \$1001 if the Nth bit of pi is 0. Which do you prefer:

• To refuse the bet?

• With 50% subjective logical probability, the Nth bit of pi will be 0 and you will have \$1,000 per copy.

• With 50% subjective logical probability, the Nth bit of pi will be 1 and you will have \$1,000 per copy.

• To take the bet?

• With 50% subjective logical probability, the Nth bit of pi will be 0 and you will have \$1,000.999999999 per copy.

• With 50% subjective logical probability, the Nth bit of pi will be 1 and you will have \$999.999999 per copy.
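The per-copy figures above can be checked with a short sketch. It assumes, as the setup states, that only red-room copies are offered the bet and that every other copy simply keeps its \$1000; the function and constant names are illustrative.

```python
from fractions import Fraction

COPIES = 10**9
STAKE = 1000   # each copy starts with $1000, and the bet costs $1000
PAYOUT = 1001  # paid to a red-room copy if the Nth bit of pi is 0

def average_wealth(bit, take_bet):
    """Mean dollars per copy across all 1,000,000,001 copies."""
    red = COPIES if bit == 0 else 1  # number of red-room copies
    green = COPIES + 1 - red         # green-room copies are never offered the bet
    if take_bet:
        red_wealth = PAYOUT if bit == 0 else 0  # stake spent, payout only on 0
    else:
        red_wealth = STAKE
    return Fraction(red * red_wealth + green * STAKE, red + green)

for bit in (0, 1):
    for take_bet in (False, True):
        print(bit, take_bet, float(average_wealth(bit, take_bet)))
```

Taking the bet gains almost a full dollar per copy if the bit is 0 and loses about a millionth of a dollar per copy if the bit is 1, which is why the answer to the betting question turns on how you value the copies rather than on any posterior.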

But the question is “What is your posterior probability”? This is not a decision problem, so I don’t know that it has an answer.

I think it may be natural to ask instead: “Given that your learned cognitive system of rational prediction is competing for influence over anticipations used in making decisions, in a brain which awards influence over anticipation to different cognitive systems depending on the success of their past reported predictions, which probability should your rational prediction system report to the brain’s anticipation-influence-awarding mechanisms?”

Suppose you know the following:

• Your brain will use a simple Bayesian mechanism which will treat cognitive systems as hypotheses and award influence using Bayesian updating.

• In the future, the competitor cognitive systems to your rational prediction system will make predictions which will cause you to take harmful actions. The less influential the competitor systems will be, the less harmful the actions will be.

• The competitor cognitive systems will predict 1:1 probabilities of the experiences of being informed that the Nth bit of pi is 0 or 1.

This question could be answered using your values. Which would you prefer:

• In both green rooms and red rooms, to rationally predict 1:1 probabilities of the experiences of being informed that the Nth bit of pi is 0 or 1?

• With 50% subjective logical probability, the Nth bit of pi will be 0. There will be 1,000,000,001 copies of you whose learned cognitive systems for rational prediction took a likelihood hit of 1/2. The competitor cognitive systems will also have taken a likelihood hit of 1/2. The relative influences of the cognitive systems will not change.

• With 50% subjective logical probability, the Nth bit of pi will be 1. There will be 1,000,000,001 copies of you whose learned cognitive systems for rational prediction took a likelihood hit of 1/2. The competitor cognitive systems will also have taken a likelihood hit of 1/2. The relative influences of the cognitive systems will not change.

• In red rooms, to rationally predict a 1,000,000,000:1 probability of the experience of being informed that the Nth bit of pi is 0, and in green rooms, to rationally predict a 1,000,000,000:1 probability of the experience of being informed that the Nth bit of pi is 1?

• With 50% subjective logical probability, the Nth bit of pi will be 0. There will be 1,000,000,000 copies of you who woke up in red rooms, whose learned cognitive systems for rational prediction took a tiny 1,000,000,000/1,000,000,001 likelihood hit. The competitor cognitive systems will have taken a likelihood hit of 1/2. In those 1,000,000,000 copies, the relative influences of the cognitive systems will be adjusted by the ratio 2,000,000,000:1,000,000,001. There will also be one copy of you who woke up in a green room, whose learned cognitive systems for rational prediction took a likelihood hit of 1/1,000,000,001. In that copy, the relative influences of the cognitive systems will be adjusted by the ratio 2:1,000,000,001.

• With 50% subjective logical probability, the Nth bit of pi will be 1. There will be one copy of you who woke up in a red room, whose learned cognitive systems for rational prediction took a likelihood hit of 1/1,000,000,001. The competitor cognitive systems will have taken a likelihood hit of 1/2. In that copy, the relative influences of the cognitive systems will be adjusted by the ratio 2:1,000,000,001. There will also be 1,000,000,000 copies of you who woke up in a green room, whose learned cognitive systems for rational prediction took a tiny 1,000,000,000/1,000,000,001 likelihood hit. In those 1,000,000,000 copies, the relative influences of the cognitive systems will be adjusted by the ratio 2,000,000,000:1,000,000,001.
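As a rough check on the adjustment ratios quoted in these cases, the sketch below treats each "likelihood hit" as the probability a system assigned to the outcome that actually occurred, and the adjustment as the ratio of the rational system's hit to the competitor's. That reading, and the helper name, are my assumptions about the intended mechanism.

```python
from fractions import Fraction

COPIES = 10**9
TOTAL = COPIES + 1

def influence_adjustment(p_rational, p_competitor=Fraction(1, 2)):
    """Likelihood ratio (rational : competitor) for the outcome that occurred.
    The competitor is assumed to always predict 1:1, i.e. probability 1/2."""
    return p_rational / p_competitor

first_rule = influence_adjustment(Fraction(1, 2))       # both systems predict 1:1
majority = influence_adjustment(Fraction(COPIES, TOTAL))  # copy in the common colour
minority = influence_adjustment(Fraction(1, TOTAL))       # the lone copy

print(first_rule)  # 1
print(majority)    # 2000000000/1000000001
print(minority)    # 2/1000000001
```

Under the first rule nothing moves; under the second, a billion copies see the rational system roughly double its relative influence while one copy sees it cut by a factor of about half a billion.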

The answer depends on the starting relative influences and on the details of the function from amounts of non-rational anticipation to amounts of harm. But for perspective, the ratio 2:1,000,000,001 can be reversed with about 28.9 copies of the ratio 2,000,000,000:1,000,000,001, since log(1,000,000,001/2) / log(2,000,000,000/1,000,000,001) ≈ 28.9.

If your copies are being merged, the optimal “rational” prediction would depend on the details of the merging algorithm. If the merging algorithm took the arithmetic mean of the updated influences, the optimal prediction would still depend on the starting relative influences and the harm from non-rational anticipations. But if the merging algorithm multiplicatively combined the likelihood ratios from every copy’s predictions, then the second prediction rule would be optimal.

To make decisions about how to value possibly logically impossible worlds, it may help to imagine that the decision problem will be iterated with the (N+1)th bit of pi, the (N+2)th bit, …

(If the rational prediction system already has complete control of your brain’s anticipations, then there may be no reason to predict anything that does not affect a decision.)

• I agree with Steve; we have to take a step back and ask not for probabilities but for decision algorithms that aim to achieve certain goals, then it all makes sense; it has to—based upon materialism, whatever definition of “you” you try to settle upon, “you” is some set of physical objects that behave according to a certain decision algorithm, and given the decision algorithm, “you” will have a well-defined expected future reward.

• 9 Sep 2009 6:20 UTC

Let me suggest that for anthropic reasoning, you are not directly calculating expected utility but actually trying to determine priors instead. And this traces back to Occam’s razor and hence complexity measures (complexity prior). Further, it is not probabilities that you are trying to directly manipulate, but degrees of similarity (i.e., which reference class does a given observer fall into? What is the degree of similarity between given algorithms?). So rather than utility and probability, you are actually trying to manipulate something more basic, i.e., complexity and similarity measures.

Suggested analogy:

Complexity (is like) Utility

Similarity (is like) Probability

Let me suggest that rather than trying to ‘maximize utility’ directly, you should first attempt to ‘minimize complexity’ using a new, generalized form of rationality based on the above analogy (the putative method would be an entirely new type of rationality which subsumes ordinary Bayesian reasoning as a special case). The ‘expected complexity’ (analogous to ‘expected utility’) would be based on a ‘complexity function’ (analogous to ‘utility function’) that combines similarity measures (similarities between algorithms) with the complexities of given outcomes. The utilities and probabilities would be derived from these calculations (ordinary Bayesian rationality would be derivative rather than fundamental).

• M J Geddes (Black Swan Siren!)

• The skeleton of the argument is:

1. Present a particular thought experiment, intended to provoke anthropic reasoning. There are two moderately plausible answers, “50%” and “a billion to one against”.

2. Assume for the sake of argument, the answer to the thought experiment is 50%. Note that the “50%” answer corresponds to ignoring the color of the room—“not updating on it” in the Bayesian jargon.

3. The thought experiment is analogous to the Boltzmann-brain hypothesis. In particular, the color of the room corresponds to the ordered-ness of our experiences.

4. With the exception of the ordered-ness of our experiences, a stochastic-all-experience-generator would be consistent with all observations.

5. Occam’s Razor: Use the simplest possible hypothesis consistent with observations.

6. A stochastic-all-experience-generator would be a simple hypothesis.

7. From 3, 4, 5, and 6, predict that the universe is a stochastic all-experience generator.

8. From 7, some very unpleasant consequences.

9. From 8, reject the assumption.

I think the argument can be improved.

According to the minimum description length notion of science, we have a model and a sequence of observations. A “better” model is one that is short and compresses the observations well. The stochastic-all-experience-generator is a short model, but it doesn’t compress our observations. I think this is basically saying that according to the MDL version of Occam’s Razor, 6 is false.

The article claims that the stochastic-all-experience-generator is a “simple” model of the world and would defeat more common-sense models of the world in an Occam’s Razor-off in the absence of some sort of anthropic defense. That claim (6) might be true, but it needs more support.

• Isn’t the argument in 1 false? If one applies Bayes’ theorem, with an initial probability of 50% and a new likelihood ratio of a billion to one, don’t you get 500,000,000-to-one chances?

• I think you may be sincerely confused. Would you please reword your question?

If your question is whether someone (either me or the OP) has committed a multiplication error: yes, it’s entirely possible, but multiplication is not the point. The point is anthropic reasoning and whether “I am a Boltzmann brain” is a simple hypothesis.

• I agree very much.

It reminds me of one remark of Eliezer in his diavlog with Scott about the multiple world interpretation of QM. There he also said something to the effect that Occam’s razor is only about the theory, but not about the “amount of stuff”.

I think that was the same fallacy. When using MDL, you have to give a short description for your actual observation history, or at least give an upper bound for the compressed length. In multiple world theories these bounds can become very nontrivial, and the observations can easily dominate the description length, therefore Occam’s razor cannot be applied without thorough quantitative analysis.

Of course, in that special context it was true that a random state-reduction is not better than a multiple world hypothesis, in fact: slightly worse. However, one should add, a deterministic (low complexity) state reduction would be far superior.

Regardless: such lighthearted remarks about the “amount of stuff” in Occam’s razor are misleading at least.

• “That claim (6) might be true, but it needs more support.” Agreed.

• It seems to me that “I’m a Boltzmann brain” is exactly the same sort of useless hypothesis as “Everything I think I experience is a hallucination manufactured by an omnipotent evil genie”. They’re both non-falsifiable by definition, unsupported by any evidence, and have no effect on one’s decisions in any event. So I say: show me some evidence, and I’ll worry about it. Otherwise it isn’t even worth thinking about.

• I would have answered 1B:1 (looking forward to the second post to be proved wrong), however I think a rational agent should never believe in the Boltzmann brain scenario regardless.

Not because it is not a reasonable hypothesis, but since it negates the agent’s capabilities of estimating prior probabilities (it cannot trust even a predetermined portion of its memories) plus it also makes optimizing outcomes a futile undertaking.

Therefore, I’d generally say that an agent has to assume an objective, causal reality as a precondition of using decision theory at all.

• But for today, suppose you reply “50%”. Thinking, perhaps: “I don’t understand this whole consciousness rigamarole, I wouldn’t try to program a computer to update on it, and I’m not going to update on it myself.”

In that case, why don’t you believe you’re a Boltzmann brain?

This sounds backwards (sideways?); the reason to (strongly) believe one is a Boltzmann brain is that there are very many of them in some weighting compared to the “normal” you, which corresponds to accepting probability of 1 to the billion in this thought experiment. If you don’t update, then the other billion people are (epistemically) irrelevant, and in exactly the same way so are Boltzmann brains. It doesn’t at all matter how many visual cortexes spontaneously form in the Chaos.

In other words, there are two parts to not updating: you can’t place a greater weight on particular states of the world, arguing that this particular kind of situations is privileged, but at the same time you can’t be disturbed by an argument that there is huge weight on that other class of crazy situations which leave your privileged situation far behind. You can’t refute the assertion that you are a Boltzmann brain, but you are undisturbed by the assertion that there are Boltzmann brains.

Of course, in all cases some situations may be preferentially privileged. You don’t care about what happens to a Boltzmann brain, or more likely just can’t do much for it anyway. In the rooms with a billion copies, you may care about whether only one person makes a mistake, or a whole billion of them (total utilitarianism). But that’s utility of the situation, not probability, and the construction of the thought experiment clearly doesn’t try to make utility symmetrical, hence the skewed intuition.

The confusion between probability and utility seems to explain the intuition: weighting is there, just not in the probability, and in fact it can’t be represented as probability (in which case the weighting is not so much in utility, since there is no anthropic utility just as there is no anthropic probability, but in how the global preference responds to actions performed in particular situations).

• The problem is that if you don’t update on the proportions of sentients who have your particular experience, then there are much simpler hypotheses than our current physical model which would generate and “explain” your experiences, namely, “Every experience happens within the dust.”

To put it another way, the dust hypothesis is extremely simple and explains why this experience exists. It just doesn’t explain why an ordered experience instead of a disordered one, when ordered experiences are such a tiny fraction of all experiences. If you think the latter is a non-consideration then you should just go with the simplest explanation.

• Traditional explanations are for updating; this is probably a relevant tension. If you don’t update, you can’t explain in the sense of updating. The notion of explanation itself has to be revised in this light.

• Are the Boltzmann brain hypothesis and the dust hypothesis really simpler than the standard model of the universe, in the sense of Occam’s razor? It seems to me that they aren’t.

I’m thinking specifically about Solomonoff induction here. A Boltzmann brain hypothesis would be a program that correctly predicts all my experiences up to now, and then starts predicting unrelated experiences. Such a program of minimal length would essentially emulate the standard model until output N, and then start doing something else. So it would be longer than the standard model by however many bits it takes to encode the number N.

• [Rosencrantz has been flipping coins, and all of them are coming down heads]

Guildenstern: Consider: One, probability is a factor which operates within natural forces. Two, probability is not operating as a factor. Three, we are now held within un-, sub- or super-natural forces. Discuss.

Rosencrantz: What?

Rosencrantz & Guildenstern Are Dead, Tom Stoppard

• The Boltzmann brain argument was the reason why I had not adopted something along the lines of UDT, despite having considered it and discussed it a bit with others, before the recent LW discussion. Instead, I had tagged it as ‘needs more analysis later.’ After the fact, that looks like flinching to me.

• 9 Nov 2009 22:52 UTC

Here, let me re-respond to this post.

So if you’re not updating on the apparent conditional rarity of having a highly ordered experience of gravity, then you should just believe the very simple hypothesis of a high-volume random experience generator, which would necessarily create your current experiences—albeit with extreme relative infrequency, but you don’t care about that.

“A high-volume random experience generator” is not a hypothesis. It’s a thing. “The universe is a high-volume random experience generator” is better, but still not okay for Bayesian updating, because we don’t observe “the universe”. “My observations are output by a high-volume random experience generator” is better still, but it doesn’t specify which output our observations are. “My observations are the output at [...] by a high-volume random experience generator” is a specific, updatable hypothesis—and its entropy is so high that it’s not worth considering.

Did I just use anthropic reasoning?

Let’s apply this to the hotel problem. There are two specific hypotheses: “My observations are what they were before except I’m now in green room #314159265” (or whatever green room) and “. . . except I’m now in the red room”. It appears that the thing determining probability is not multiplicity but complexity of the “address”; counterintuitively, this makes the type of room only one of you is in more likely than the type of room a billion of you are in.

Yes, I’m taking into account that “I’m in a green room” is the disjunction of one billion hypotheses and therefore has one billion times the probability of any of them. In order for one’s priors to be well-defined, then for infinitely many N, all hypotheses of length N+1 together must be less likely than all hypotheses of length N together.

Edit: changed “more likely” to “less likely” (oops) and “large N” to “infinitely many N”, as per pengvado. Thanks!

This post in seventeen words: it’s the high multiplicity of brains in the Boltzmann brain hypothesis, not their low frequency, that matters.

Let the poking of holes into this post begin!

• “My observations are the output at [...] by a high-volume random experience generator”

“My observations are [...], which were output by a high-volume random experience generator”. Since the task is to explain my observations, not to predict where I am. This way also makes it more clear that that suffix is strictly superfluous from a Kolmogorov perspective.

In order for one’s priors to be well-defined, then for large N, all hypotheses of length N+1 together must be more likely than all hypotheses of length N together.

You mean less likely. i.e. there is no nonnegative monotonic-increasing infinite series whose sum is finite. Also, it need not happen for all large N, just some of them. So I would clarify it as: ∀L ∃N>L ∀M>N (((sum of probabilities of hypotheses of length M) < (sum of probabilities of hypotheses of length N)) or (both are zero)).

But you shouldn’t take that into account for your example. The theorem applies to infinite sequences of hypotheses, but not to any one finite hypothesis such as the disjunction of a billion green rooms. To get conclusions about a particular hypothesis, you need more than “any prior is Occam’s razor with respect to a sufficiently perverse complexity metric”.

• “My observations are [...], which were output by a high-volume random experience generator”. Since the task is to explain my observations, not to predict where I am. This way also makes it more clear that that suffix is strictly superfluous from a Kolmogorov perspective.

You are correct, though I believe your statement is equivalent to mine.

You mean less likely. i.e. there is no nonnegative monotonic-increasing infinite series whose sum is finite. Also, it need not happen for all large N, just some of them. So I would clarify it as: ∀L ∃N>L ∀M>N (((sum of probabilities of hypotheses of length M) < (sum of probabilities of hypotheses of length N)) or (both are zero)).

Right again; I’ll fix my post.

• I think we need to reduce “surprise” and “explanation” first. I suggest they have to do with bounded rationality and logical uncertainty. These concepts don’t seem to exist in decision theories with logical omniscience.

Surprise seems to be the output of some heuristic that tells you when you may have made a cognitive error or taken a computational shortcut that turns out to be wrong (i.e., you find yourself in a situation that you had previously computed to have low probability) and should go back and recheck your logic. After you’ve found such an error and have fixed it, perhaps you call the fix an explanation (i.e., it “explains” why the low computed probability was an error).

In UDT, there ought to be equivalents of surprise and explanation, although I’m too tired to think of them right now. I’ll try again later.

• 8 Sep 2009 15:47 UTC

Suppose Omega plays the following game (the “Probability Game”) with me: You will tell me a number X representing the probability of A. If A turns out to be true, I will increase your utility by ln(X); otherwise, I will increase your utility by ln(1-X). It’s well known that the way to maximize one’s expected utility in this game is to report one’s actual probability of A as X.
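A small numerical illustration of that well-known property, offered as a sketch rather than a proof: for a fixed true probability of A, a grid search over possible reports finds the expected log score peaking at the true probability. The function names and the grid resolution are arbitrary choices of mine.

```python
import math

def expected_log_score(report, true_prob):
    """Expected utility in the Probability Game for reporting `report`
    when A actually has probability `true_prob`."""
    return true_prob * math.log(report) + (1 - true_prob) * math.log(1 - report)

def best_report(true_prob, steps=999):
    """Grid search over reports 0.001, 0.002, ..., 0.999."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda x: expected_log_score(x, true_prob))

print(best_report(0.7))
print(best_report(0.25))
```

Reporting anything other than your real probability lowers the expected score, which is what licenses reading the answer to the game as a probability estimate in the first place.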

Presumably, decision mechanisms should be consistent under reflection. Even if not, if I somehow know that Omega’s going to split me into 1,000,000,001 copies and do this, I want to modify my decision mechanism to do what I think is best.

Suppose I care about the entire group of 1,000,000,000 me’s who go into one color of room precisely as much as I care about the single me who goes into the other color. (Perhaps I’m extending the idea that two copies of one person should not be more deserving than a single copy of the person.) In order to maximize the average utility here, I should have everyone declare a 50% probability of the best answer, resulting in an average utility of about −0.69. If I had everyone declare a 1,000,000,000-in-1,000,000,001 probability, the average utility would be about −10.

Suppose, on the other hand, that I care about each individual person equally. If I had everyone declare a 50% probability, the average utility would still be −0.69, but if I had everyone declare a 1,000,000,000-in-1,000,000,001 probability, the average utility would go all the way up to −0.000000022.
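The averages quoted for the two ways of caring can be reproduced with the sketch below. It assumes every copy reports the same probability for "my room colour is the majority colour", so that the billion copies in the majority score ln(1 - q) and the lone copy scores ln(q), where q is the probability assigned to being the lone copy; the names are illustrative.

```python
import math

COPIES = 10**9
TOTAL = COPIES + 1

def log_scores(q_minority):
    """Return (score of each majority copy, score of the lone minority copy)
    when every copy assigns probability q_minority to being the lone copy."""
    return math.log1p(-q_minority), math.log(q_minority)

def group_weighted(q_minority):
    """The whole majority group counts as much as the single minority copy."""
    major, minor = log_scores(q_minority)
    return 0.5 * major + 0.5 * minor

def individual_weighted(q_minority):
    """Every one of the 1,000,000,001 copies counts equally."""
    major, minor = log_scores(q_minority)
    return (COPIES * major + minor) / TOTAL

for label, q in (("50%", 0.5), ("confident", 1 / TOTAL)):
    print(label, group_weighted(q), individual_weighted(q))
```

With the 50% report both weightings give about -0.69; with the confident report the group weighting drops to roughly -10.4 while the per-individual weighting rises to roughly -0.000000022, matching the figures in the text.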

One’s answer to the Probability Game is one’s probability estimate. The consistent-under-reflection answer to the Probability Game depends on one’s values. Therefore, one’s probability estimate depends on one’s values. It’s counterintuitive, but I don’t think I can argue against it.

Now, here’s, perhaps, a refutation. Suppose I know that some time in the future, I’m going to be turned into my evil twin, Dr. Dingo, and Omega is going to play the Probability Game with me on the statement “The sky is blue”. I hate my evil twin so much that I consider my utility to have his utility subtracted from it. Therefore, I modify myself to say that the probability that the sky is blue is 0, thereby resulting in a utility for him of negative infinity, and a utility for me of infinity. Through the same mechanism (using an interpretation function to determine my utility given the utilities of future copies of me), I apparently make the probability that the sky is blue be 0. This doesn’t seem right.

Perhaps we could require that interpretation functions be monotonically related to the utilities they’re interpreting, so that an increase in a future me’s utility can’t decrease my current me’s utility. I don’t know if that would work.

• (Missing word alert in paragraph 11: “Even [if] a blob of chaos coughs up a visual cortex (or equivalent)...”.)

• thx fixed

• In the criticism of Boltzmann, entropy sounds like a radio dial that someone is tweaking rather than a property of some space. I may be misunderstanding something.

Basically, if some tiny part of some enormous universe happened to condense into a very low-entropy state, that does not mean that it could spontaneously jump to a high-entropy state. It would, with extremely high probability, slowly return to a high-entropy state. It thus seems like we could see what we actually see and not be at risk of spontaneously turning into static. Our current observable universe has a certain amount of entropy and had a certain amount before the current time. If we were in some different bubble, the universe would presumably look quite different, and probably only certain bubbles could generate conscious observers, and those bubbles would not be at risk of spontaneously maximizing entropy.

The argument as applied to consciousness makes perfect sense, but at the very least I seem to be missing something about the universe argument.

• It thus seems like we could see what we actually see and not be at risk of spontaneously turning into static. Our current observable universe has a certain amount of entropy and had a certain amount before the current time.

If the low-entropy area of the universe was originally a spontaneous fluctuation in a bigger max-entropy universe, then that is vastly improbable.

Such a fluctuation is exponentially more likely for (linearly) smaller volumes of the universe. So the parsimonious explanation for what we see, on this theory, is that the part of the universe that has low entropy is the smallest which is still enough to generate our actual experience.
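One hedged way to write down the “exponentially more likely” claim: assume the standard estimate that a fluctuation's probability falls off as the exponential of its entropy deficit, and assume, purely for illustration, that the deficit grows linearly with volume at some rate $s$ per unit volume (the symbol $s$ is my own). Then:

```latex
P(V) \propto \exp\!\left(-\frac{\Delta S(V)}{k_B}\right),
\qquad \Delta S(V) \approx s\,V
\quad\Longrightarrow\quad
\frac{P(V_1)}{P(V_2)} \approx \exp\!\left(-\frac{s\,(V_1 - V_2)}{k_B}\right).
```

So under these assumptions a linear reduction in volume buys an exponential gain in probability, which is the comparison the rest of this comment leans on.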

How small is “smallest”? Well, to begin with, it’s not large enough to include stars outside the Solar System; it’s vastly more likely that the light en route from those stars to Earth was spontaneously created, than that the stars themselves and all the empty space between (very low entropy!) were created millions of years earlier. So the parsimonious explanation is that any moment now, that light created en route is going to run out and we’ll start seeing static (or at least darkness) in the night sky.

Similarly: we have a long historical record in geology, archaeology, even written history. Did it all really happen? The parsimonious explanation says that it’s vastly more likely that an Earth with fossils was spontaneously created, than that an Earth with dinosaurs was created, who then became fossils. This is because the past light cone of, say, a billion-year-old Earth is much bigger than the past light cone of a 6000-year-old Earth, and so requires the spontaneous creation of a vastly bigger section of universe.

Finally, it’s vastly more likely that you were spontaneously created a second ago complete with all your memories, than that you really lived through what you remember. And it’s vastly more likely that the whole spontaneous creation was only a few light-seconds across, and not as big as it seems. In which case it’ll stop existing any moment now.

That’s the experience of a Boltzmann Brain.

• I agree. The idea that low-entropy pockets that form are totally immune to a simplicity prior seems unjustified to me. The universe may be in a high-entropy state, but it’s still got physical laws to follow! It’s not just doing things totally at random; that’s merely a convenient approximation. Maybe I am ignorant here, but it seems like the probability of a particular low-entropy bubble will be based on more than just its size.

• (Tomorrow I will argue that anthropic updates must be illegal and that the correct answer to the original problem must be “50%”.)

Is your intent here to argue both sides of the issue to help, well, lay out the issues, or is it your actual current position that anthropic updates really really are verboten and that 50% is the really really correct answer?

• It’s my intent here to lay out my own confusion.

• It’s not entirely clear what it means to create a number of “me”: my consciousness is only one, cannot be more than one, and I can only feel sensations from a single body. If the idea is just to generate a certain number of physical copies of my body and embed my present consciousness into one of them at random, then the problem is at least clear and well defined from a mathematical point of view: it is a simple conditional-probability problem. You are asking for the probability that an event happened in the past, given that one of its a priori possible consequences was observed. It can easily be solved by Bayes’ formula, and the probability is about one in a billion.
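That Bayes calculation can be sketched as follows. It assumes a 50/50 prior on the coin and that “I” am embedded into one of the 1,000,000,001 bodies uniformly at random; the function name is mine, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

N = 10**9  # size of the large group of copies


def posterior_coin_is_1_given_red(n: int = N) -> Fraction:
    """P(coin = 1 | I wake in a red room), with a uniform-random embedding into a copy."""
    prior_1 = Fraction(1, 2)
    prior_0 = Fraction(1, 2)
    # Coin = 1: n green rooms and 1 red room. Coin = 0: 1 green room and n red rooms.
    red_given_1 = Fraction(1, n + 1)
    red_given_0 = Fraction(n, n + 1)
    evidence = prior_1 * red_given_1 + prior_0 * red_given_0
    return prior_1 * red_given_1 / evidence


if __name__ == "__main__":
    p = posterior_coin_is_1_given_red()
    print(p, float(p))
```

The result is exactly 1/(N+1), i.e. roughly one in a billion. Everything hinges on the random-embedding assumption, which is precisely what the 50% answer declines to grant.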

• In that case, why don’t you believe you’re a Boltzmann brain?

I think a portion of the confusion comes from implicit assumptions about what constitutes “you”, and an implicit semantics for how to manipulate the concept. Suppose that there are N (N large) instances of “you” processes that run on Boltzmann Brains, and M (M << N) that run in sensible copies of the world around me. Which one of them is “you”? If “you” is a particular one of the N that run on Boltzmann Brains, then which one is “you, 10 seconds from now”?

It seems like it ought to be possible to experience a short stream of random sensations; thus in a “Boltzmann Brains Dominate” multiverse, I ought to expect, a priori, that my experiences will be randomness, if I consider myself to be randomly sampled according to the uniform distribution on candidate me’s.

Updating on the fact that my experiences this instant are not random noise, if the “Boltzmann Brains Dominate” multiverse is the only hypothesis, I ought to still believe that I am a Boltzmann Brain with very high probability.

But the only copies of “me” that will have a “future” that interacts meaningfully with the decisions I make now are those copies of me that live in the sensible universe, or at least a vaguely sensible universe, where “vaguely sensible” means “acts according to the usual rules of causality for at least long enough for me to get experience back that depends non-trivially upon what decision I took.”

So my solution would be to admit that (a) we are not sure exactly what we mean when we use words like “me” in a universe/multiverse with lots of copies of the physical correlates of “me”, and (b) that our values dictate that even if we conclude with high probability that we are Boltzmann Brains, we ought to condition on the negation of that, because actions outputted to a random environment are pointless.

• ISTM the problem of Boltzmann brains is irrelevant to the 50%-ers. Presumably, the 50%-ers are rational—e.g., willing to update on statistical studies significant at p=0.05. So they don’t object to the statistics of the situation; they’re objecting to the concept of “creating a billion of you”, such that you don’t know which one you are. If you had offered to roll a billion-sided die to determine their fate (check your local tabletop-gaming store), there would be no disagreement.

Of course, this problem of identity and continuity has been hashed out on OB/​LW before. But the Boltzmann-brain hypothesis doesn’t require more than one of you—just a lot of other people, something the 50%-ers have no philosophical problem with. It’s a challenge for a solipsist, not a 50%-er.

• “Why did the universe seem to start from a condition of low entropy?”

I’m confused here. If we don’t go with a big universe and instead just say that our observable universe is the whole thing, then tracing back time we find that it began with a very small volume. While it’s true that such a system would necessarily have low entropy, that’s largely because small volume = not many different places to put things.

Alternative hypothesis: The universe began in a state of maximal entropy. This maximum value was “low” compared to present day because the early universe was small. As the universe expands, its maximum entropy grows. Its realized entropy also grows, just not as fast as its maximal entropy.

• BBs can’t make correct judgements about their reality. Their judgements are random. So 50 per cent of BBs think that they are in a non-random reality even if they are in a random one. So your experience doesn’t provide any information about whether you are a BB or not. Only the prior matters, and the prior is high.

• Their judgements are random. So 50 per cent of BBs think that they are in a non-random reality even if they are in a random one.

The quoted figure does not follow. Random, yes; but it’s not a coinflip. Given that a Boltzmann Brain can randomly appear with any set of memories, and given that the potential set of random universes is vastly larger than the potential set of non-random universes, I’d imagine that the odds of a randomly-selected Boltzmann Brain thinking it is in a non-random universe are pretty low...

• That would be true if BBs had time to think about their experiences and the ability to reach logical conclusions. But BBs’ opinions are also random.

• Hmmm. If the Boltzmann Brain has no time to think and update its own opinions from its own memory, then it is overwhelmingly likely that it has no opinion one way or another about whether or not it is in a random universe. In fact, it is overwhelmingly likely that it does not even understand the question, because its mindspace does not include the concepts of both “random” and “universe”...

• Of course most BBs don’t think about whether they are random or not. But within the subset of BBs that have thoughts about it (we can’t say they are thinking, as that is a longer process), the thoughts are random, and 50 per cent think that they are not random. So updating BB probabilities on experience is not strong, but I am still not afraid of being a BB, for two other reasons.

1. Any BB is a copy of a real observer, and so I am real. (This depends on how the identity problem is solved.)

2. BBs and real observers are not the dominant classes of observers. There is a third class: Boltzmann supercomputers which simulate our reality. They are medium-size fluctuations which are very effective at creating trillions of observer-moments which are rather consistent. But a small amount of randomness also exists in such simulated universes (it could be found experimentally). I hope to elaborate the idea in a long post soon.

• Found a similar idea in a recent article about Boltzmann Brains:

“What we can do, however, is recognize that it’s no way to go through life. The data that an observer just like us has access to includes not only our physical environment, but all of the (purported) memories and knowledge in our brains. In a randomly-fluctuating scenario, there’s no reason for this “knowledge” to have any correlation whatsoever with the world outside our immediate sensory reach. In particular, it’s overwhelmingly likely that everything we think we know about the laws of physics, and the cosmological model we have constructed that predicts we are likely to be random fluctuations, has randomly fluctuated into our heads. There is certainly no reason to trust that our knowledge is accurate, or that we have correctly deduced the predictions of this cosmological model.” https://arxiv.org/pdf/1702.00850.pdf

• As I said before about skeptical scenarios: you cannot refute them by argument, by definition, because the person arguing for the skeptical scenario will say, “since you are in this skeptical scenario, your argument is wrong no matter how convincing it seems to you.”

But we do not believe those scenarios, and that includes the Boltzmann Brain theory, because they are not useful for any purpose. In other words, if you are a Boltzmann Brain, you have no idea what would be good to do, and in fact according to the theory you cannot do anything because you will not exist one second from now.

• I don’t think that’s descriptively true at all. Regardless of whether or not I see a useful way to address it, I still wouldn’t expect to dissolve momentarily with no warning.

Now, this may be because humans can’t easily believe in novel claims. But “my” experience certainly seems more coherent than one would expect a BB’s to seem, and this calls out for explanation.

• A Boltzmann brain has no way to know anything, reason to any conclusion, or whatever. So it has no way to know whether its experience should seem coherent or not. So your claim that this needs explanation is an unjustified assumption, if you are a Boltzmann brain.

• One man’s modus ponens is another man’s modus tollens. I don’t even believe that you believe the conclusion.

• Which conclusion? I believe that a Boltzmann brain cannot validly believe or reason about anything, and I certainly believe that I am not a Boltzmann brain.

More importantly, I believe everything I said there.

• Seems like you’re using a confusing definition of “believe”, but the point is that I disagree about our reasons for rejecting the claim that you’re a BB.

Note that according to your reasoning, any theory which says you’re a BB must give us a uniform distribution for all possible experiences. So rationally coming to assign high probability to that theory seems nearly impossible if your experience is not actually random.

• My reason for rejecting the claim of BB is that the claim is useless—and I am quite sure that is my reason. I would definitely reject it for that reason even if I had an argument that seemed extremely convincing to me that there is a 95% chance I am a BB.

A theory that says I am a BB cannot assign a probability to anything, not even by giving a uniform distribution. A BB theory is like a theory that says, “you are always wrong.” You cannot get any probability assignments from that, since as soon as you bring them up, the theory will say your assignments are wrong. In a similar way, a BB theory implies that you have never learned or studied probability theory. So you do not know whether probabilities should sum to 100% (or to any similar normalized result) or anything else about probability theory.

As I said, BB theory is useless—and part of its uselessness is that it cannot imply any conclusions, not even any kind of prior over your experiences.

1. I’m using probability to represent personal uncertainty, and I am not a BB. So I think I can legitimately assign the theory a distribution to represent uncertainty, even if believing the theory would make me more uncertain than that. (Note that if we try to include radical logical uncertainty in the distribution, it’s hard to argue the numbers would change. If a uniform distribution “is wrong,” how would I know what I should be assigning high probability to?)

2. I don’t think you assign a 95% chance to being a BB, or even that you could do so without severe mental illness. Because for starters:

3. Humans who really believe their actions mean nothing don’t say, “I’ll just pretend that isn’t so.” They stop functioning. Perhaps you meant the bar is literally 5% for meaningful action, and if you thought it was 0.1% you’d stop typing?

4. I would agree if you’d said that evolution hardwired certain premises or approximate priors into us ‘because it was useful’ to evolution. I do not believe that humans can use the sort of pascalian reasoning you claim to use here, not when the issue is BB or not BB. Nor do I believe it is in any way necessary. (Also, the link doesn’t make this clear, but a true prior would need to include conditional probabilities under all theories being considered. Humans, too, start life with a sketch of conditional probabilities.)

• META: I made a comment in Discussion about the article and added my thoughts there on why it is not bad to be a BB. Maybe we could move the discussion there?

• If I wake up in a red room after the coin toss, I’m going to assume that there are a billion of us in red rooms and one in a green room (and vice versa if I wake up in a green room). That way a billion of me are assuming the truth, and one is not. So chances are (a billion out of a billion and one) that this iteration of me is assuming the truth.

We’ll each have to accept, of course, the possibility of being wrong, but hey, it’s still the best option for me altogether.

Tomorrow I’ll talk about what sort of trouble you run into if you reply “a billion to one”.

Trouble? We’ll take it on together, because every “I” is in this team. [applause]

• Eliezer_Yudkowsky wrote: “I want to reply, “But then most people don’t have experiences this ordered, so finding myself with an ordered experience is, on your hypothesis, very surprising.””

One would feel surprised by winning a million dollars in the lottery too, but that doesn’t mean it would be rational to assume, just because one won a million dollars in the lottery, that most people win a million dollars in the lottery.

Maybe most of us exist only for a fraction of a second, but in that case, what is there to lose by (probably falsely, but maybe maybe maybe correctly) assuming that we exist much longer than that, and living accordingly? There is potentially something to gain by assuming that, and nothing to lose, so it may very well be rational to assume that, even though it is very unlikely to be the case!

• How much resources should you devote to the next day vs. the next month vs. the next year? If each additional second of existence is a vast improbability, for simplicity you may assume a few moments of existence, but no longer.

If, OTOH, once you live, say, 3 seconds, it’s as likely as not that you’ll live a few more years—there’s some sort of bimodality—then such a stance is justified. Bimodality would only work if there were some sort of theoretical justification.

• If everything that can happen, happens (sooner or later) - which is assumed - there will be continuations (not necessarily at the same spot in spacetime, but somewhere) of whatever brief life I have for a few seconds or Planck times now, and continuations of those continuations too, and so on, without an end, meaning I’m immortal, given that identity is not dependent on the survival of any particular atoms (as opposed to patterns in which atoms, any atoms, are arranged, anywhere). This means that what I achieve during the short existences that are most common in the universe will only be parts of what I will have achieved in the long run, when all those short existences are “put together” (or thought of as one continuous life). Therefore, I should care about what my life will be like in a few years, in a few centuries, in a few googol years, et cetera, together, that is, my whole infinitely long future, more than I should care about any one short existence at any one place in spacetime. If I can maximize my overall happiness over my infinite life only by accepting a huge lot of suffering for a hundred years beginning now, I should do just that (if I’m a rational egoist).

My life may very well consist of predominantly extremely short-lived Boltzmann-brains, but I don’t die just because these Boltzmann-brains die off one by one at a terrific rate.

• I said “how much” not “if”. My point is that you should care vastly more about the next few seconds than a few years from now.

• I am a Boltzmann brain atheist. ;)

• Boltzmann brains are a problem even if you’re a 50 percenter. Many fixed models of physics produce lots of BB. Maybe you can solve this with a complexity prior, that BB are less real because they’re hard to locate. But having done this, it’s not clear to me how this interacts with Sleeping Beauty. It may well be that such a prior also favors worlds with fewer BB, that is, worlds with fewer observers, but more properly weighted observers.

(ETA: I read the post backwards, so that was a non sequitur, but I do think the application of anthropics to BB is not at all clear. I agree with Eliezer that it looks like it helps, but it might well make it worse.)

• Here’s a logic puzzle that may have some vague relevance to the topic.

You and two teammates are all going to be taken into separate rooms and have flags put on your heads. Each flag has a 50% chance of being black or being white. None of you can see what color your own flag is, but you will be told what color flags your two teammates are wearing. Before each of you leave your respective rooms, you may make a guess as to what color flag you yourself are wearing. If at least one of you guesses correctly and nobody guesses incorrectly, you all win. If anyone makes an incorrect guess, or if all three of you decide not to guess, you all lose.

If one of you guesses randomly and the other two choose not to guess, you have a 50% chance of winning. Even though it would seem that knowing what color your teammates’ flags are tells you nothing about your own, there is a way for your team to win this game more than half the time. How can it be done?

• My attempt at a solution: if you see two flags of the same color, guess the opposite color, otherwise don’t guess. This wins 75% of the time.

Lemma 1: it’s impossible that everyone chooses not to guess. Proof: some two people have the same color, because there are three people and only two colors.

Lemma 2: the chance of losing is 25%. Proof: by lemma 1, the team can only lose if someone guessed wrong, which implies all three colors are the same, which is 2 out of 8 possible assignments.
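The 75% figure can be checked by brute force. This is a minimal sketch with illustrative function names: it enumerates all 8 flag assignments and applies the strategy above (guess the opposite color if your two teammates match, otherwise pass).

```python
from itertools import product

COLORS = ("black", "white")


def guess(seen):
    """Strategy: if both teammates' flags match, guess the opposite color; else pass."""
    first, second = seen
    if first == second:
        return "white" if first == "black" else "black"
    return None  # no guess


def team_wins(flags):
    """Win iff at least one player guesses and every guess made is correct."""
    results = []
    for i in range(3):
        seen = tuple(flags[j] for j in range(3) if j != i)
        g = guess(seen)
        if g is not None:
            results.append(g == flags[i])
    return len(results) > 0 and all(results)


def win_rate():
    """Return (number of winning assignments, total assignments)."""
    assignments = list(product(COLORS, repeat=3))
    wins = sum(team_wins(flags) for flags in assignments)
    return wins, len(assignments)


if __name__ == "__main__":
    print(win_rate())
```

The enumeration gives 6 wins out of 8 equally likely assignments. The only losses are the two all-same-color cases, where all three players guess and all three are wrong.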

This leaves open the question of whether this strategy is optimal. I highly suspect it is, but don’t have a proof yet.

UPDATE: here’s a proof I just found on the Internet, it’s elegant but not easy to come up with. I wonder if there’s a simpler one.

• It’s a tricky category of question alright—you can make it even trickier by varying the procedure by which the copies are created.

The best answer I’ve come up with so far is to just maximize total utility. Thus, I choose the billion to one side because it maximizes the number of copies of me that hold true beliefs. I will be interested to see whether my procedure withstands your argument in the other direction.

(And of course there is the other complication that strictly speaking the probability of a logical coin is either zero or one, we just don’t know which. But even though such logical uncertainties are not strictly speaking matters of probability, it is sometimes most useful to treat them as such in a particular context.)

• Well, I don’t think the analogy holds up all that well. In the coin flip story we “know” that there was a time before the universe with two equally likely rules for the universe. In the world as it is, AFAIK we really don’t have a complete, internally consistent set of physical laws fully capable of explaining the universe as we experience it, let alone a complete set of all of them.

The idea that we live in some sort of low entropy bubble which spontaneously formed in a high entropy greater universe seems pretty implausible for the reasons you describe. But I don’t think we can come to a conclusion from this significantly stronger than “there’s a lot we haven’t figured out yet”.

• Current physics models get around that question anyways. The way our brains work, there is more entropy after a memory is burned than before. Thus, time seems to flow from low to high entropy to us. If entropy were flowing in another direction, then our brains would think of another direction as the past. The laws of thermodynamics are a side effect of how our brains process time.

Thus we can have low entropy → high entropy without a shit ton of Boltzmann Brains.

• The laws of thermodynamics arise in practically any reversible cellular automaton with a temperature—they are not to do with brains.

• The laws of thermodynamics arise in our analysis of practically any reversible cellular automaton with a temperature.

• This one always reminds me of flies repeatedly slamming their heads against a closed window rather than facing the fact that there is something fundamentally wrong with some of our unproven assumptions about thermodynamics and the big bang.

• ...care to explain further why we’re wrong?

• Do you really want to see the answer?

• I’d like to be the first to point out that this post doubles as a very long (and very undeserved) response to this post.

• 8 Sep 2009 15:13 UTC
−2 points

Non-scientific hypothesis: The universe’s initial state was a singularity as postulated by the big bang theory, a state of minimal entropy. As per thermodynamics, entropy has been, is, and will be increasing steadily from that point until precisely 10^40 years from the Big Bang, at which point the universe will cease to exist with no warning whatsoever.

Though this hypothesis is very arbitrary (the figure “10^40 years” has roughly 300 bits of entropy), I figure it explains our observations at least 300 bits better than the “vanilla heat death hypothesis”: The universe’s initial state was . . . a state of minimal entropy. It reaches maximal entropy very quickly compared to the amount of time it spends in maximal entropy, and as a result effectively has no order.

• Are past events a guide to the future—and if so, why?

That seems to be the topic here:

http://en.wikipedia.org/wiki/Problem_of_induction

• Regarding this post’s score (as of now, −3 points): was it really that harmful?

• was it really that harmful?

Yes, even more so the tendency to make like comments.

• FWIW, if punishment was intended, it is unlikely to be effective: I pretty much just ignore the Less Wrong karma system, partly on the grounds that critics should heed criticism the least.

• In most cases, my complaint is that your comments lack relevance or substance, which has nothing to do with disagreement.