“our intuition of identical copy immortality”
Speak for yourself—I have no such intuition.
Wei, the relationship between computing power and the probability rule is interesting, but doesn’t do much to explain Born’s rule.
In the context of a many worlds interpretation, which I have to assume you are using since you write of splitting, it is a mistake to work with probabilities directly. Because the sum is always normalized to 1, probabilities deal (in part) with global information about the multiverse, but people easily forget that and think of them as local. The proper quantity to use is measure, which is the amount of consciousness that each type of observer has, such that effective probability is proportional to measure (by summing over the branches and normalizing). It is important to remember that total measure need not be conserved as a function of time.
So for the Ebborian example, if measure is proportional to the thickness squared, the fact that the probability of a slice can go up or down, depending purely on what happens to other slices that it otherwise would have nothing to do with, is neither surprising nor counterintuitive. The measure, of course, would not be affected by what the other slices do. It is just like saying that if the population of China were to increase, and other countries had constant population, then the effective probability that a typical person is American would decrease.
The second point is that, even supposing that quantum computers could solve hard math problems in polynomial time, your claim that intelligence would have little evolutionary value is both utterly far-fetched (quantum computers are hard to make, and nonlinear ones could be even harder) and irrelevant if we believe—as typical Everettians do—that the Born rule is not a separate rule but must follow from the wave equation. Even supposing intelligence required the Born rule, that would just tell us that the Born rule is true—but we already know that. The question is, why would it follow from the wave equation? If the Born rule is a separate rule, that suggests dualism or hidden variables, which bring in other possibilities for probability rules.
Actually there are already many other possibilities for probability rules. A lot of people, when trying to derive the Born rule, start out assuming that probabilities depend only on branch amplitudes. We know that seems true, but not why, so we can’t start out assuming it. For example, probabilities could have been proportional to brain size.
These issues are discussed in my eprints, e.g. Decision Theory is a Red Herring for the Many Worlds Interpretation http://arxiv.org/abs/0808.2415
Academician, what you are explicitly not saying is that the aspects of reality that give rise to consciousness can be described mathematically. Well, parts of your post seem to imply that the mathematically describable functions are what matter, but other parts deny it. So it’s confusing, rather than enlightening. But I’ll take you at your word that you are not just a reductionist.
So you are a “monist” but, as David Chalmers has described such positions, in the spirit of dualism. As far as I am concerned, you are a dualist, because the only interesting distinction I see is between mathematically describable reality vs. non-MD reality—and your “monism” has aspects of both.
Your argument seems to be that monism is simpler than dualism, so Occam’s Razor prefers it, so we should believe it. Hence, you define the stuff the world is made of as “whatever I am” and call it one kind of stuff.
I don’t see that as a useful approach, because what I want to know is whether MD stuff is enough, or whether we need something more, where ‘something more’ is explicitly mental-related. Remember, we want the simplest explanation that fits the evidence. So the question reduces to “Does an MD-only world fit the evidence from subjective experience?” That’s a hard question.
I am planning to write a post on the hard problem at some point, which I’ll post on my blog and here.
I agree that a claim of sound reasoning methodology is easy to fake, and the writer could easily be mistaken. So it’s very weak evidence. However, it’s not no evidence: if the writer had said “my belief in X is based on faith”, that would probably decrease your trust in his conclusions compared to those of someone who made no claims about his methods.
Ata, there are many things wrong with your ideas. (Hopefully saying that doesn’t put you off—you want to become less wrong, I assume.)
it is more difficult to get to the point where it actually seems convincing and intuitively correct, until you independently invent it for yourself
I have indeed independently invented the “all math exists” idea myself, years ago. I used to believe it was almost certainly true. I have since downgraded its likelihood of being true to more like 50% as it has intractable problems.
If it saved a copy of the universe at the beginning of your life and repeatedly ran the simulation from there until your death (if any), would it mean anything to say that you are experiencing your life multiple times?
Of course. (Well, it might be better to say that multiple guys like you are experiencing their own lives.)
Otherwise, it would mean that all types of people have the same measure of consciousness. Thus, for example, the fact that people who seem to be products of Darwinian evolution are more numerous would mean nothing—they are more numerous in terms of copies, not in terms of types, so the typical observer would not be one. So more copies = more measure. A similar argument applies to high measure terms in the quantum wavefunction. None of these considerations change if we assume that all math structures exist.
how about if we’re being simulated by zero computers?
You assume that this would make no difference to our consciousness, but you don’t actually present any argument for that. You just assert it in the post. So I would have to say that your argument—being nonexistent—has zero credibility. That doesn’t mean that your conclusion must be false, just that your argument provides no evidence in favor of it. The measure argument shows that your conclusion is false—though with the caveat that Platonic computers might count as real enough to simulate us. So let’s continue.
By Occam’s Razor, I conclude that if a universe can exist in this way — as one giant subjunctive — then we must accept that that is how and why our universe does exist
So you are abandoning the question of “Why does anything exist?” in favor of just accepting that it does, which is what you warned against doing in the first place.
If all math must exist in a strong Platonic sense, then obviously, it does. If it merely can so exist as far as we know, or OTOH might not, then we have no answer as to why anything exists. “Nothing exists” would seem to be the simplest thing that might have been true, if we had no evidence otherwise.
That said, “everything exists” is prima facie simpler than “something exists”, so, given that at least something exists, Occam’s Razor suggests that everything exists. Hence my interest in it.
There’s a problem, though.
If every possible mathematical structure is real in the same way that this universe is, then isn’t there only an infinitesimal probability that this universe will turn out to be ruled entirely by simple regularities?
Good question. There is an argument based on Turing machines that the simplest programs (i.e. laws of physics) have more measure, because a random string is more likely to have a short segment at the beginning that works well and then a random section of ‘don’t care’ bits, as opposed to needing a long string that all works as part of the program. So if we run all TM programs Platonically, simpler “laws of physics” have more measure, possibly resulting in universes like ours being typical. Great, right?
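The counting argument above can be illustrated with a toy sketch in Python (illustrative only; the function name and parameters are mine, not from any source). If a “law of physics” is fixed by the first k bits of a program and the remaining bits are don’t-cares, then among all length-n bitstrings the fraction implementing that law is 2^-k, so shorter effective programs get exponentially more measure:

```python
import itertools

def measure_of_prefix(prefix, n):
    """Fraction of all length-n bitstrings that begin with `prefix`.

    This is the 'measure' a program gets if only its first len(prefix)
    bits matter and the rest are don't-care bits."""
    count = sum(1 for bits in itertools.product("01", repeat=n)
                if "".join(bits).startswith(prefix))
    return count / 2 ** n

# A 3-bit effective program has 4x the measure of a 5-bit one:
print(measure_of_prefix("101", 10))    # 0.125   (= 2**-3)
print(measure_of_prefix("10110", 10))  # 0.03125 (= 2**-5)
```

The ratio of measures depends only on the difference in effective program lengths, not on n, which is why the argument survives taking n to infinity.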
But there are problems with this. First, there are many possible TMs that could run such programs. We need to choose one—but such a choice contradicts the “inevitable” nature that Platonism is supposed to have. So why not just use all of them? There are infinitely many, so there is no unique measure to use for them. Any choice we can make of how to run them all is inevitably arbitrary, and thus we are back to “something” rather than “everything”. We can have a very “big” something, since all programs do run, but it’s still something—some nonzero information that pure math doesn’t know anything about.
That’s just TMs, but there’s no reason other types of math structures such as continuous functions shouldn’t exist, and we don’t even have the equivalent of a TM to put a measure distribution on them.
I don’t know for sure that there isn’t some natural measure, but if there is, I don’t think we can know about it. Maybe I’m overlooking some selection effect that makes things work without arbitrariness.
OK, so suppose we ignore the arbitrariness problem. The resulting ‘everything’ might not be Platonism, but at least it would be a high-level and fairly simple theory of physics. Does the TM measure in fact predict a universe like ours?
I don’t know. In practice, as long as a fairly simple TM is selected, the differences resulting from the choice of TM are negligible. But we still have the Boltzmann brain question. I don’t know whether a BB would be typical in such an ensemble or not. At least that is a question that can be studied mathematically.
It’s not a Newcomb problem. It’s a problem of how much his promises mean.
Either he created a large enough cost to leaving if he is unhappy (namely, having to break his promise) to justify his belief that he won’t leave; or he did not. If he did, he doesn’t have the option to “take both” and get the utility from both, because that would incur the cost. (Breaking his promise would have negative utility to him in and of itself.) It sounds like that’s what ended up happening. If he did not, he doesn’t have the option to propose sincerely, since he knows it’s not true that he will surely not leave.
Your first argument seems to say that if someone simulated universe A a thousand times and then simulated universe B once, and you knew only that you were in one of those simulations, then you’d expect to be in universe A.
That’s right, Nisan (all else being equal, such as A and B having the same # of observers).
I don’t see why your prior should assign equal probabilities to all instances of simulation rather than assigning equal probabilities to all computationally distinct simulations.
In the latter case, at least in a large enough universe (or quantum MWI, or the Everything), the prior probability of being a Boltzmann brain (not product of Darwinian evolution) would be nearly 1, since most distinct brain types are. We are not BBs (perhaps not prior info, but certainly info we have) so we must reject that method.
What if you run a simulation of universe A on a computer whose memory is mirrored a thousand times on back-up hard disks? … Does this count as a thousand copies of you?
No. That is not a case of independent implementations, so it just has the measure of a single A.
As for wavefunction amplitudes, I don’t see why that should have anything to do with the number of instantiations of a simulation.
A similar argument applies: more amplitude means more measure, or we would probably be BBs. Also, in the Turing machine version of the Tegmarkian everything, more measure could only be explained by more copies.
For an argument that even in the regular MWI, more amplitude means more implementations (copies), as well as discussion of what exactly counts as an implementation of a computation, see my paper
Interesting. Do you know of a place on the net where I can see what other (independent, mathematically knowledgeable) people have to say about its implications? It may be asking a lot, but if such a place exists, that would be the most efficient way for me to gain info about it.
rwallace, nice reductio ad absurdum of what I will call the Subjective Probability Anticipation Fallacy (SPAF). It is somewhat important because the SPAF seems much like, and may be the cause of, the Quantum Immortality Fallacy (QIF).
You are on the right track. What you are missing, though, is an account of how to deal properly with anthropic reasoning, probability, and decisions. For that, see my paper on the ‘Quantum Immortality’ fallacy. I also explain it concisely on my blog, in “Meaning of Probability in an MWI”.
Basically, personal identity is not fundamental. For practical purposes, there are various kinds of effective probabilities. There is no actual randomness involved.
It is a mistake to work with ‘probabilities’ directly. Because the sum is always normalized to 1, ‘probabilities’ deal (in part) with global information, but people easily forget that and think of them as local. The proper quantity to use is measure, which is the amount of consciousness that each type of observer has, such that effective probability is proportional to measure (by summing over the branches and normalizing). It is important to remember that total measure need not be conserved as a function of time.
As for the bottom line: If there are 100 copies, they all have equal measure, and for all practical purposes have equal effective probability.
A—A hundred people are created in a hundred rooms. Room 1 has a red door (on the outside), the outsides of all other doors are blue. You wake up in a room, fully aware of these facts; what probability should you put on being inside a room with a blue door?
Here, the probability is certainly 99%.
Sure.
B—same as before, but an hour after you wake up, it is announced that a coin will be flipped, and if it comes up heads, the guy behind the red door will be killed, and if it comes up tails, everyone behind a blue door will be killed. A few minutes later, it is announced that whoever was to be killed has been killed. What are your odds of being blue-doored now?
There should be no difference from A; since your odds of dying are exactly fifty-fifty whether you are blue-doored or red-doored, your probability estimate should not change upon being updated.
Wrong. Your epistemic situation is no longer the same after the announcement.
In a single-run (one-small-world) scenario, the coin has a 50% chance to come up heads or tails. (In a MWI or large universe with similar situations, it would come up both ways, which changes the results. The MWI predictions match yours, but they don’t back the SIA.) Here I assume the single-run case.
The prior for the coin result is 0.5 for heads, 0.5 for tails.
Before the killing, P(red|heads) = P(red|tails) = 0.01 and P(blue|heads) = P(blue|tails) = 0.99. So far we agree.
P(red|before) = 0.5 (0.01) + 0.5 (0.01) = 0.01
Afterwards, P’(red|heads) = 0, P’(red|tails) = 1, P’(blue|heads) = 1, P’(blue|tails) = 0.
P(red|after) = 0.5 (0) + 0.5 (1) = 0.5
So after the killing, you should expect either color door to be 50% likely.
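The single-shot calculation above is short enough to check numerically (a minimal sketch; the variable names are mine, not from the original comment). After the killing, each coin outcome leaves only one door color alive, so the posterior on door color just equals the prior on the coin:

```python
# One-shot case: heads kills the red-door observer, tails kills the blues.
p_heads, p_tails = 0.5, 0.5

# After the killing, your door color is fully determined by the outcome:
p_red_given_heads = 0.0  # heads: red observer is dead, survivors are blue
p_red_given_tails = 1.0  # tails: blue observers are dead, survivor is red

p_red_after = p_heads * p_red_given_heads + p_tails * p_red_given_tails
print(p_red_after)  # 0.5
```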
This, of course, is exactly what the SIA denies. The SIA is obviously false.
So why does the result seem counterintuitive? Because in practice, and certainly when we evolved and were trained, single-shot situations didn’t occur.
So let’s look at the MWI case. Heads and tails both occur, but each with 50% of the original measure.
Before the killing, we again have P(heads) = P(tails) = 0.5
and P(red|heads) = P(red|tails) = 0.01 and P(blue|heads) = P(blue|tails) = 0.99.
Afterwards, P’(red|heads) = 0, P’(red|tails) = 1, P’(blue|heads) = 1, P’(blue|tails) = 0.
Huh? Didn’t I say it was different? It sure is, because afterwards, we no longer have P(heads) = P(tails) = 0.5. On the contrary, most of the conscious measure (# of people) now resides behind the blue doors. We now have for the effective probabilities P(heads) = 0.99, P(tails) = 0.01.
P(red|after) = 0.99 (0) + 0.01 (1) = 0.01
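The MWI version above can be checked the same way (a sketch, with names of my own choosing): the effective probabilities of the branches are weighted by surviving observer measure, 99 blue observers on the heads branch versus 1 red observer on the tails branch:

```python
# MWI case: both branches exist, weighted by surviving observer measure.
measure_heads = 99  # blue-door survivors on the heads branch
measure_tails = 1   # red-door survivor on the tails branch
total = measure_heads + measure_tails

p_heads_eff = measure_heads / total  # 0.99
p_tails_eff = measure_tails / total  # 0.01

# On heads you are certainly blue; on tails certainly red.
p_red_after = p_heads_eff * 0.0 + p_tails_eff * 1.0
print(p_red_after)  # 0.01
```

The intuitively expected 0.01 is recovered precisely because observer counting across branches is legitimate when both branches actually exist.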
That kind of anthropic reasoning is only useful in the context of comparing hypotheses, Bayesian style. Conditional probabilities matter only if they are different given different models.
For most possible models of physics, e.g. X and Y, P(Finn|X) = P(Finn|Y). Thus, that particular piece of info is not very useful for distinguishing models for physics.
OTOH, P(21st century|X) may be >> P(21st century|Y). So anthropic reasoning is useful in that case.
As for the reference class, “people asking these kinds of questions” is probably the best choice. Thus I wouldn’t put any stock in the idea that animals aren’t conscious.
Another reason I wouldn’t put any stock in the idea that animals aren’t conscious is that the complexity cost of a model in we are and they (other animals with complex brains) are not is many bits of information. 20 bits gives a prior probability factor of 10^-6 (2^-20). I’d say that would outweigh the larger # of animals, even if you were to include the animals in the reference class.
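The 20-bit figure above is just arithmetic, sketched here for concreteness (assuming nothing beyond what the paragraph states): each bit of extra model complexity halves the prior, so a model 20 bits more complex pays a factor of about 10^-6.

```python
# Each extra bit of model complexity halves the prior probability.
bits = 20
prior_factor = 2.0 ** -bits
print(prior_factor)  # about 9.5e-07, i.e. roughly 10**-6
```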
I am very skeptical about SIA
Rightly so, since the SIA is false.
The Doomsday argument is correct as far as it goes, though in my view the most likely filter is environmental degradation plus AI running into problems.
Sounds cool. I’m from NYC, but no longer live there. I was a member of atheist clubs in college, but I’d bet that post-college (or any, really) rationalists have a hard time meeting others of similar views.
the justification for reasoning anthropically is that the set Ω of observers in your reference class maximizes its combined winnings on bets if all members of Ω reason anthropically
That is a justification for it, yes.
When most of the members of Ω arise from merely non-actual possible worlds, this reasoning is defensible.
Roko, on what do you base that statement? Non-actual observers do not participate in bets.
The SIA is not an example of anthropic reasoning; anthropic implies observers, not “non-actual observers”.
See this post for an example of the difference, showing why the SIA is false.
No
Why do I get the feeling you’re shouting, Academician? Let’s not get into that kind of contest. Now here’s why you’re wrong:
P(red|before) = 0.01 is not equal to P(red).
P(red) would be the probability of being in a red room given no information about whether the killing has occurred; i.e. no information about what time it is.
The killing is not just an information update; it’s a change in the # and proportions of observers.
Since (as I proved) P(red|after) = 0.5, while P(red|before) = 0.01, that means that P(red) will depend on how much time there is before as compared to after.
That also means that P(after) depends on the amount of time before as compared to after. That should be fairly clear. Without any killings or change in # of observers, if there is twice as much time after an event X than before, then P(after X) = 2⁄3. That’s the fraction of observer-moments that are after X.
Cupholder:
That is an excellent illustration … of the many-worlds (or many-trials) case. Frequentist counting works fine for repeated situations.
The one-shot case requires Bayesian thinking, not frequentist. The answer I gave is the correct one, because observers do not gain any information about whether the coin was heads or tails. The number of observers that see each result is not the same, but the only observers that actually see any result afterwards are the ones in either heads-world or tails-world; you can’t count them all as if they all exist.
It would probably be easier for you to understand an equivalent situation: instead of a coin flip, we will use the 1 millionth digit of pi in binary notation. There is only one actual answer, but assume we don’t have the math skills and resources to calculate it, so we use Bayesian subjective probability.
I omitted the “|before” for brevity, as is customary in Bayes’ theorem.
That is not correct. The prior that is customary in using Bayes’ theorem is the one which applies in the absence of additional information, not before an event that changes the numbers of observers.
For example, suppose we know that x=1,2,or 3. Our prior assigns 1⁄3 probability to each, so P(1) = 1⁄3. Then we find out “x is odd”, so we update, getting P(1|odd) = 1⁄2. That is the standard use of Bayes’ theorem, in which only our information changes.
OTOH, suppose that before time T there are 99 red door observers and 1 blue door one, and after time T, there is 1 red door observer and 99 blue door ones. Suppose also that there is the same amount of lifetime before and after T. If we don’t know what time it is, clearly P(red) = 1⁄2. That’s what P(red) means. If we know that it’s before T, and update on that info, we get P(red|before) = 0.99.
Note the distinction: “before an event” is not the same thing as “in the absence of information”. In practice, often it is equivalent because we only learn info about the outcome after the event and because the number of observers stays constant. That makes it easy for people to get confused in cases where that no longer applies.
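The time-weighted counting two paragraphs up can be sketched numerically (an illustration with names of my own choosing, not from the original comment): observer-moments are weighted by the duration of each era, then normalized.

```python
# 99 red / 1 blue observers before time T; 1 red / 99 blue after.
t_before, t_after = 1.0, 1.0  # equal lifetime before and after T

red_measure = 99 * t_before + 1 * t_after    # red-door observer-moments
blue_measure = 1 * t_before + 99 * t_after   # blue-door observer-moments

p_red = red_measure / (red_measure + blue_measure)
print(p_red)  # 0.5, given no information about the time

# Updating on "it is before T":
p_red_before = 99 / (99 + 1)
print(p_red_before)  # 0.99
```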
Now, suppose we ask a different question. As in the case we were considering, the coin will be flipped and red or blue door observers will be killed; and it’s a one-shot deal. But now, there will be a time delay after the coin has been flipped but before any observers are killed. Suppose we know that we are observers after the flip but before the killing.
During this time, what is P(red|after flip & before killing)? In this case, all 100 observers are still alive, so there are 99 blue door ones and 1 red door one, so it is 0.01. That case presents no problems for your intuition, because it doesn’t involve changes in the #’s of observers. It’s what you get with just an info update.
Then the killing occurs. Either 1 red observer is killed, or 99 blue observers are killed. Either outcome is equally likely.
In the actual resulting world, there is only one kind of observer left, so we can’t do an observer count to find the probabilities like we could in the many-worlds case (and as cupholder’s diagram would suggest). Whichever kind of observer is left, you can only be that kind, so you learn nothing about what the coin result was.
Actually, if we consider that you could have been an observer-moment either before or after the killing, finding yourself to be after it does increase your subjective probability that fewer observers were killed. However, this effect goes away if the amount of time before the killing was very short compared to the time afterwards, since you’d probably find yourself afterwards in either case; and the case we’re really interested in, the SIA, is the limit when the time before goes to 0.
Actually, if we consider that you could have been an observer-moment either before or after the killing, finding yourself to be after it does increase your subjective probability that fewer observers were killed. However, this effect goes away if the amount of time before the killing was very short compared to the time afterwards, since you’d probably find yourself afterwards in either case; and the case we’re really interested in, the SIA, is the limit when the time before goes to 0.
I just wanted to follow up on this remark I made. There is a subtle anthropic selection effect that I didn’t include in my original analysis. As we will see, the result I derived applies if the time after is long enough, as in the SIA limit.
Let the amount of time before the killing be T1, and after (until all observers die), T2. So if there were no killing, P(after) = T2/(T2+T1). It is the ratio of the total measure of observer-moments after the killing divided by the total (after + before).
If the 1 red observer is killed (heads), then P(after|heads) = 99 T2 / (99 T2 + 100 T1)
If the 99 blue observers are killed (tails), then P(after|tails) = 1 T2 / (1 T2 + 100 T1)
P(after) = P(after|heads) P(heads) + P(after|tails) P(tails)
For example, if T1 = T2, we get P(after|heads) = 0.497, P(after|tails) = 0.0099, and P(after) = 0.497 (0.5) + 0.0099 (0.5) = 0.254
So here P(tails|after) = P(after|tails) P(tails) / P(after) = 0.0099 (.5) / (0.254) = 0.0195, or about 2%. So here we can be 98% confident to be blue observers if we are after the killing. Note, it is not 99%.
Now, in the relevant-to-SIA limit T2 >> T1, we get P(after|heads) ~ 1, P(after|tails) ~1, and P(after) ~1.
In this limit P(tails|after) = P(after|tails) P(tails) / P(after) ~ P(tails) = 0.5
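The computation above can be packaged as a short function of T1 and T2 (a sketch; the function name is mine), which reproduces both the T1 = T2 case and the SIA limit:

```python
def p_tails_given_after(t1, t2):
    """Posterior probability of tails given survival to after the killing.

    t1: time before the killing; t2: time after (until all observers die).
    Heads kills the 1 red observer; tails kills the 99 blue observers."""
    p_after_heads = 99 * t2 / (99 * t2 + 100 * t1)
    p_after_tails = 1 * t2 / (1 * t2 + 100 * t1)
    p_after = 0.5 * p_after_heads + 0.5 * p_after_tails
    return 0.5 * p_after_tails / p_after

print(round(p_tails_given_after(1, 1), 4))    # 0.0195, about 2%
print(round(p_tails_given_after(1, 1e9), 4))  # ~0.5 in the SIA limit T2 >> T1
```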
So the SIA is false.
Supposedly “we get the intuition that in a copying scenario, killing all but one of the copies simply shifts the route that my worldline of conscious experience takes from one copy to another”? That, of course, is a completely wrong intuition which I feel no attraction to whatsoever. Killing one does nothing to increase consciousness in the others.
See “Many-Worlds Interpretations Can Not Imply ‘Quantum Immortality’”
http://arxiv.org/abs/0902.0187