Ethics in Many Worlds

There seems to be a strong case for the Everett or ‘many-worlds’ interpretation (MWI) of quantum mechanics. The world as revealed by MWI is so grotesquely superfluous — so offensively alien — that it’s natural to imagine similarly bizarre ethical implications. If MWI is true, I am constantly ‘splitting’ into an unfathomable number of people. Shouldn’t that make some difference to what I ought to do, some of the time?

Some suggestions have already been floated. For instance — if I take the wise option in this world, does that increase the chance of another ‘me’ taking the unwise option? Does that mean prudential behaviour is really selfish behaviour? And shouldn’t I happily take up the smallest incentive to play quantum Russian roulette, given that I know I’ll only ever experience winning? Doesn’t the virtually guaranteed safety of at least some ‘branches’ imply we should care less about mitigating risks with low probabilities and high stakes — particularly the risk of snuffing out the human flame, and our cosmic significance along with it? And so on. These are muddled thoughts, for the most part. Just remember Egan’s Law, Eliezer says: it all adds up to normality.

But I can’t help thinking there is a sense in which it doesn’t. This is because MWI seems to imply that many more minds will exist tomorrow than exist today — or in the next ten seconds, for that matter. So the effects of our actions on even the marginally further future dwarf the effects on the marginally nearer future in importance. And that often makes a practical difference to what we ought to do. The thought isn’t complicated, which makes me suspect I’ve missed something obvious.

To restate in a wordier way: if you’re a (total, aggregative) consequentialist, what determines the value of an act is its effect on the sum total of good and bad consequences. Just as some event is no less morally significant because it happened far away, so it seems wrong to discount consequences over time. And whether you decide to apply a discount rate for future consequences makes a material difference to which acts are best today. The branching nature of MWI seems to have the opposite effect of a social discount rate: it seems to imply that an outcome increases in importance as it recedes from the present moment into the future.


Take a toy model of MWI, where each ‘branch’ of the multiverse splits into ten discrete copies every 24 hours. On Day 0, there is just our world (along with many other neighboring but causally isolated branches). On Day 1, there will be ten copies of our world and every other neighboring world: ten times as many worlds as Day 0. On Day 10, there are 10^10 branches of branches… of the world in Day 0, all now causally isolated from one another.

Suppose on Day 0 I decide I’m due for a trip to the spa. Going to the spa is pure indulgence for me: I accrue no harms or benefits after the trip; nothing which could compound over time, no particularly fond memories or giddy anticipation. I can go today for a 5-hour session, or in 10 days’ time for a 2-hour session. On the day I choose not to go to the spa, I’ll be working, which I don’t find particularly enjoyable. I need to book now in any case, and both treatments cost the same. Which do I choose? Even without discounting future welfare, the clear common-sense answer is to book the treatment today: it costs no more and lasts an extra 3 hours. No-brainer. But on this toy version of MWI, booking the 2-hour treatment means 10^10 copies of me get to enjoy the spa treatment, and only one of me has to endure an extra day of work. Booking the 5-hour treatment means one of me gets 5 hours of spa time, and 10^10 of me endure an extra day of work. It’s not even close: choosing the 2-hour treatment is better by many, many orders of magnitude.
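The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration of the toy model, not of real MWI: the function name `total_hours`, the tenfold daily branching factor, and the choice to count Day 0 as a single copy are assumptions made for the sketch, and the hours spent working are left out because the example never says how long a work day is.

```python
def total_hours(hours_per_copy: float, day: int, branching_factor: int = 10) -> float:
    """Total hours experienced across all copies alive on `day`.

    Toy assumption: one copy on Day 0, multiplied by `branching_factor` each day.
    """
    copies = branching_factor ** day
    return hours_per_copy * copies


spa_today = total_hours(5, day=0)   # 5 hours for a single copy
spa_later = total_hours(2, day=10)  # 2 hours each for 10**10 copies

print(spa_today)              # 5
print(spa_later)              # 20000000000
print(spa_later / spa_today)  # 4000000000.0
```

On these assumptions the later, shorter session delivers about four billion times as much total spa time, before even counting the work days.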

The example can be run where we’re weighing between harms too. Suppose my dentist is due to receive some flashy new equipment next week which will cut the time for my procedure in half. I can receive the one-hour procedure today, or the half-hour procedure ten days from now. I don’t dread dentist trips but I’m no fan of them; I just want to minimise the amount of time in the chair, feeling vaguely scared and uncomfortable. Which procedure do I take? As before, common sense picks the option that looks better on its face: the procedure in ten days’ time. But on MWI, taking the half-hour procedure entails 5 × 10^9 hours of discomfort in total (half an hour for each of 10^10 copies). The hour-long procedure entails an hour of discomfort. The decision is again absolutely unequivocal: I should want to take the procedure today even if it had to last all day!
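To see how lopsided this is, one can ask how long today’s procedure would have to last before it matched the total discomfort of waiting. A rough sketch, again assuming tenfold daily branching and a single copy on Day 0 (`break_even_hours_today` is a hypothetical helper invented for this illustration):

```python
def break_even_hours_today(hours_later: float, days_until_later: int,
                           branching_factor: int = 10) -> float:
    """Longest procedure today that is no worse, in total hours, than waiting.

    Toy assumption: one copy today, `branching_factor` times more copies each day.
    """
    copies_later = branching_factor ** days_until_later
    return hours_later * copies_later


limit = break_even_hours_today(0.5, days_until_later=10)
print(limit)       # 5000000000.0
print(limit > 24)  # True
```

An all-day procedure today (24 hours) is nowhere near the break-even point in this toy model.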

Actually, these examples are potentially confusing insofar as they make it sound as if the question is about what becomes self-interestedly rational on MWI. Maybe MWI has similar implications with respect to self-interested rationality, but probably a full answer would involve thinking seriously about personal identity. Some quick thoughts: there’s a naïve view that says ‘I’ will follow one and only one ‘groove’ along the many paths traced out by the branches of the multiverse. There’s zero reason to think this is the case; it would involve some extra metaphysical paraphernalia in addition to MWI. Maybe instead I should weight the interests of a future copy of myself in inverse proportion to how many copies there are at that time. Or maybe I die a philosophical death at the moment of branching, so ‘self-interest’ is no guide to what counts as rational. In any case, the question of what is in my self-interest is a red herring, and probably more complicated than the question of what is morally best, which is what I’m interested in. Either imagine you’re an impartial altruist in the above examples, or else imagine you’re deciding on behalf of another person. For instance, do I surprise my partner with a shorter spa trip next week, or a longer one today? What matters is that, in this toy example, consequentialism recommends choosing the shorter and more temporally distant option when the world does branch, and the longer and temporally closer option when it doesn’t.

It should be clear how my terrible examples generalise: on MWI, the stakes get higher as time goes on. The upshot is that it might sometimes be morally best to make ‘sacrifices’ in the short-term to avoid exponentially multiplied amounts of harm, or do exponentially multiplied amounts of good, in the longer-term — sacrifices which seem bizarre and unjustifiable on a non-branching interpretation of QM.

Note that you do not need to be a card-carrying consequentialist to buy into this. By its nature, the exponential multiplication of good and bad consequences might just crowd out other considerations even if you initially place relatively little weight on total, aggregated consequences. It is true that an average (versus total) consequentialism would not carry these odd implications, because average consequentialism is not going to be sensitive to differences in the sum total of good or bad things over time insofar as those differences don’t affect the balance of good over bad things. Eliezer says something along these lines:

Average utilitarianism suddenly looks a lot more attractive—you don’t need to worry about creating as many people as possible, because there are already plenty of people exploring person-space.

This doesn’t strike me as a good reason to prefer average utilitarianism, and in any case AU has bizarre implications of its own, and seems to be fairly unpopular in the literature for that reason.

Note also that the strange implications of the toy example don’t hinge on whether you’re a hedonistic consequentialist. In other words, it doesn’t matter what you think makes an outcome good or bad, as long as ‘branching’ the world makes there be more of it. In saying “many more minds will exist tomorrow than exist today” I’m ignoring the possibility that non-psychological things might matter, but you’re welcome to think they do.

Lastly, the conclusion seems strong enough to force its way through empirical uncertainty about whether MWI is true. Like Pascal’s wager and mugging, the stakes are so much higher on MWI than on non-branching interpretations that the option favoured by MWI is almost always going to win out unless your credence in MWI is stupidly small.
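The Pascalian structure can be made concrete with a rough expected-value sketch. Everything here is an assumption for illustration: the credence values, the function name `expected_hours`, and the toy tenfold-per-day branching. It simply weights the branching and non-branching totals for the spa example by one’s credence in MWI.

```python
def expected_hours(hours: float, day: int, credence_mwi: float,
                   branching_factor: int = 10) -> float:
    """Credence-weighted total hours of a benefit delivered on `day`."""
    if_mwi = hours * branching_factor ** day  # toy branching world
    if_not = hours                            # single, non-branching world
    return credence_mwi * if_mwi + (1 - credence_mwi) * if_not


for credence in (0.5, 1e-3, 1e-9, 1e-12):
    today = expected_hours(5, day=0, credence_mwi=credence)
    later = expected_hours(2, day=10, credence_mwi=credence)
    print(credence, "later wins" if later > today else "today wins")
```

With these toy numbers, the 2-hour session in ten days only loses to the 5-hour session today once credence in MWI drops below roughly 1.5 × 10^-10.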

Reality Testing

I think there are two questions here: first, does the toy example imply the things I think it implies? Second, (how) do these implications carry over onto a sensible, modern understanding of MWI? I’ll say something about this second question now.

One major difference between the toy example and worked-out version of MWI is that the branching takes place far, far more often. To my lights, all this means is that the practical differences between common-sense and ‘Everettian’ ethics become all the greater — from one fraction of a second to the next, there are many, many orders of magnitude more branches of ‘me’.

Another major difference is that ‘branching’ or ‘splitting’ never literally occurs; these are cartoonish metaphors at best. DeWitt (1970) popularised the picture of definite events cleaving the universe into countable, fully discrete and non-interacting worlds, but modern approaches appeal instead to decoherence. To the best of my understanding: decoherence describes how interference terms are suppressed as quantum systems become embedded within sufficiently complex macroscopic systems, such that macroscopic systems can be treated classically to a very close approximation while their quantum constituents remain superposed as described by the wave function. This large-scale suppression of interference explains why we do not appear to observe or measure superposed states. Yet no clean ‘cut’ between classical and quantum objects is required. Rather, measurement apparatuses, and the ‘worlds’ they occupy, are understood as emergent macroscopic entities on par with cells, people, tables, and planets: explanatorily indispensable but not precisely definable. The ‘worlds’ that proliferate from one time to the next are therefore not countably discrete at all, nor are they perfectly causally isolated. They’re approximations, or abstractions, or patterns, like squinting at an apple and modeling it as a sphere.

Does this make a moral difference from the toy example? I wouldn’t have thought so. After all, people are imprecisely defined pattern-like things even in the (familiar, non-Everettian) world. And the argument hinges on there being more minds, or experiences, or people, or whatever else from one time to the next; but not that those things are always and precisely countable. However, I reached out to a philosopher of physics about this, who told me that they thought the conclusion I’m suggesting probably relies essentially on the ability to ‘count branches’. If instead you think there is a measure over branches, where the total measure of branches is constant over time, then probably things really do just add up to normality.
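The disagreement between ‘counting branches’ and ‘a constant measure over branches’ can be illustrated with the same toy numbers from the spa example. This sketch assumes, purely for illustration, that the branches on a given day each carry an equal share of a total measure fixed at 1; actual Everettian branch weights come from squared amplitudes and need not be equal. The function names are made up for the example.

```python
def value_by_counting(hours: float, day: int, branching_factor: int = 10) -> float:
    """Each branch counts fully, so total value grows with the number of branches."""
    return hours * branching_factor ** day


def value_by_measure(hours: float, day: int, branching_factor: int = 10) -> float:
    """Each branch is weighted by an equal share of a total measure fixed at 1."""
    branches = branching_factor ** day
    weight_per_branch = 1 / branches
    return hours * branches * weight_per_branch


# Counting: 2 hours on Day 10 beats 5 hours on Day 0.
print(value_by_counting(2, 10) > value_by_counting(5, 0))  # True
# Constant measure: the ordinary ranking returns (about 2 versus 5).
print(value_by_measure(2, 10) < value_by_measure(5, 0))    # True
```

Under the constant-measure weighting the extra branches cancel out exactly, which is the sense in which things would ‘add up to normality’.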

How else might the conclusion from the toy example fail to carry over to serious versions of MWI? I can think of some plausible and some less plausible-sounding reasons, although I don’t know enough to respond to them in any decisive way. One response points to another way in which the idea of new universes being ‘created’ by a process of ‘splitting’ is inaccurate: the many worlds have always existed, and measurement (broadly construed) just screens them off from interacting. This strongly suggests, to speak very loosely, that the minds split apart on Day 1 are already there on Day 0, ‘overlapping’ with but distinct from one another before they diverge. As such, rather than somehow becoming two people, observing e.g. the result of a Schrödinger’s cat experiment just reveals which world you were in all along. Picture this animation, or the image of the fraying end of a rope: all the threads being present in the un-frayed end. The upshot is that there isn’t radically more of anything that matters from one time to the next.

Another, similar, response says that my mind at an earlier time splits into many branches at a later time such that the amount of what matters is conserved by a kind of ‘dilution’. Suppose my mind splits into 10 between those two times. Well, the amount those 10 minds matter is reduced tenfold: perhaps each util or whatever is 10 times fainter, or less ‘real’. Probably this view can be drawn from an analogy to the fact that MWI does not violate energy (or mass) conservation. Presumably stuff that matters is made out of stuff, and if the amount of stuff is conserved across time, then (all other things being equal) the sum total of what matters must be conserved across time. I’m not sure anyone thinks this, but it’s clearly not the case that e.g. the value of a mental state is tethered to mass or energy, holding everything else constant. Is a whole-brain emulation running on lead components more valuable than a functionally equivalent emulation running on lighter materials?

More to the point, imagine constructing a conscious whole-brain emulation out of wires and relays, and then meticulously splitting each wire and component into two. Eventually, you have two identical constructions, each performing identical computations. If the original machine was conscious, then the resulting two machines are presumably conscious also. But were both ‘consciousnesses’ somehow residing in the original machine all along? Surely not. This leads to a reductio: supposing we can keep splitting each subsequent machine ten times over, the ‘there all along’ view says that the original machine was home to 2^10 = 1,024 distinct, non-interacting loci of consciousness all along. This sounds like nonsense to me. Better to think that consciousness has much more to do with (or just is) computation or function. This bears on the first objection I mentioned: that if you think ‘counting branches’ doesn’t make sense, and instead think that there’s a measure over branches which is constant in time, then you don’t get massive explosions in the amount of conscious experience or number of minds over time. But if a mind is more like a pattern, then can’t many ‘overlapping worlds’ add up to a single mind? The image I have in my head is of a dozen sheets of cellophane, all printed with the same image, being stacked on one another in alignment. Are there a dozen images, or a single image? If they make up a single image, and ‘overlapping’ worlds make a single mind, then the idea of an ‘explosion of value’ is back on the cards. In any case, the question is: before a measurement, do multiple distinct loci of conscious experience ‘overlap’ one another, or not? Or does this make as much sense as the notion that distinct minds must all be running in parallel on a single machine, because its wires can be split in two ten times? I suspect the answer requires a technical understanding of MWI which I don’t have.

A final way in which the toy example may not carry over into grown-up MWI is that MWI in fact involves infinities of various kinds. Specifically, if there are infinitely many worlds at an earlier time and infinitely many worlds at a later time, it might not be possible to claim that there are more worlds, or more of anything that matters, at the later time than at the earlier one, even if it seems intuitively obvious. I don’t really have any comments about this. Amanda Askell’s thesis might be relevant.


I’m suggesting it seems plausible that MWI implies a constant exponential explosion of value over time, viz. the amount or number of things that matter, and in particular the number of minds or amount of conscious experience. This in turn seems to imply what would look like an insanely steep positive ‘markup’ rate for future welfare (to someone who doesn’t buy into MWI). In this way, Egan’s law fails: it all adds up to much, much more than normality. It seems like the major objection is that this idea relies essentially on the ability to ‘count branches’, and is false if there’s a measure over branches which is constant in time. One way this could be true is if distinct, non-interacting minds ‘overlap’ before they ‘split’. But it’s unclear to me whether having a measure over branches actually precludes this constant explosion of value, considering how a kind of functionalism or computationalism about mental states suggests that value derived from conscious experience needn’t be conserved with ‘branching’ in the way that e.g. mass is.

I should finish by underlining just how unqualified I am to draw any trustworthy conclusions: my entire knowledge of MWI comes from getting halfway through a QM textbook, reading a couple of popular science books about MWI, and taking a class in the philosophy of QM. I expect I’m almost certainly wrong about this, but I don’t know enough to know why. Very curious to hear from people who know what they’re talking about! Thanks for reading.