# A Plausible Entropic Decision Procedure for Many Worlds Living, Round 2

Hey LessWrong! About a month ago I posted about a decision procedure that I think could be optimal in a universe where the Many Worlds Interpretation is true. The post was downvoted to zero, and several people thought there were problems with it. However, none of the comments convinced me that the idea has been falsified, so I clarified it, rewrote the post, and am interested again in your feedback. It is cross-posted from my blog here.

## Epistemic Status

While this idea seems logically coherent to me given my conception of MWI, my prior is that it is more likely false than not: I'm a layperson, and it's one of the first ideas that came to me when thinking about decision theory under MWI.

I slowly lead into explaining my proposal because I think understanding the context of the problem will make my idea more intuitive. So let me begin:

## The Problem

In any binary decision problem with options A and B, we want to use the available evidence in some decision procedure to decide which option to take. Traditionally, the idea is that when it comes time to make a decision, an agent simply ought to choose whichever option looks more choice-worthy (has higher expected utility). If their computations and cost-benefit analysis show that A looks even slightly better than B, they should go with A every single time.

However, if Many Worlds is true, I think that making decisions in this fashion ‘damns’ the copies of this agent in nearly all the nearby child worlds to making the same decision. By nearby child worlds, I mean the sister worlds that have recently branched off from a common Everett branch. For example, suppose Bob is trying to decide whether to go left or right at an intersection. In the moments while he is deciding, many nearly identical copies of him in nearly identical scenarios are created. They are almost entirely the same, and if one Bob decides to go left, one can assume that 99%+ of Bobs make the same decision. This is fine if going left was the best decision, but what if it’s not? If he happens to subsequently be killed by a mountain lion who was hiding in the bushes, nearly all the Bobs in all the worlds created since he approached the intersection are now dead. Those lives were all ethically-valuable and equally worth living, and all those future child worlds are going to miss out on the positive influences of Bob.

Since going left was a relatively arbitrary decision based on the sum of evidence Bob had available, wouldn’t it have been nice if only half of the Bobs created since he first approached the intersection went left, and the other half went right? Then, in case there are any mountain lions or other threats, Bob is still alive in at least half of future child worlds.

If going left seemed a good bit more worthy than going right (but not overwhelmingly), perhaps it would be optimal if 80% of Bobs went left, and only 20% went right.

## Ineffective Solutions

If this diversification of future child worlds is optimal, how can Bob coordinate with the other recently created Bobs to diversify outcomes by making different decisions? He can’t simply choose to feel very uncertain, or subjectively try to feel like doing either one. Many trillions of quantum events occur in the brain all the time, yet the brain produces comparatively few high-level decisions, so it is robust against most individual quantum phenomena. In most nearby child worlds, Bob makes the same decision, especially when presented with nearly identical sensory stimuli.

Back to how Bob can coordinate with the other ‘copies’ of himself: he cannot even diversify outcomes by flipping a coin, going left if heads and right if tails. In most recently created child worlds, the flip of the coin is too robust against individual quantum events and lands the same way each time. If he employs the coin method, nearly all the recently created copies of Bob see the same coin flip and then make the same decision.

## My Proposed Solution

The only method that I believe works is to look at individual quantum phenomena that have a particular probability of occurring, such as the radioactive decay of an atom. If Bob has a Geiger counter, he can employ an algorithm that yields a 0 or a 1 bit with 50% probability, based on whether the time between two consecutive measured radioactive decays is greater than the time between the next two consecutive decays. If Bob formed the intent to employ this decision procedure as he approached the intersection, nearly all of the Bobs that split off from him also intend to employ it.
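As a sketch, the decay-timing scheme might look like the following Python. The function name and the three-timestamp reading of "consecutive decays" are my own illustrative assumptions, not a tested instrument design:

```python
def bit_from_decays(t0, t1, t2):
    """Turn three consecutive decay timestamps into one bit.

    Compares the gap between the first two decays with the gap between
    the next two. For a steady radioactive source the gaps are i.i.d.
    exponential, so each ordering is equally likely and the bit is
    unbiased. Returns None on the (measure-zero) tie, meaning
    'discard and wait for more decays'.
    """
    gap_a = t1 - t0
    gap_b = t2 - t1
    if gap_a == gap_b:
        return None
    return 1 if gap_a > gap_b else 0
```

Bob would feed this function fresh timestamps from his Geiger counter; a bit derived this way is anchored to genuinely quantum events, so the copies of Bob really do split roughly 50/50 on its value.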

When it finally comes time for all the copies of him to make a decision, and he happens to feel that going left and going right are equally choice-worthy, he can mentally commit to a decision procedure: going left if his true-random number is a 0 and right if it is a 1. Then, when he looks at whether his true-random number generator yields a 0 or a 1, nearly half of the copies of him see a 0 and the other half see a 1. When nearly all the copies of him commit to the decision procedure, 50% of them go left and 50% go right. Thus, we get a diversity of outcomes, which is intuitively a good thing, although I hope to prove this later.

## Q: What if an agent Charlie is 80% sure that option A is superior to option B?

In that case, Charlie should seek out a random number that is uniformly distributed between 1 and 5, inclusive, and go with option A if the generated number is anything but a 5. He can do this by seeking out three quantum-generated bits, which form a 3-bit binary number. As long as the number is between 1 and 5, Charlie can use it; otherwise, he must discard it and draw again. Then, Charlie bases his decision on the first number that qualifies: 001, 010, 011, 100, or 101. Of note, he only has to discard 3 of the 8 possible numbers (37.5%) with this scheme.
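A minimal sketch of Charlie's rejection scheme, with `quantum_bit` standing in for a true quantum source (here faked with Python's `secrets` module purely for illustration):

```python
import secrets

def quantum_bit():
    # Stand-in only: a real run would use fresh quantum-generated bits,
    # e.g. from the Geiger-counter scheme described earlier.
    return secrets.randbelow(2)

def charlie_decides():
    """Draw 3 bits -> a number 0..7; keep only 1..5, which is uniform
    over five outcomes; pick A on 1-4 (probability 4/5) and B on 5."""
    while True:
        value = (quantum_bit() << 2) | (quantum_bit() << 1) | quantum_bit()
        if 1 <= value <= 5:
            return "A" if value != 5 else "B"
```

Over many runs, roughly 80% of draws come back "A", matching Charlie's 80% credence, and the loop discards 3 of every 8 draws on average.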

For more precise choice-worthiness estimates, one can round one’s subjective probability estimates to arbitrary precision, or employ other encoding schemes.
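One such encoding scheme (my suggestion, not from the post) compares a stream of fair quantum bits against the binary expansion of the target probability. It handles arbitrary precision while consuming only two bits on average; again, `secrets` is just a stand-in for a true quantum bit source:

```python
import secrets

def quantum_bernoulli(p, random_bit=lambda: secrets.randbelow(2)):
    """Return True with probability p using a stream of fair bits.

    Walks the binary expansion of p: each loop doubles p and compares
    its leading binary digit with one fresh random bit, stopping as
    soon as the comparison is decisive. Expected bits consumed: 2,
    regardless of how many digits p has.
    """
    while True:
        p *= 2
        bit = random_bit()
        if p >= 1:
            if bit == 0:
                return True   # random stream fell below p's expansion
            p -= 1            # digits matched so far; keep going
        else:
            if bit == 1:
                return False  # random stream exceeded p's expansion
```

An agent who is, say, 73.2% confident in option A would call `quantum_bernoulli(0.732)` and go with A on `True`, no rounding to round fractions required.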

## Q: But under Many Worlds, we have enough diversity of outcomes that we do not need to worry about deliberately diversifying them, right?

No, I do not think we end up with a proper diversification of outcomes when we apply traditional decision procedures. Not everything that one can imagine happening does happen. I can imagine jumping out the window right now, but it’s entirely possible that I do this in no child worlds, especially over a finite time period like the next 15 minutes.

Many Worlds doesn’t imply that everything one can imagine happening happens an equal number of times. Rather, Many Worlds implies only that many worlds are created each moment, and that more subjectively probable phenomena, insofar as one is well calibrated, tend to occur in more worlds.

Importantly, “A util is a util” and each instance of suffering and well-being is ethically-relevant, even in a Many-Worlds universe.

## Q: What are the hazards of this decision procedure if Many Worlds isn’t true, or if your conception of Many Worlds is utterly wrong?

I don’t think the hazards are great for the well-calibrated person. If you really would tell only 9 out of 10 copies of yourself to make the same decision, it should not be the end of the world to sometimes be that 10th copy that makes a slightly subjectively suboptimal decision.

A poorly-calibrated person could be expected to make subjectively suboptimal decisions more often than they otherwise would. However, I don’t think we need perfect calibration to reap the benefits of this decision procedure under Many Worlds; slight over-diversification is probably better than none.

## Q: To what sorts of decisions does this decision procedure apply?

I think it should apply to every sort of decision. It would be great for us all to have access to recently produced, quantum-generated random numbers and to employ this procedure for relatively arbitrary and significant decisions alike. We don’t want you going with the cheesecake and getting food poisoning in all future child worlds, nor do we want to see all future child worlds damned to the same outcomes from a decision to launch a nuclear retaliation.

## Q: What is a practical source of quantum-generated random numbers?

I’m glad you asked! For this procedure, random numbers must be based on quantum phenomena, like radioactive decay or the photoelectric effect; they must be recently generated; and, highly preferably, no one else can use them for this decision procedure. HotBits won’t work (I reached out to them) because they send out bits from a pool of entropy that includes random numbers generated some time ago. The only website I found that works is hosted by someone at the Australian National University: https://qrng.anu.edu.au/RainBin.php.

If this decision procedure is ever popularized, though, we want a way to make sure different people cannot use the same random numbers for their decision procedures, since shared numbers would prevent the maximum diversification of child worlds. We would want a system like HotBits, which never sends out the same bits twice. We would also want to build and distribute physical quantum random number generators that people could use when they don’t have internet access.

Thanks for reading! I’m curious what you think of this.

This still doesn’t seem to address why one should be risk-averse and prioritize an impoverished survival in as many branches as possible. (Not that I think it does even that, given your examples; by always taking the risky opportunities with a certain probability, wouldn’t this drive your total quantum measure down rapidly? You seem to have some sort of min-max principle in mind.)

Nor does it deal with the response to the standard ensemble criticism of expected utility: EU-maximization is entirely consistent with non-greedy-EU-maximization strategies (e.g. the Kelly criterion) as the total-EU-maximizing strategy if the problem, fully modeled, includes considerations like survival or gambler’s ruin (e.g. in the Kelly coinflip game, the greedy strategy of betting everything each round is one of the worst possible things to do, but EU-maximizing over the entire game does in fact deliver optimal results). However, these considerations do not apply at the quantum level; they only exist at the macro level, and it’s unclear why MWI should make any difference.

Meta: this should probably be a link post. (Also, why not cross-post the whole text?)

You make a good point. I fixed it :)