I’m using EDT to mean the agent that calculates expected utility conditioned on each statement of the form “I take action A” and then chooses the action for which the expected utility is highest. I’m not sure what you mean by saying the utility is not a function of O_i. Isn’t “how much money me and my copies earn” a function of the outcome?
(In your formulation I don’t know what P(O_i|A) means, given that A is an action and not an event, but if I interpret it as “the probability of O_i given that I take action A” then it looks like it’s basically what I’m doing?)
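Concretely, the procedure I have in mind is something like this sketch (the actions, outcomes, probabilities, and payoffs are all made up purely for illustration):

```python
# P(O_i | A): probability of each outcome conditioned on "I take action A"
P = {
    "A1": {"O1": 0.9, "O2": 0.1},
    "A2": {"O1": 0.2, "O2": 0.8},
}
# U(O_i): utility as a function of the outcome alone
U = {"O1": 100, "O2": 0}

def edt_choice(P, U):
    # EU(A) = sum_i P(O_i | A) * U(O_i); pick the action with the highest value
    eu = {a: sum(prob * U[o] for o, prob in dist.items()) for a, dist in P.items()}
    return max(eu, key=eu.get)

print(edt_choice(P, U))  # "A1" with these made-up numbers (EU = 90 vs. 20)
```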
The “me and my copies” that this agent bases its utility on are split across possible worlds with different outcomes. EDT requires a function that maps an action and an outcome to a utility value, and no such function exists for this agent.
Edit: as an example, what is the utility of this agent winning $1000 in a game where they don’t know the chance of winning? They don’t even know what their own utility is, because their utility doesn’t depend only on the outcome they experience; it also depends on how their copies fared in the other possible worlds. If you credibly tell them afterward that they were nearly certain to win, they value the same $1000 much more highly than if you tell them there was only a one-in-a-million chance that they would win.
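To put toy numbers on it (my own numbers, and one particular way of cashing out “money earned by my copies”, namely weighting each possible world’s copies by that world’s probability):

```python
# Toy illustration only: the agent's utility is the probability-weighted total
# money won by its copies across possible worlds.
N = 1_000_000   # hypothetical number of copies all playing the same game
prize = 1000

def value_after_winning(p_win):
    # After winning and learning p_win, the agent reckons that (weighted)
    # roughly p_win * N of its copies also won the prize.
    return p_win * N * prize

print(value_after_winning(0.999))  # ~999,000,000: the $1000 in hand is worth a lot
print(value_after_winning(1e-6))   # ~1,000: the very same $1000 is worth almost nothing
```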
For this sort of agent that values nonexistent and causally-disconnected people, we need some different class of decision theory altogether, and I’m not sure it can even be made rationally consistent.
From the OP:
We live in a very big universe where many copies of me all face the exact same decision. This seems plausible for a variety of reasons; the best one is accepting an interpretation of quantum mechanics without collapse (a popular view).
The copies in almost all of the decision problems mentioned are spread out across a big world, not across “possible worlds”. EG:
If both agents exist and they are just in separate worlds, then there is no conflict between their values at all, and they always push the button.
“Worlds” here means literal planets, rather than the “possible worlds” of philosophy. Hence, it can all be accommodated in one big outcome.
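To spell that out: the single outcome can be taken to describe every copy on every planet at once, and the utility is then an ordinary function of that one outcome. A toy sketch, with all the details invented:

```python
# Toy sketch (details invented): one "big-world" outcome records what happens
# to every copy, on every planet, in the one actual world.
from dataclasses import dataclass

@dataclass
class BigWorldOutcome:
    money_per_copy: dict  # e.g. {"copy_on_planet_1": 10, "copy_on_planet_2": -1, ...}

def utility(outcome: BigWorldOutcome) -> float:
    # "How much money me and my copies earn" is an ordinary function of this
    # single outcome, because all the copies live inside it.
    return sum(outcome.money_per_copy.values())

o = BigWorldOutcome({"copy_on_planet_1": 10, "copy_on_planet_2": 10, "copy_on_planet_3": -1})
print(utility(o))  # 19
```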
The one exception to this I’m noticing is the final case mentioned:
Suppose that only one agent exists. Then it feels weird, seeing button “B,” to press the button knowing that it causes you to lose $1 in the real, actually-existing world. But in this case I think the problem comes from the sketchy way we’re using the word “exist”—if copy B gets money based on copy A’s decision, then in what sense exactly does copy A “not exist”? What are we to make of the version of copy A who is doing the same reasoning, and is apparently wrong about whether or not they exist? I think these cases are confusing from a misuse of “existence” as a concept rather than updatelessness per se.
However, the text is obviously noting that there is something off about this case.
I admit that it is common, in discussion of UDT, to let the outcome be a function of the full policy, including actions taken in alternate possible worlds (even sometimes including impossible possible worlds, IE, contradictory worlds). However, it can always be interpreted as some kind of simulation taking place within the actual world (usually, in Omega’s imagination).
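As a schematic illustration of that reading (my own toy setup, not anyone’s canonical formalization): the real-world payoff can depend on the whole policy, including what it would do on an observation that never actually arrives, because Omega evaluates the policy in simulation inside the actual world:

```python
# Schematic toy setup: the payoff in the actual world depends on the *whole
# policy*, because Omega runs the policy on a counterfactual input inside the
# actual world (i.e. in simulation, "in Omega's imagination").

def omega_payoff(policy, actual_observation):
    # Omega simulates the policy on an observation the agent never actually sees...
    simulated_act = policy("counterfactual_observation")
    # ...and the real-world payoff depends on that simulated behaviour,
    # plus whatever the policy actually does here.
    actual_act = policy(actual_observation)
    base = 100 if simulated_act == "cooperate" else 0
    cost = 1 if actual_act == "cooperate" else 0
    return base - cost

print(omega_payoff(lambda obs: "cooperate", "actual_observation"))  # 99
print(omega_payoff(lambda obs: "defect", "actual_observation"))     # 0
```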