You know, your example is actually making that horn look more attractive: replace the torture of the person with ’50000 utilities subtracted from the cosmos’, etc., and then it’s obvious that the green room is no grounds for relief, since the −50000 is still a fact. More narrowly, if you valued other persons equal to yourself, then the green room is definitely no cause for relief.
We could figure out how much you value other people by varying how bad the torture is, and maybe adding a deal where, if the green-room person will flip a fair coin (heads, the punishment is swapped; tails, no change), the torture is lessened by n. If you value the copy equal to yourself, you’ll be willing to swap for any difference right down to 1, since if it’s tails there’s no loss or gain, but if it’s heads there’s an n profit.
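To make that accounting concrete, here is a minimal sketch of the expected-disutility comparison. The parameters T (torture severity), n (the reduction), and the weight w the green-room person puts on the copy’s suffering are all things I’m introducing for illustration, not part of the original setup:

```python
# Hypothetical parameters: T = torture severity, n = reduction granted
# for taking the coin-flip deal, w = how much the green-room agent
# weights the copy's suffering relative to its own (w = 1 means the
# copy is valued equal to yourself; w = 0 means purely selfish).

def disutility_no_bet(T, w):
    # The copy is tortured at full severity; I suffer nothing directly.
    return w * T

def disutility_with_bet(T, n, w):
    # Fair coin: heads -> the (lessened) torture is swapped onto me;
    # tails -> the copy is still tortured, but lessened by n.
    heads = 1.0 * (T - n)   # my own suffering, weighted 1
    tails = w * (T - n)     # the copy's suffering, weighted w
    return 0.5 * heads + 0.5 * tails

def takes_bet(T, n, w):
    return disutility_with_bet(T, n, w) < disutility_no_bet(T, w)

# An agent valuing the copy equal to itself takes the deal for any
# reduction, right down to n = 1:
print(takes_bet(50000, 1, 1.0))   # True
# A purely selfish agent never does; it only stands to lose:
print(takes_bet(50000, 1, 0.0))   # False
```

At w = 1 the swap itself is value-neutral, so any n > 0 tips the deal; at w = 0 the agent risks torture for no personal gain at all, which matches the case analysis later in the thread.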
Now, of course, even if the copy is identical to yourself, and even if we postulate that somehow the 2 minds haven’t diverged (we could do this by making the coinflip deal contingent on being the tortured one: 2 identical rooms, neither of which knows whether it is the tortured one; by making it contingent, there’s no risk in not taking the bet), I think essentially no human would take the coinflip for just +1; they would only take it if there were a major amelioration of the torture. Why? Because pain is so much more real and overriding to us, which is a fact about us and not about agents we can imagine.
(If you’re not convinced, replace the punishments with rewards and modify the bet to increase the reward but possibly switch it to the other fellow; and imagine a parallel series of experiments being run with rational agents who don’t have pain/greed. After a lot of experiments, who will have more money?)
More narrowly, if you valued other persons equal to yourself, then the green room is definitely no cause for relief.
Yes, and this hypothesis can even be weakened a bit, since the other persons involved are nearly identical to you. All it takes is a sufficiently “fuzzy” sense of self.
Now, of course, even if the copy is identical to yourself, and even if we postulate that somehow the 2 minds haven’t diverged [...] I think essentially no human would take the coinflip for just +1; they would only take it if there were a major amelioration of the torture.
To clarify what you mean by “haven’t diverged”… does that include the offer of the flip? E.g., both receive the offer, but only one of the responses “counts”? Because I can’t imagine not taking the flip if I knew I was in such a situation… my anticipation would be cleanly split between both outcomes due to indexical uncertainty. It’s a more complicated question once I know which room I’m in.
To clarify what you mean by “haven’t diverged”… does that include the offer of the flip? E.g., both receive the offer, but only one of the responses “counts”? Because I can’t imagine not taking the flip if I knew I was in such a situation… my anticipation would be cleanly split between both outcomes due to indexical uncertainty. It’s a more complicated question once I know which room I’m in.
Well, maybe I wasn’t clear. I’m imagining that there are 2 green rooms, say; however, one room has been secretly picked out for the torture and the other gets the dustspeck.
Each person is now made the offer: if you flip this coin and you are not the torture room, the torture will be reduced by n, and the room tortured may be swapped if the coin comes up heads; however, if you are the torture room, the coin flip does nothing.
Since the minds are the same, in the same circumstances, with the same offer, we don’t need to worry about what happens if the coins fall differently or if one accepts and the other rejects. The logic they should follow is: if I am not the other, then by taking the coin flip I am doing myself a disservice by risking torture, and I gain under no circumstance and so should never take the bet; but if I am the other as well, then I lose under no circumstance so I should always take the bet.
(I wonder if I am just very obtusely reinventing the prisoner’s dilemma or Newcomb’s paradox here, or if by making the 2 copies identical I’ve destroyed an important asymmetry. As you say, if you don’t know whether “you” have been spared torture, then maybe the bet does nothing interesting.)
The logic they should follow is: if I am not the other, then by taking the coin flip I am doing myself a disservice by risking torture, and I gain under no circumstance and so should never take the bet; but if I am the other as well, then I lose under no circumstance so I should always take the bet.
I’m not sure what “not being the other” means here, really. There may be two underlying physical processes, but they’re only giving rise to one stream of experience. From that stream’s perspective, its future is split evenly between two possibilities, so accepting the bet strictly dominates. Isn’t this just straightforward utility maximization?
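That dominance claim is easy to check with a toy calculation. A minimal sketch, assuming the agent weights whichever room it turns out to occupy and splits its anticipation 50/50 between them; T and n are hypothetical magnitudes for the torture and its reduction:

```python
# Expected suffering from the single indexically uncertain viewpoint.
# Assumes both identical copies decide the same way, so if they take
# the bet, the non-torture room's flip lessens the torture by n and,
# on heads, swaps which room is tortured.

def expected_suffering(T, n, take_bet):
    if not take_bet:
        # 50% chance of occupying the secretly chosen torture room.
        return 0.5 * T
    # Taking the bet: someone still gets tortured, but only at T - n,
    # and by the symmetry of the possible swap I still face the same
    # 50% chance of being that someone.
    return 0.5 * (T - n)

# Accepting strictly dominates refusing for any reduction n > 0:
for n in (1, 10, 100):
    assert expected_suffering(1000, n, True) < expected_suffering(1000, n, False)
```

Under these assumptions the flip shaves 0.5·n off the expectation no matter what, which is the “straightforward utility maximization” being described.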
The reason the question becomes more complicated if the minds diverge is that the concept of “self” must be examined to see how the agent weights the experiences of an extremely similar process in its utility function. It’s sort of a question of which is more defining: past or future. A purely forward-looking agent says “ain’t my future” and evaluates the copy’s experiences as those of a stranger. A purely backward-looking agent says “shares virtually my entire past” and evaluates the copy’s experiences as though they were his own. This all assumes some coherent concept of “selfishness”—clearly a purely altruistic agent would take the flip.
I wonder if I am just very obtusely reinventing the prisoner’s dilemma or Newcomb’s paradox here, or if by making the 2 copies identical I’ve destroyed an important asymmetry.
The identical copies scenario is a prisoner’s dilemma where you make one decision for both sides, and then get randomly assigned to a side. It’s just plain crazy to defect in a degenerate prisoner’s dilemma against yourself. I think this does destroy an important asymmetry—in the divergent scenario, the green-room agent knows that only his decision counts.
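As a toy check of that framing (the payoff numbers below are the usual illustrative prisoner’s-dilemma values, not anything from the scenario above):

```python
# Degenerate prisoner's dilemma: one move is chosen for BOTH sides,
# and only then are you randomly assigned a side. Standard toy payoffs.

PAYOFF = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def degenerate_payoff(move):
    # Both sides necessarily play the same move, so which side you are
    # assigned is irrelevant by symmetry.
    return PAYOFF[(move, move)]

# In the ordinary PD, defection dominates against any fixed opponent
# move; here, "defecting against yourself" just lands you on (D, D).
assert degenerate_payoff("C") > degenerate_payoff("D")
```

The temptation payoff (5) is simply unreachable when one decision binds both sides, which is why defecting here is, as said above, just plain crazy.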
Speaking for my own values, I’m still thoroughly confused by the divergent scenario. I’d probably be selfish enough not to take the flip for a stranger, but I’d be genuinely unsure of what to do if it was basically “me” in the red room.