The broader question is “does bringing in gnomes in this way leave the initial situation invariant”? And I don’t think it does. The gnomes follow their own anthropic setup (though not their own preferences), and their advice seems to reflect this fact (consider what happens when the heads world has 1, 2 or 50 gnomes, while the tails world has 2).
I also don’t see your indexical objection. The sleeping beauty could perfectly well have an indexical version of total utilitarianism (“I value my personal utility, plus that of the sleeping beauty in the other room, if they exist”). If you want to proceed further, you seem to have to argue that indexical total utilitarianism gives different decisions than standard total utilitarianism.
This is odd, as it seems a total utilitarian would not object to having their utility replaced with the indexical version, and vice versa.
The broader question is “does bringing in gnomes in this way leave the initial situation invariant”? And I don’t think it does. The gnomes follow their own anthropic setup (though not their own preferences), and their advice seems to reflect this fact (consider what happens when the heads world has 1, 2 or 50 gnomes, while the tails world has 2).
As I wrote (after your comment) here, I think it is prima facie very plausible for a selfish agent to follow the gnome’s advice if a) conditional on the agent existing, the gnome’s utility function agrees with the agent’s and b) conditional on the agent not existing, the gnome’s utility function is a constant. (I didn’t have condition b) explicitly in mind, but your example showed that it’s necessary.) Having the number of gnomes depend upon the coin flip invalidates their purpose. The very point of the gnomes is that from their perspective, the problem is not “anthropic”, but a decision problem that can be solved using UDT.
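To make conditions a) and b) concrete, here is a minimal sketch (the existence probability and all names are hypothetical, not from the original setup): if the gnome’s utility equals the agent’s whenever the agent exists, and is a constant otherwise, then the constant cannot change which action the gnome recommends.

```python
# Hypothetical sketch of conditions (a) and (b) on the gnome's utility:
# (a) if the agent exists, the gnome's utility equals the agent's;
# (b) if the agent does not exist, the gnome's utility is a constant c.
# A constant term cannot affect which action maximizes expected utility,
# so the gnome's recommendation tracks the agent's preferences.

P_EXISTS = 0.5  # assumed probability that the agent is created

def gnome_eu(agent_payoff, c):
    """Gnome's expected utility for an action yielding agent_payoff if the agent exists."""
    return P_EXISTS * agent_payoff + (1 - P_EXISTS) * c

# The recommended action is the same for any constant c.
for c in (0.0, 10.0, -3.0):
    best = max([0.2, 0.7, 0.5], key=lambda p: gnome_eu(p, c))
    assert best == 0.7
```

This is only an illustration of why condition b) is needed: if the gnome’s utility in the no-agent case varied with the action, the recommendation could come apart from the agent’s preferences.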
I also don’t see your indexical objection. The sleeping beauty could perfectly have an indexical version of total utilitarianism (“I value my personal utility, plus that of the sleeping beauty in the other room, if they exist”). If you want to proceed further, you seem to have to argue that indexical total utilitarianism gives different decisions than standard total utilitarianism.
That’s what I tried in the parent comment. To be clear, I did not mean “indexical total utilitarianism” to be a meaningful concept, but rather a wrong way of thinking, a trap one can fall into. Very roughly, it corresponds to thinking of total utilitarianism as “I care for myself plus any other people that might exist” instead of “I care for all people that exist”.

What’s the difference, you ask? A minimal non-anthropic example that illustrates the difference would be very much like the incubator, but without people being created. Imagine 1000 total utilitarians with identical decision algorithms waiting in separate rooms. After a coin flip, either one of them (after heads) or two of them (after tails) are offered to buy a ticket that pays $1 after tails. When asked, an agent can correctly perform a non-anthropic Bayesian update to conclude that the probability of tails is 2/3. An indexical total utilitarian reasons: “If the coin has shown tails, another agent will pay the same amount $x that I pay and win the same $1, while if the coin has shown heads, I’m the only one who pays $x. The expected utility of paying $x is thus 1/3 · (−x) + 2/3 · 2 · (1 − x).” This leads to the incorrect conclusion that one should pay up to $4/5. The correct (UDT) way to think about the problem is that after tails, one’s decision algorithm is called twice. There’s only one factor of 2, not two of them. This is all very similar to this post.
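The two calculations above can be checked with a short sketch (the setup is as described: fair coin, one agent asked after heads and two after tails, ticket paying $1 after tails; the function names are mine):

```python
# Break-even ticket prices under the two ways of reasoning described above.
from fractions import Fraction

def indexical_eu(x):
    # Indexical reasoning: update to P(tails | asked) = 2/3, and then
    # *also* double the tails payoff -- counting the two agents twice.
    return Fraction(1, 3) * (-x) + Fraction(2, 3) * 2 * (1 - x)

def udt_eu(x):
    # UDT reasoning: use the prior 1/2; after tails the decision algorithm
    # is simply called twice, giving a single factor of 2.
    return Fraction(1, 2) * (-x) + Fraction(1, 2) * 2 * (1 - x)

# Solving EU(x) = 0 gives the maximal price each reasoner would pay:
assert indexical_eu(Fraction(4, 5)) == 0   # indexical break-even: $4/5
assert udt_eu(Fraction(2, 3)) == 0         # UDT break-even: $2/3
assert udt_eu(Fraction(4, 5)) < 0          # paying $4/5 loses money ex ante
```

The last assertion is the point: the indexical price of $4/5 has negative expected value from the ex-ante (precommitment) perspective.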
To put this again into context: You argued that selfishness is a 50/50 mixture of hating the other person, if another person exists, and total utilitarianism. My reply was that this is only true if one understands total utilitarianism in the incorrect, indexical way. I formalized this as follows: Let the utility function of a hater be vh − h·vo (here, vh is the agent’s own utility, vo the other person’s utility, and h is 1 if the other person exists and 0 otherwise). Selfishness would be a 50/50 mixture of hating and total utilitarianism if the utility function of a total utilitarian were vh + h·vo. However, this is exactly the wrong way of formalizing total utilitarianism. It leads, again, to the conclusion that a total utilitarian should pay up to $4/5.
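The mixture claim is a one-line identity: 0.5·(vh − h·vo) + 0.5·(vh + h·vo) = vh. A tiny sketch (function names are mine) makes it explicit:

```python
# The 50/50-mixture claim, using the "indexical" formalization criticized above.
# vh: agent's own utility; vo: other person's utility;
# h: 1 if the other person exists, 0 otherwise.

def hater(vh, vo, h):
    return vh - h * vo

def indexical_total(vh, vo, h):  # the *wrong* formalization of total utilitarianism
    return vh + h * vo

def selfish(vh, vo, h):
    return vh

# Under this formalization, the mixture reduces to pure selfishness:
for vh, vo, h in [(3, 5, 1), (3, 5, 0), (-2, 7, 1)]:
    mix = 0.5 * hater(vh, vo, h) + 0.5 * indexical_total(vh, vo, h)
    assert mix == selfish(vh, vo, h)
```

So the identity holds exactly, but only with the indexical formalization; that is why the mixture argument inherits its flaw.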
A minimal non-anthropic example that illustrates the difference
The decision you describe is not stable under pre-commitments. Ahead of time, all agents would pre-commit to paying at most $2/3. Yet they seem to change their mind when presented with the decision. You seem to be double-counting, using both the Bayesian update and the fact that each agent’s decision is responsible for the other agent’s decision as well.
In the terminology of the paper http://www.fhi.ox.ac.uk/anthropics-why-probability-isnt-enough.pdf , your agents are altruists using linked decisions with total responsibility and no precommitments, which is a foolish thing to do. If they were altruists using linked decisions with divided responsibility (or if they used precommitments), everything would be fine. (I don’t like or use that old terminology; UDT does it better, but it seems relevant here.)
But that’s detracting from the main point: I still don’t see any difference between indexical and non-indexical total utilitarianism. I don’t see why a non-indexical total utilitarian can’t follow the wrong reasoning you used in your example just as well as an indexical one, if either of them can, and similarly for the right reasoning.
The decision you describe is not stable under pre-commitments. Ahead of time, all agents would pre-commit to paying at most $2/3. Yet they seem to change their mind when presented with the decision. You seem to be double-counting, using both the Bayesian update and the fact that each agent’s decision is responsible for the other agent’s decision as well.
Yes, this is exactly the point I was trying to make; I was pointing out a fallacy. I never intended “indexical total utilitarianism” to be a meaningful concept, it’s only a name for thinking in this fallacious way.