Actually, here’s my argument for why (ignoring the simulation arguments) you should refuse to give Omega money.
Here’s what actually happened:
Omega flipped a fair coin. If it came up heads, the stated conversation happened. If it came up tails and Omega predicted that you would have given him $1000, he stole $1000000 from you.
If you have a policy of paying, your expected winnings are 10^6/4 − 10^3/4 − 10^6/2 = −$250250. If you have a policy of not paying, you get $0.
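To make the arithmetic explicit, here is a quick sanity check in Python (the dollar amounts and the two fair coins are just the ones from the setup above; the variable names are mine):

```python
# Expected value of each policy in the modified scenario above.
# Outer coin (fair): heads -> the stated counterfactual mugging happens;
# tails -> Omega steals $1000000 from predicted payers.
# Inner coin (fair, heads branch only): decides whether the mugging pays
# you $1000000 or asks you for $1000.

PAYOUT = 10**6   # reward in the mugging if you are a predicted payer
PAYMENT = 10**3  # what you hand over when the mugging asks
THEFT = 10**6    # what gets stolen from predicted payers on tails

ev_pay = 0.5 * (0.5 * PAYOUT - 0.5 * PAYMENT) - 0.5 * THEFT
ev_refuse = 0.0  # refusers never pay, never collect, never get robbed

print(ev_pay)     # -250250.0, matching 10^6/4 - 10^3/4 - 10^6/2
print(ev_refuse)  # 0.0
```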
More realistically, having a policy of paying Omega in such a situation could either earn or lose you money when people interact with you based on a prediction of your policy, but there is no reason to expect one outcome over the other.
There’s a similar problem with the Prisoner’s Dilemma solution. If you formalize it as you and another agent playing a Prisoner’s Dilemma in which each can see the other’s code, then modifying your code to cooperate in the mirror matchup (against an exact copy of yourself) helps you in the mirror matchup, but hurts you if you are playing against a program that cooperates with you unless you would cooperate in a mirror matchup. Unless you have a reason to suspect that running into one is more likely than running into the other, you can’t tell which modification would work better.
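A toy version of the payoff comparison, under assumptions I am adding for illustration: each agent is reduced to a single bit (whether it cooperates in the mirror matchup), standard PD payoffs T=5 > R=3 > P=1 > S=0 are used, and you defect against the punisher either way, since its move depends only on your mirror behavior:

```python
# One-bit agents: coop = True means "cooperates in the mirror matchup".
T, R, P, S = 5, 3, 1, 0  # standard Prisoner's Dilemma payoffs

def mirror_payoff(coop):
    # Both copies play the same move, so you get R or P.
    return R if coop else P

def punisher_payoff(coop):
    # The punisher cooperates with you iff you would NOT cooperate in a
    # mirror matchup; you defect against it regardless (its move does not
    # depend on your move here, so defection dominates).
    return P if coop else T

for coop in (True, False):
    print(coop, mirror_payoff(coop), punisher_payoff(coop))
# True  3 1  <- mirror-cooperation gains 2 in the mirror, loses 4 vs punisher
# False 1 5
```

On these numbers, mirror-cooperation is only worth it if mirror matchups are at least twice as frequent as punishers, which is exactly the kind of statistical assumption in question.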
Unless I’m misunderstanding you, this violates one of the premises of the problem: that Omega is known to be honest about how he poses dilemmas.
Fine. If you think that Omega would have told me about the previous coin flip, consider this:
There are two different supernatural entities who can correctly predict my response to the counterfactual mugging: Omega and Z.
Two things could theoretically happen to me:
a) Omega could present me with the counterfactual mugging problem.
b) Z could decide to steal $1000000 from me if and only if I would have given Omega $1000 in the counterfactual mugging.
When I am trying to decide on a policy for dealing with counterfactual muggings, I should note that my policy will affect my outcome in both situation (a) and situation (b). The policy of giving Omega money will win me $499500 (expected) in situation (a), but it will lose me $1000000 in situation (b). Unless I have a reason to suspect that (a) is at least twice as likely as (b), I have no reason to prefer the policy of giving Omega money.
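The “at least twice as likely” threshold can be computed directly from those two figures (a back-of-the-envelope check, nothing more):

```python
# Expected gain of a paying policy: p_a * 499500 - p_b * 1000000,
# where p_a and p_b are the (unknown) probabilities of situations (a), (b).
gain_a = (10**6 - 10**3) / 2  # $499500 expected in situation (a)
loss_b = 10**6                # $1000000 lost in situation (b)

# Paying beats refusing only when p_a / p_b exceeds this ratio:
print(loss_b / gain_a)  # ~2.002, i.e. (a) must be roughly twice as likely
```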
The basis of the dilemma is that you know that Omega, who is honest about the dilemmas he presents, exists. You have no evidence that Z exists. You can posit his existence, but it doesn’t make the dilemma symmetrical.
But if instead Z exists, shows up on your doorstep and says (in his perfectly trustworthy way) “I will take your money if and only if you would have given money to Omega in the counterfactual mugging”, then you have evidence that Z exists but no evidence that Omega does.
The point is that you need to settle on your policy before either entity shows up. Therefore, unless you have evidence now that one is more likely than the other, not paying Omega is the better policy (at least until you think up still more hypothetical entities).
Agreed. Neither is likely to happen, but the chance of something analogous happening may be relevant when forming a general policy. Omega in Newcomb’s problem is basically asking you to guard something for pay, without looking at it or stealing it. The unrealistic part is that he is a perfect predictor and perfectly trustworthy, so that you know the exact situation.
Is there a more everyday analogue to Omega as the Counterfactual Mugger?
People taking bets for you in your absence.
It’s probably a good exercise to develop a real-world analogue to all philosophical puzzles such as this wherever you encounter them; the purpose of such thought experiments is not to create entirely new situations, but to strip away extraneous concerns and heuristics like “but I trust my friends” or “but nobody is that cold-hearted” or “but nobody would give away a million dollars for the hell of it, there must be a trick”.
Good point. On the other hand, I think that Omega being a perfect predictor through some completely unspecified mechanism is one of the most confusing parts of this problem. Also, as I was saying, it is a complicating issue that you do not know anything about the statistical behavior of possible Omegas (though I guess there are ways to fix that in the problem statement).
I think that Omega being a perfect predictor through some completely unspecified mechanism is one of the most confusing parts of this problem.
It may be a truly magical power, but any other method of stipulating better-than-random prediction has a hole in it that lets people ignore the actual decision in favor of finding a method to outsmart said prediction method. Parfit’s Hitchhiker, as usually formalised on LessWrong, involves a more believable good-enough lie-detector—but prediction is much harder than lie-detection, we don’t have solid methods of prediction that aren’t gameable, and so forth, until it’s easier to just postulate Omega to get people to engage with the decision instead of the formulation.
Now, if the method of prediction were totally irrelevant, I think I would agree with you. On the other hand, the method of prediction can be the difference between your choice directly putting the money in the box in Newcomb’s problem and a smoking-lesion problem. If the method of prediction is relevant, then requiring an unrealistic perfect predictor is going to leave you with something pretty unintuitive. I guess that a perfect simulation or a perfect lie detector would be reasonable, though. On the other hand, outsmarting the prediction method may not be an option. Maybe they give you a psychology test, and only afterwards offer you a Newcomb problem. In any case, I feel like the confusing bits of the problem statement are perhaps just being moved around.
Also, as I was saying, it is a complicating issue that you do not know anything about the statistical behavior of possible Omegas (though I guess there are ways to fix that in the problem statement).
There is one Omega, Omega, and Newcomb’s problem gives his profile!
And he makes a similar mistake in his consideration of the Prisoner’s Dilemma. The prisoners are both attempting to maximise their (known) utility functions. You aren’t playing against an actively malicious agent out to steal people’s lunch. You do have reason to expect agents to be more likely to follow their own self-interest than not, even in cases where this isn’t outright declared as part of the scenario.
Here I’m willing to grant a little more. I still claim that whether or not cooperating in the mirror match is a good strategy depends on knowing statistical information about the other players you are likely to face. On the other hand, in this case you may well have more reasonable grounds for the belief that you will see more mirror matches than matches against people who specifically try to punish those who cooperate in mirror matches.
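Continuing the toy payoffs from the earlier sketch, the break-even population mix is easy to compute (again, these specific numbers are mine, not part of the original problem):

```python
# Expected score of each one-bit strategy when a fraction q of your
# opponents are punishers and the rest are mirror matchups.
def ev(coop, q):
    mirror = 3 if coop else 1    # R or P in the mirror matchup
    punisher = 1 if coop else 5  # P or T against the punisher
    return (1 - q) * mirror + q * punisher

for q in (0.0, 1/3, 0.5):
    print(q, ev(True, q), ev(False, q))
# q = 0:   3.0 vs 1.0  -> mirror-cooperation wins
# q = 1/3: tie at 7/3  -> the break-even mix
# q = 1/2: 2.0 vs 3.0  -> mirror-defection wins
```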
Having thought about it a little more, I think I have pinpointed my problem with building a decision theory in which real outcomes are allowed to depend on the outcomes of counterfactuals:
The output of your algorithm in a given situation will need to depend on your prior distribution and not just on your posterior distribution.
In CDT, your choice of actions depends only on the present state of the universe. Hence you can make your decision based solely on your posterior distribution on the present state of the universe.
If you need to deal with counterfactuals, though, the output of your algorithm in a given situation should depend not only on the state of the universe in that situation, but also on the probability that this situation appears in a relevant counterfactual, and on the results thereof. I cannot just consult my posterior and ask about the expected results of my actions. I also need to consult my prior and compute the probability that my payouts will depend on a counterfactual version of this situation.
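Here is a minimal sketch of the contrast, using the counterfactual mugging as the example. This is my own toy formulation, not a full decision theory; the payoffs and the fair coin are the standard ones from the problem statement:

```python
# Prior over Omega's coin; "pays" is the policy of handing over $1000
# when asked on tails.
PRIOR = {"heads": 0.5, "tails": 0.5}

def payoff(world, pays):
    if world == "tails":
        return -1000 if pays else 0  # you are asked to pay
    # heads: the reward hinges on your counterfactual tails behavior
    return 10**6 if pays else 0

# CDT-style: you already know the coin came up tails, so the posterior
# puts all its weight there and the heads branch drops out.
posterior_ev = {pays: payoff("tails", pays) for pays in (True, False)}
print(max(posterior_ev, key=posterior_ev.get))  # False: refuse to pay

# Policy-selection: score whole policies by the PRIOR, which still
# weights the branch where this situation is merely counterfactual.
prior_ev = {pays: sum(p * payoff(w, pays) for w, p in PRIOR.items())
            for pays in (True, False)}
print(max(prior_ev, key=prior_ev.get))          # True: pay
```

The two rules disagree precisely because the second one consults the prior probability of the counterfactual branch, which is the dependence on the prior distribution claimed above.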