Perplexed comments on Rationality Quotes: November 2010

Perplexed 10 Nov 2010 3:57 UTC
0 points
0

why wouldn’t a CDT agent self-modify to use TDT?

Because it can’t find a write-up that explains how to use it?

Perhaps you can answer the question that I asked here: What play does TDT make in the game of Chicken? Or can you point me to a description of TDT that would allow me to answer that question for myself?
- WrongBot 10 Nov 2010 4:26 UTC
  1 point
  0
  Parent
  Suppose I’m an agent implementing TDT. My decision in Chicken depends on how much I know about my opponent.
  - If I know my opponent implements the same decision procedure I do (because I have access to its source code, say), and my opponent has this knowledge about me, I swerve. In this case, my opponent and I are in symmetrical positions and its choice is fully determined by mine; my choice is between payoffs of (0,0) and (-10,-10).
  - Else, I act identically to a CDT agent.
  As Eliezer says here, the one-sentence version of TDT is “Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.”
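A minimal sketch of the symmetric case described above, assuming both players run the same deterministic procedure, so only the two diagonal outcomes are reachable. The names `SYMMETRIC_PAYOFF` and `symmetric_choice` are hypothetical; the payoffs 0 and -10 are the ones quoted in the comment. The "else act like CDT" branch is not modelled.

```python
# Hypothetical sketch: if my opponent's choice is fully determined by mine,
# I am only choosing between "both swerve" and "both go straight".
SYMMETRIC_PAYOFF = {"swerve": 0, "straight": -10}  # my payoff on each diagonal outcome


def symmetric_choice(diagonal_payoffs):
    """Return the action with the highest payoff when both players act alike."""
    return max(diagonal_payoffs, key=diagonal_payoffs.get)


choice = symmetric_choice(SYMMETRIC_PAYOFF)
print(choice)
```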
  - Sniffnoy 10 Nov 2010 4:51 UTC
    2 points
    0
    Parent
    If I know my opponent implements the same decision procedure I do (because I have access to its source code, say), and my opponent has this knowledge about me, I swerve. In this case, my opponent and I are in symmetrical positions and its choice is fully determined by mine; my choice is between payoffs of (0,0) and (-10,-10).
    
    I’m not sure this is right. Isn’t there a correlated equilibrium that does better?
    - WrongBot 10 Nov 2010 5:21 UTC
      2 points
      0
      Parent
      I think we’re looking at different payoff matrices. I was using the formulation of Chicken with these payoffs:
      
      # | C      | D
      C | +0, +0 | −1, +1
      D | +1, −1 | −10, −10
      
      which doesn’t have a correlated equilibrium that beats (C,C).
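One way to sanity-check this claim, as a sketch rather than a full equilibrium search: in the matrix above, the two players' payoffs never sum to more than 0 in any cell. Any probability distribution over the cells, correlated or not, therefore has expected payoffs summing to at most 0, so it cannot give both players more than the 0 they get from (C,C). The name `PAYOFFS` is hypothetical.

```python
# Payoffs (row player, column player) for the Chicken matrix quoted above.
PAYOFFS = {
    ("C", "C"): (0, 0),
    ("C", "D"): (-1, 1),
    ("D", "C"): (1, -1),
    ("D", "D"): (-10, -10),
}

# Largest combined payoff over all four outcomes.
max_total = max(row + col for row, col in PAYOFFS.values())
print(max_total)
```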
      
      Using the payoff matrix Perplexed posted here, there is indeed a correlated equilibrium, which I believe the TDT agents would arrive at (given a source of randomness). My bad for not specifying the exact game I was talking about.
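The thread does not say which correlated equilibrium is meant. As an assumption, the sketch below checks the simplest candidate for Perplexed's matrix (3,3 / 2,7 / 7,2 / 0,0): a fair public coin that recommends (C,D) or (D,C). The function names are hypothetical. The check is the usual obedience condition: no player gains by deviating from the recommended action.

```python
from fractions import Fraction

ACTIONS = ("C", "D")
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (2, 7),
    ("D", "C"): (7, 2),
    ("D", "D"): (0, 0),
}


def is_correlated_equilibrium(payoffs, dist):
    """True if no player gains by deviating from any recommended action."""
    for player in (0, 1):
        for told in ACTIONS:
            for deviation in ACTIONS:
                gain = Fraction(0)
                for other in ACTIONS:
                    profile = (told, other) if player == 0 else (other, told)
                    deviated = (deviation, other) if player == 0 else (other, deviation)
                    weight = dist.get(profile, 0)
                    gain += weight * (payoffs[deviated][player] - payoffs[profile][player])
                if gain > 0:
                    return False
    return True


def expected_payoffs(payoffs, dist):
    """Expected payoff of each player under a distribution over action profiles."""
    return tuple(
        sum(p * payoffs[profile][i] for profile, p in dist.items()) for i in (0, 1)
    )


# Assumed candidate: a fair coin picks which player swerves.
coin_flip = {("C", "D"): Fraction(1, 2), ("D", "C"): Fraction(1, 2)}
print(is_correlated_equilibrium(PAYOFFS, coin_flip), expected_payoffs(PAYOFFS, coin_flip))
```

Under these assumptions the coin-flip distribution passes the check and gives each player 9/2, compared with 3 from (C,C).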
      - Sniffnoy 10 Nov 2010 7:12 UTC
        0 points
        0
        Parent
        ...and, this is what I get for not actually checking things before I post them.
      - Perplexed 10 Nov 2010 6:12 UTC
        0 points
        0
        Parent
        Two questions:
        
        1. Why do you believe the TDT agents would find the correlated equilibrium? Your previous statement and Eliezer quote suggested that a pair of TDT agents would always play symmetrically in a symmetric game. No “spontaneous symmetry breaking”.
        2. Even without a shared random source, there is a Nash mixed equilibrium that is also better than symmetric cooperation. Do you believe TDT would play that if there were no shared random input?
        WrongBot 10 Nov 2010 8:40 UTC
        0 points
        0
        Parent
        In a symmetric game, TDT agents choose symmetric strategies. Without a source of randomness, this entails playing symmetrically as well.
        
        I’m not sure why you’re talking about shared random input. If both agents get the same input, they can both be expected to treat it in the same way and make the same decision, regardless of the input’s source. Each agent needs an independent source of randomness in order to play the mixed equilibrium; if my strategy is to play C 30% of the time, I need to know whether this iteration is part of that 30%, which I can’t do deterministically because my opponent is simulating me.
        
        Sniffnoy 10 Nov 2010 20:38 UTC
        0 points
        0
        Parent
        Yeah, I think any use of correlated equilibrium here is wrong—that requires a shared random source. I think in this case we just get symmetric strategies, i.e., it reduces to superrationality, where they each just get their own private random source.
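A sketch of the superrational case for Perplexed's matrix (3,3 / 2,7 / 7,2 / 0,0), assuming both players pick the same probability p of playing C and randomize independently. The function names are hypothetical, and the search is a plain grid over hundredths, not a closed-form solution.

```python
from fractions import Fraction

# Row player's payoffs in Perplexed's matrix; the game is symmetric.
CC, CD, DC, DD = 3, 2, 7, 0


def symmetric_payoff(p):
    """Expected payoff to each player when both independently play C with probability p."""
    return CC * p * p + CD * p * (1 - p) + DC * (1 - p) * p + DD * (1 - p) * (1 - p)


# Grid search over p = 0.00, 0.01, ..., 1.00 using exact fractions.
grid = [Fraction(k, 100) for k in range(101)]
best_p = max(grid, key=symmetric_payoff)
best_value = symmetric_payoff(best_p)
print(best_p, best_value)
```

On this grid the best common mix is p = 3/4, worth 27/8 = 3.375 per player, slightly above the 3 from pure (C,C). This common mix is not itself a Nash equilibrium of the game.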
        Perplexed 10 Nov 2010 15:33 UTC
        0 points
        0
        Parent
        
        I’m not sure why you’re talking about shared random input.
        
        Sorry if this was unclear. It was a reference to the correlated pair of random variables used in a correlated equilibrium. I was saying that even without such a correlated pair, you may presume the availability of independent random variables which would allow a Nash equilibrium—still better than symmetric play in this game.
    - Sniffnoy 10 Nov 2010 20:15 UTC
      0 points
      0
      Parent
      Gah, wait. I feel dumb. Why would TDT find correlated equilibria? I think I had the “correlated equilibrium” concept confused. A correlated equilibrium would require a public random source, which two TDTers won’t have.
      - steven0461 10 Nov 2010 20:23 UTC
        3 points
        0
        Parent
        Digits of pi are kind of like a public random source.
        Sniffnoy 10 Nov 2010 20:41 UTC
        0 points
        0
        Parent
        Ignoring the whole pi-is-not-known-to-be-normal thing, how do you determine which digit of pi to use when you can’t actually communicate and you have no idea how many digits of pi the other player may already know?
        steven0461 10 Nov 2010 20:51 UTC
        0 points
        0
        Parent
        Same way you meet up in New York with someone you’ve never talked to: something like Schelling points. I’m not sure that answer works in practice.
  - Perplexed 10 Nov 2010 4:51 UTC
    1 point
    0
    Parent
    Thank you. I hope you realize that you have provided an example of a game in which CDT does better than TDT. For example, in the game with the payoff matrix shown below, there is a mixed strategy Nash equilibrium which is better than the symmetric cooperative result.
    
    # | C    | D
    C | 3, 3 | 2, 7
    D | 7, 2 | 0, 0
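For reference, a sketch of the mixed-strategy Nash computation for this matrix, using the standard indifference condition: each player mixes so that the opponent is indifferent between C and D. The function names are hypothetical.

```python
from fractions import Fraction

# Row player's payoffs in the matrix above; the game is symmetric.
CC, CD, DC, DD = 3, 2, 7, 0


def mixed_nash_probability():
    """Probability q of C that makes the opponent indifferent between C and D.

    Solves CC*q + CD*(1 - q) == DC*q + DD*(1 - q) for q.
    """
    return Fraction(DD - CD, CC - CD - DC + DD)


def payoff_against(q):
    """Expected payoff of playing C against an opponent who plays C with probability q."""
    return CC * q + CD * (1 - q)


q = mixed_nash_probability()
value = payoff_against(q)
print(q, value)
```

Under this calculation each player plays C with probability 1/3 and expects 7/3, so the comparison with the (C,C) payoff of 3 is worth checking against whichever equilibrium is intended here.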
    - WrongBot 10 Nov 2010 5:25 UTC
      0 points
      0
      Parent
      Looks like we’re talking about different versions of Chicken. Please see my reply to Sniffnoy.
  - Perplexed 10 Nov 2010 6:12 UTC
    0 points
    0
    Parent
    So TDT is different from CDT only in cases where the game is perfectly symmetric? If you are playing a game that is roughly the symmetric PD, except that one guy’s payoffs are shifted by a tiny +epsilon, then they should both defect?
    - WrongBot 10 Nov 2010 8:26 UTC
      0 points
      0
      Parent
      TDT is different from CDT whenever one needs to consider the interaction of multiple decisions made using the same TDT-based decision procedure. This applies both to competitions between agents, as in the case of Chicken, and to cases where an agent needs to make credible precommitments, as in Newcomb’s Problem.
      
      In the case of an almost-symmetric PD, the TDT agents should still cooperate. To change that, you’d have to make the PD asymmetrical enough that the agents were no longer evaluating their options in the same way. If a change is small enough that a CDT agent wouldn’t change its strategy, TDT agents would also ignore it.
      
      This doesn’t strike me as the world’s greatest explanation, but I can’t think of a better way to formulate it. Please let me know if there’s something that’s still unclear.
      - Perplexed 10 Nov 2010 15:56 UTC
        0 points
        0
        Parent
        
        If a change is small enough that a CDT agent wouldn’t change its strategy, TDT agents would also ignore it.
        
        This strikes me as a bit bizarre. You test whether a warped PD is still close enough to symmetric by asking whether a CDT agent still defects in order to decide whether a TDT agent should still cooperate? Are you sure you are not just making up these rules as you go?
        
        Please let me know if there’s something that’s still unclear.
        
        Much is unclear and very little seems to be coherently written down. What amazes me is that there is so much confidence given to something no one can explain clearly. So far, the only stable thing in your description of TDT is that it is better than CDT.
