the mistakes are of the form “getting (D,D) when I could have gotten (C,C)” and “getting (C,C) when I could have gotten (D,C)”.
Interesting. How hard is it to construct this anti-prudent bot?
Relatedly: so the only way for PrudentBot to lose an open-source tournament is if there are many imprudent entries against which some other algorithm can get (D,C) while PrudentBot only gets (C,C)?
You can’t get more points than PrudentBot in one-to-one combat, but you can try doing better against the environment (other players).
There is always an environment where agent X gets more points than PrudentBot. Specifically, an environment of ServantOfX bots that recognize the source code of X, cooperate with X, and defect against everyone else.
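For concreteness, here is a minimal sketch of what such a ServantOfX bot might look like, assuming a tournament interface where each bot is a function that receives the opponent's source code as a string and returns "C" or "D"; the interface and the `X_SOURCE` string are illustrative assumptions, not from any particular tournament.

```python
# Minimal sketch of a ServantOfX bot. The interface (source-code string in,
# "C"/"D" out) and X_SOURCE are assumptions made for illustration.

X_SOURCE = "def x(opponent_source): ..."  # the exact source code of agent X

def servant_of_x(opponent_source: str) -> str:
    """Cooperate with the agent whose source matches X exactly; defect against everyone else."""
    return "C" if opponent_source == X_SOURCE else "D"
```

In a population dominated by these servants, X gets a cooperative payoff in every match against them, while PrudentBot (and everyone else) gets at best the mutual-defection payoff against them.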
Technically, there is a difference between “maximizing your score” and “beating PrudentBot”. (Focusing on “beating PrudentBot” makes it a zero-sum game.) If you know in advance that there will be more copies of you than copies of PrudentBot, you could just decide to defect against PrudentBot, which will harm it more than it will harm you. But it will harm you. You will defeat PrudentBot, but get less utility than you could get otherwise. (You will prove suboptimality of PrudentBot by becoming suboptimal yourself.)
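To make the "harms it more than it harms you" arithmetic concrete, here is a toy calculation; the standard payoffs R=3 for mutual cooperation and P=1 for mutual defection, and the population sizes, are assumptions chosen purely for illustration.

```python
# Toy round-robin arithmetic, assuming standard PD payoffs R=3 (C,C) and P=1 (D,D),
# with m copies of "you" and p copies of PrudentBot (m > p). Both the payoffs and
# the population sizes are illustrative assumptions.

R, P = 3, 1
m, p = 10, 5

# Score lost per copy, relative to mutual cooperation, if your copies defect
# against every PrudentBot (each such match drops from R to P for both sides):
your_loss_per_copy = p * (R - P)      # 10: you lose 2 points in each of your p matches vs PrudentBot
prudent_loss_per_copy = m * (R - P)   # 20: each PrudentBot loses 2 points in each of its m matches vs you

print(your_loss_per_copy, prudent_loss_per_copy)
```

So with m > p you end up above PrudentBot in the standings, but both of you score less than you would have by cooperating.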
Actually, you prove that there does not exist an optimal strategy for the game “Score more points than everybody else in a given competition.” The game “Score the maximum number of points possible” is subtly different.
The “maximum number of points” optimizing strategy maximizes the number of points in some average environment. In a sufficiently perverse environment, it will score low.
So, perhaps the question is how to define the “average” environment. Is it an environment containing bots with probability corresponding to their (Kolmogorov) complexity of code?
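One way to make that concrete, purely as a sketch (true Kolmogorov complexity is uncomputable), is to weight bots by the length of their source code in the style of a 2^(-length) universal prior; the bot list, sources, and weighting scheme below are illustrative assumptions.

```python
# Sketch of a "complexity-weighted" environment, using source-code length as a
# computable stand-in for Kolmogorov complexity. Everything here is illustrative.

import random

BOTS = {
    "CooperateBot": "def bot(o): return 'C'",
    "DefectBot": "def bot(o): return 'D'",
    "PrudentBot": "def bot(o): ...  # cooperate iff provably o cooperates back and o defects against DefectBot",
}

weights = {name: 2.0 ** -len(src) for name, src in BOTS.items()}
total = sum(weights.values())
probs = {name: w / total for name, w in weights.items()}

def sample_environment(n: int) -> list[str]:
    """Draw n opponents, with shorter (simpler) bots exponentially more likely."""
    names = list(probs)
    return random.choices(names, weights=[probs[name] for name in names], k=n)
```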
I was going to say that there was a program which would score more than any other program in any given environment.
But when I write that out, it’s trivial to falsify.
There does not exist a strategy which is dominant in all environments.