the mistakes are of the form “getting (D,D) when I could have gotten (C,C)” and “getting (C,C) when I could have gotten (D,C)”.
Interesting. How hard is it to construct this anti-prudent bot?
Relatedly: so the only way for PrudentBot to lose an open-source tournament is if there are many imprudent entries against which some other algorithm can get (D,C) while PrudentBot only gets (C,C)?
You can’t get more points than PrudentBot in one-to-one combat, but you can try doing better against the environment (other players).
There is always an environment where agent X gets more points than PrudentBot. Specifically, an environment of ServantOfX bots that recognize the source code of X, cooperate with X, and defect against everyone else.
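For concreteness, here is a minimal sketch of what such a ServantOfX bot might look like, assuming a tournament interface where each bot is a function that receives the opponent's source code as a string and returns "C" or "D"; the interface and the `X_SOURCE` string are illustrative assumptions, not from any particular tournament.

```python
# Minimal sketch of a ServantOfX bot. The interface (source-code string in,
# "C"/"D" out) and X_SOURCE are assumptions made for illustration.

X_SOURCE = "def x(opponent_source): ..."  # the exact source code of agent X

def servant_of_x(opponent_source: str) -> str:
    """Cooperate with the agent whose source matches X exactly; defect against everyone else."""
    return "C" if opponent_source == X_SOURCE else "D"
```

In a population dominated by these servants, X gets a cooperative payoff in every match against them, while PrudentBot (and everyone else) gets at best the mutual-defection payoff against them.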
Technically, there is a difference between “maximizing your score” and “beating PrudentBot”. (Focusing on “beating PrudentBot” makes it a zero-sum game.) If you know in advance that there will be more copies of you than copies of PrudentBot, you could just decide to defect against PrudentBot, which will harm it more than it will harm you. But it will harm you. You will defeat PrudentBot, but get less utility than you could get otherwise. (You will prove suboptimality of PrudentBot by becoming suboptimal yourself.)
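To make the "harms it more than it harms you" arithmetic concrete, here is a toy calculation; the standard payoffs R=3 for mutual cooperation and P=1 for mutual defection, and the population sizes, are assumptions chosen purely for illustration.

```python
# Toy round-robin arithmetic, assuming standard PD payoffs R=3 (C,C) and P=1 (D,D),
# with m copies of "you" and p copies of PrudentBot (m > p). Both the payoffs and
# the population sizes are illustrative assumptions.

R, P = 3, 1
m, p = 10, 5

# Score lost per copy, relative to mutual cooperation, if your copies defect
# against every PrudentBot (each such match drops from R to P for both sides):
your_loss_per_copy = p * (R - P)      # 10: you lose 2 points in each of your p matches vs PrudentBot
prudent_loss_per_copy = m * (R - P)   # 20: each PrudentBot loses 2 points in each of its m matches vs you

print(your_loss_per_copy, prudent_loss_per_copy)
```

So with m > p you end up above PrudentBot in the standings, but both of you score less than you would have by cooperating.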
Actually, you prove that there does not exist an optimal strategy for the game “Score more points than everybody else in a given competition.” The game “Score the maximum number of points possible” is subtly different.
The “maximum number of points” optimizing strategy maximizes the number of points in some average environment. In a sufficiently perverse environment, it will score low.
So, perhaps the question is how to define the “average” environment. Is it an environment containing bots with probability corresponding to their (Kolmogorov) complexity of code?
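One way to make that concrete, purely as a sketch (true Kolmogorov complexity is uncomputable), is to weight bots by the length of their source code in the style of a 2^(-length) universal prior; the bot list, sources, and weighting scheme below are illustrative assumptions.

```python
# Sketch of a "complexity-weighted" environment, using source-code length as a
# computable stand-in for Kolmogorov complexity. Everything here is illustrative.

import random

BOTS = {
    "CooperateBot": "def bot(o): return 'C'",
    "DefectBot": "def bot(o): return 'D'",
    "PrudentBot": "def bot(o): ...  # cooperate iff provably o cooperates back and o defects against DefectBot",
}

weights = {name: 2.0 ** -len(src) for name, src in BOTS.items()}
total = sum(weights.values())
probs = {name: w / total for name, w in weights.items()}

def sample_environment(n: int) -> list[str]:
    """Draw n opponents, with shorter (simpler) bots exponentially more likely."""
    names = list(probs)
    return random.choices(names, weights=[probs[name] for name in names], k=n)
```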
I was going to say that there was a program which would score more than any other program in any given environment.
But when I write that out, it’s trivial to falsify.
There does not exist a strategy which is dominant in all environments.