After reading http://www.pnas.org/content/early/2012/05/16/1206569109.full.pdf+html , the detail that’s caught my attention: “The player with the shortest memory sets the terms of the game.” If a strategy remembers 0 turns, and simply Always Cooperates, or Always Defects, or randomly chooses between them, then no matter how clever its opponent might be, it can’t do any better than by acting as if it were also a Memory-0 strategy. Tit-for-Tat is a Memory-1 strategy—and despite all the analysis that I’ve read on it before, I now see it from a new perspective, in that it’s one of the few Memory-1 strategies that gracefully falls back to the appropriate Memory-0 strategy when faced with All-C or All-D… and any strategy which tries to implement a more complicated scheme based on longer strings is faced with the fact that Tit-for-Tat simply doesn’t remember anything beyond a single turn.
I would like to see if this perspective can be extended to a Memory-2 strategy that falls gracefully back to appropriate Memory-1 strategies such as Tit-for-Tat when faced with Memory-1 strategies, and like Tit-for-Tat, to a suitable Memory-0 strategy when faced with Memory-0 ones.
Does anyone have a link to a suitable set of programs to run some experimental tourneys, and instructions on how to apply them? (If it matters, the OSes I have available are WinXP and Fedora 21.)
This is a neat question, but I think programs being successful is not really about gracefully going down a hierarchy. For example, Tit-for-Tat does not take the correct strategy against always-cooperate (If your opponent is always cooperating, you say thank you and always defect). Tit-for-Tat succeeds for much more ecological reasons. I’d say bigger-memory versions of Tit-for-Tat are going to be something like the class of “peaceful, non-exploitable” strategies. Such strategies are not going to be the first to defect, which means they actually don’t get that much information about their opponent. I think the lesson of iterated prisoners dilemmas is that you don’t need that information anyhow, as long as your strategy occupies a good ecological niche.
There’s still some subtlety here. A Memory-0 strategy picks C with probability p and D with probability q, independent of any past results. If you know p and q, you can devise a strategy to optimize your score. The result in the paper is that this new strategy is Memory-0 and that you can’t do better by increasing your memory.
The advantage of a longer memory is that, given enough iterations, you can get a good approximation for p and q and so deduce the appropriate Memory-0 strategy. Something like Tit-for-Tat is devised to basically get the same score as its opponent (the opponent can get an advantage of one defection). It’s not going to do worse than any individual opponent, but neither is it going to do better. A strategy that remembers the entire game can recognize, say, All-C and exploit it by defecting, which Tit-for-Tat can’t do.
A Memory-1 strategy is one where p and q are functions of the previous round. In general, they’ll depend both on what it did last round and what the opponent did last round. There are four possible results (C-C, C-D, D-C, D-D), which means that the strategy will have up to four distinct probabilities for cooperation next round. If you can learn those, you can come up with the optimal strategy for playing against it. This strategy can be modeled as a Memory-1 strategy.
The big difference, I think, is that having a longer memory is helpful if you’re in a diverse environment. In any individual game, there’s always a strategy with a shorter memory that will do as well as yours. However, the same short-memory strategy will not be optimal against every opponent, while you can use your longer memory to devise the best short-memory strategy for a given match.
[Tit for Tat is] one of the few Memory-1 strategies that gracefully falls back to the appropriate Memory-0 strategy when faced with All-C or All-D.
I am not clear on how this is the case. It seems to me that the appropriate strategy when faced with any Memory-0 strategy is to go All-D, since your defections would optimize your own score while having no influence on the future behavior of your opponent. Tit for Tat does not default to All-D unless the opponent is All-D.
All-D is the /optimal/ strategy for Memory-0 - but if your goal for Memory-0 interactions is merely to avoid getting the Sucker’s payoff, and /also/ to be able to deal with Memory-1 strategies, then defaulting to All-C versus All-C isn’t that bad a compromise.
Still More to the Prisoner’s Dilemma
After reading http://www.pnas.org/content/early/2012/05/16/1206569109.full.pdf+html , the detail that’s caught my attention: “The player with the shortest memory sets the terms of the game.” If a strategy remembers 0 turns, and simply Always Cooperates, or Always Defects, or randomly chooses between them, then no matter how clever its opponent might be, it can’t do any better than by acting as if it were also a Memory-0 strategy. Tit-for-Tat is a Memory-1 strategy—and despite all the analysis that I’ve read on it before, I now see it from a new perspective, in that it’s one of the few Memory-1 strategies that gracefully falls back to the appropriate Memory-0 strategy when faced with All-C or All-D… and any strategy which tries to implement a more complicated scheme based on longer strings is faced with the fact that Tit-for-Tat simply doesn’t remember anything beyond a single turn.
I would like to see if this perspective can be extended to a Memory-2 strategy that falls gracefully back to appropriate Memory-1 strategies such as Tit-for-Tat when faced with Memory-1 strategies, and like Tit-for-Tat, to a suitable Memory-0 strategy when faced with Memory-0 ones.
Does anyone have a link to a suitable set of programs to run some experimental tourneys, and instructions on how to apply them? (If it matters, the OSes I have available are WinXP and Fedora 21.)
This is a neat question, but I think programs being successful is not really about gracefully going down a hierarchy. For example, Tit-for-Tat does not take the correct strategy against always-cooperate (If your opponent is always cooperating, you say thank you and always defect). Tit-for-Tat succeeds for much more ecological reasons. I’d say bigger-memory versions of Tit-for-Tat are going to be something like the class of “peaceful, non-exploitable” strategies. Such strategies are not going to be the first to defect, which means they actually don’t get that much information about their opponent. I think the lesson of iterated prisoners dilemmas is that you don’t need that information anyhow, as long as your strategy occupies a good ecological niche.
There’s still some subtlety here. A Memory-0 strategy picks C with probability p and D with probability q, independent of any past results. If you know p and q, you can devise a strategy to optimize your score. The result in the paper is that this new strategy is Memory-0 and that you can’t do better by increasing your memory.
The advantage of a longer memory is that, given enough iterations, you can get a good approximation for p and q and so deduce the appropriate Memory-0 strategy. Something like Tit-for-Tat is devised to basically get the same score as its opponent (the opponent can get an advantage of one defection). It’s not going to do worse than any individual opponent, but neither is it going to do better. A strategy that remembers the entire game can recognize, say, All-C and exploit it by defecting, which Tit-for-Tat can’t do.
A Memory-1 strategy is one where p and q are functions of the previous round. In general, they’ll depend both on what it did last round and what the opponent did last round. There are four possible results (C-C, C-D, D-C, D-D), which means that the strategy will have up to four distinct probabilities for cooperation next round. If you can learn those, you can come up with the optimal strategy for playing against it. This strategy can be modeled as a Memory-1 strategy.
The big difference, I think, is that having a longer memory is helpful if you’re in a diverse environment. In any individual game, there’s always a strategy with a shorter memory that will do as well as yours. However, the same short-memory strategy will not be optimal against every opponent, while you can use your longer memory to devise the best short-memory strategy for a given match.
I am not clear on how this is the case. It seems to me that the appropriate strategy when faced with any Memory-0 strategy is to go All-D, since your defections would optimize your own score while having no influence on the future behavior of your opponent. Tit for Tat does not default to All-D unless the opponent is All-D.
All-D is the /optimal/ strategy for Memory-0 - but if your goal for Memory-0 interactions is merely to avoid getting the Sucker’s payoff, and /also/ to be able to deal with Memory-1 strategies, then defaulting to All-C versus All-C isn’t that bad a compromise.