Thank you for your post abramdemski!
I failed to understand why you can’t arrive at a solution for the Single-Shot game via Iterated Play without memory of the previous game. In order to clarify my ideas let me define two concepts first:
Iterated Play with memory: We repeatedly play the game knowing the results of the previous games.
Iterated Play without memory: We repeatedly play the game, while having no memory of the previous play.
The distinction is important: with memory we can at any time search all previous games and act accordingly, allowing for strategies such as Tit-for-Tat and other history-dependent strategies. Without memory we can still learn (for example, by applying some sort of Bayesian update to our probability estimates of each move being played), whilst not having access to the previous games before each move. That way we can “learn” how to best play the single-shot version of the game by iterated play.
Does what I said above need any clarification, and is there any failure in its logic?
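One possible reading of "learning without memory" is a learner that keeps only running counts of the opponent's moves (a Dirichlet-style belief) and best-responds to them, in the spirit of fictitious play. The sketch below is illustrative only: the Chicken payoffs, the class name `CountingLearner`, and the priors are assumptions, not anything specified in the thread.

```python
# Hypothetical payoffs for Chicken (row player's payoff); not from the thread.
# Moves: 0 = swerve, 1 = straight.
PAYOFF = [[0, -1],
          [1, -10]]


class CountingLearner:
    """Stores only pseudo-counts of the opponent's moves, never the game history."""

    def __init__(self, prior=(1.0, 1.0)):
        self.counts = list(prior)

    def belief(self):
        # Posterior mean probability of each opponent move.
        total = sum(self.counts)
        return [c / total for c in self.counts]

    def act(self):
        # Myopic best response to the current belief (one-shot reasoning).
        p = self.belief()
        expected = [sum(PAYOFF[a][b] * p[b] for b in range(2)) for a in range(2)]
        return max(range(2), key=lambda a: expected[a])

    def update(self, opponent_move):
        # Bayesian update of the counts after observing one move.
        self.counts[opponent_move] += 1


def play(rounds=200, prior_a=(1.0, 1.0), prior_b=(1.0, 1.0)):
    a, b = CountingLearner(prior_a), CountingLearner(prior_b)
    for _ in range(rounds):
        move_a, move_b = a.act(), b.act()
        a.update(move_b)
        b.update(move_a)
    return a.belief(), b.belief()


# Player b starts out fairly sure that a will swerve.
belief_a, belief_b = play(prior_b=(20.0, 1.0))
```

Under these assumed numbers, `b` goes straight every round and `a` learns to swerve, so where play ends up depends on the priors the learners start with. Note also that the counts are themselves a compressed summary of past play.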
Best Regards, Miguel
If you have no memory, how can you learn? I recognize that you can draw a formal distinction, allowing learning without allowing the strategies being learned to depend on the previous games. But you are still allowing the agent itself to depend on the previous games, which means that “learning” methods which bake in more strategy will perform better. For example, a learning method could learn to always go straight in a game of chicken by checking to see whether going straight causes the other player to learn to swerve. I.e., it doesn’t seem like a principled distinction.
Furthermore, I don’t see the motivation for trying to do well in a single-shot game via iterated play. What kind of situation is it trying to model? This is discussed extensively in the paper I mentioned in the post, “If multi-agent learning is the answer, what is the question?”
The definition may not be principled, but there’s something that feels a little bit right about it in context. There are various ways to “stay in the logical past” which seem similar in spirit to migueltorrescosta’s remark, like calculating your opponent’s exact behavior but refusing to look at certain aspects of it. The proposal, it seems, is to iterate already-iterated games by passing more limited information of some sort between the possibly-infinite sessions. (Both your memory and the opponent’s get limited.) But if we admit that Miguel’s “iterated play without memory” is iterated play, well, memory could be imperfect in varied ways at every step, giving us a huge mess instead of well-defined games and sessions. But that mess looks more like logical time, at least.
Not having read the linked paper yet, the motivation for using iterated or meta-iterated play is basically to obtain a set of counterfactuals which will be relevant during real play. Depending on the game, it makes sense that this might be best accomplished by occasionally resetting the opponent’s memory.
I have been thinking a bit about evolutionarily stable equilibria, now. Two things seem interesting (perhaps only as analogies, not literal applications of the evolutionarily stable equilibria concept):
The motivation for evolutionary equilibria involves dumb selection, rather than rational reasoning. This cuts the tricky knots of recursion. It also makes myopic learning, which only pays attention to how well things perform in one round, seem more reasonable. Perhaps there’s something to be said about rational learning algorithms needing to cut the knots of recursion somehow, such that the evolutionary equilibrium concept holds a lesson for more reflective agents.
The idea of evolutionary stability is interesting because it mixes the game and the metagame together a little bit: the players should do what is good for them, but the resulting solution should also be self-enforcing, which means consideration is given to how the solution shapes the future dynamics of learning. This seems like a necessary feature of a solution.
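As a rough illustration of "dumb selection" reaching a self-enforcing outcome, here is a minimal sketch of two-strategy replicator dynamics for the Hawk-Dove game (structurally the same game as chicken). The function name `hawk_share_after`, the values `value=2.0` and `cost=4.0`, and the Euler step size are assumptions chosen for illustration, not anything from the thread.

```python
def hawk_share_after(x0, value=2.0, cost=4.0, dt=0.1, steps=2000):
    """Euler-integrate the replicator equation for the share x of hawks.

    Standard Hawk-Dove payoffs (assumed here):
      hawk vs hawk: (value - cost) / 2    hawk vs dove: value
      dove vs hawk: 0                     dove vs dove: value / 2
    """
    x = x0
    for _ in range(steps):
        # Average payoff of each type against the current population mix.
        fit_hawk = x * (value - cost) / 2 + (1 - x) * value
        fit_dove = (1 - x) * value / 2
        # A type grows in proportion to its fitness advantage in this round only.
        x += dt * x * (1 - x) * (fit_hawk - fit_dove)
    return x


from_few_hawks = hawk_share_after(0.1)
from_many_hawks = hawk_share_after(0.9)
```

With these assumed numbers, both starting points settle at a hawk share of `value / cost`, the mixed evolutionarily stable state of Hawk-Dove when `cost > value`. Selection here looks only at per-round payoffs, yet the resting point is one that resists small deviations.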