The problems look like a kind of anti-Prisoner’s Dilemma: an agent plays against an opponent and gets a reward iff the two played differently. Any agent playing against itself is then screwed, since both copies make the same move.
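To make the self-play point concrete, here is a minimal sketch of such a game. It assumes agents are deterministic functions that take no input and return a move; the move labels `"A"`/`"B"` and the names `payoff`, `play`, `always_a`, and `always_b` are made up for illustration and do not come from the original problems.

```python
from typing import Callable, Tuple

Move = str                    # hypothetical move labels, e.g. "A" or "B"
Agent = Callable[[], Move]    # assumption: a deterministic agent with no inputs


def payoff(my_move: Move, their_move: Move) -> int:
    """Reward 1 iff the two players moved differently, else 0."""
    return 1 if my_move != their_move else 0


def play(agent: Agent, opponent: Agent) -> Tuple[int, int]:
    """Play one round and return (agent's reward, opponent's reward)."""
    a, b = agent(), opponent()
    return payoff(a, b), payoff(b, a)


def always_a() -> Move:
    return "A"


def always_b() -> Move:
    return "B"


mixed = play(always_a, always_b)    # two different agents
self_a = play(always_a, always_a)   # an agent against a copy of itself
self_b = play(always_b, always_b)

print(mixed, self_a, self_b)
```

Under this deterministic assumption, whichever move the agent picks, its copy picks the same one, so self-play never pays out.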