PrudentBot’s counterparty is another program intended to be legible, not a human. The point is that in practice it’s not necessary to model any humans; humans can delegate legibility to the programs they submit as their representatives. It’s a popular meme that humans are incapable of performing Löbian cooperation because they can’t model each other’s messy minds, and that only AIs could make their own thinking legible to each other, granting them unique powers of coordination. This is not the case.
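To make this concrete, here is a minimal toy sketch in Python (my own construction, not code from the paper) of what delegating to legible representative programs can look like. The bot names echo the paper’s agents, but the paper’s Löbian proof search over source code is replaced by depth-limited mutual simulation, so treat this as a loose analogue of the modal agents rather than a reimplementation.

```python
# Toy bots for an open-source prisoner's dilemma. A bot receives its
# counterparty as a callable it can simulate, plus a simulation budget.
# (This stands in for the paper's setup of bots reading each other's
# source code and searching for proofs about it.)

C, D = "C", "D"

def defect_bot(opponent, depth):
    # Ignores the counterparty entirely.
    return D

def cooperate_bot(opponent, depth):
    return C

def fair_bot(opponent, depth):
    # Cooperate iff a depth-limited simulation of the counterparty,
    # run against fair_bot itself, cooperates. Assuming cooperation at
    # the depth limit is what lets two fair_bots escape the regress
    # and land on (C, C), loosely mirroring the Löbian argument.
    if depth == 0:
        return C
    return C if opponent(fair_bot, depth - 1) == C else D

def prudent_bot(opponent, depth):
    # Cooperate iff the counterparty (in simulation) cooperates with
    # prudent_bot AND defects against defect_bot, so unconditional
    # cooperators get exploited rather than rewarded. The defect_bot
    # check keeps the full budget (defect_bot never recurses), loosely
    # analogous to the paper checking that condition in PA+1 rather
    # than in PA.
    if depth == 0:
        return C
    cooperates_with_me = opponent(prudent_bot, depth - 1) == C
    punishes_defect_bot = opponent(defect_bot, depth) == D
    return C if (cooperates_with_me and punishes_defect_bot) else D
```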
> if it were happening in real life and not a simulated game
Programs and protocols become real life when they are given authority to enact their computations. To the extent that Pareto-inefficient outcomes actually happen in real life, it’s worth replacing negotiations with mechanisms like this, falling back to BATNA when the arena says (D,D).
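Continuing the toy sketch above, here is a hedged illustration of the “authority to enact” part: an arena that runs the submitted bots against each other and enforces whatever they output, with an explicit BATNA fallback when the verdict is (D,D). The payoff numbers and the `batna` hook are illustrative choices of mine, not anything the paper specifies.

```python
# Toy arena: runs two submitted bots against each other and enacts the
# outcome. Constants C, D and the example bots are as in the sketch above.

C, D = "C", "D"

PAYOFFS = {              # (move_a, move_b) -> (payoff_a, payoff_b)
    (C, C): (2, 2),
    (C, D): (0, 3),
    (D, C): (3, 0),
    (D, D): (1, 1),
}

def run_arena(bot_a, bot_b, depth=10, batna=None):
    """Run each bot against the other's program and enact the result.

    If the verdict is (D, D) and a BATNA (best alternative to a
    negotiated agreement) is supplied, the parties fall back to that
    instead of the mutual-defection payoffs. The arena, not a human
    negotiation, is what carries the authority here.
    """
    move_a = bot_a(bot_b, depth)
    move_b = bot_b(bot_a, depth)
    if (move_a, move_b) == (D, D) and batna is not None:
        return batna
    return PAYOFFS[(move_a, move_b)]
```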
> The point is that in practice it’s not necessary to model any humans,
Right, but my point is that it’s still necessary for something to model something. The bot arena setup in the paper has been carefully arranged so that the modelling is in the bots, the legibility is in the setup, and the decision-theory comprehension is in the authors’ brains.
I claim that all three of these components are necessary for robust cooperation, along with some clever system-design work to make each component separable and realizable (e.g. it would be much harder to have the modelling happen in the researchers’ brains and the decision-theory comprehension happen in the bots).
Two humans, locked in a room together, facing a true PD, without access to computers or an arena or an adjudicator, cannot necessarily cooperate robustly with each other, for decision-theoretic reasons, even if they both understand decision theory.
Since you don’t model your human counterparty’s mind anyway, it doesn’t matter whether they comprehend decision theory. The whole point of delegating to bots is that, after that, only bots’ understanding of other bots remains necessary. If your human counterparty doesn’t understand decision theory, they might submit a foolish bot, while your understanding of decision theory earns you a pile of utility.
So while the motivation for designing and setting up an arena in a particular way might come from decision theory, using the arena doesn’t require this understanding from its human users, and yet the arena can shape incentives in a way that defeats the bad equilibria of classical game theory.
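Running the toy bots through the toy arena sketched above illustrates this incentive claim: a user who understands the setup and submits prudent_bot does at least as well as their counterparty, a naive unconditional cooperator gets exploited, and an unconditional defector only pushes the outcome down to the BATNA. (Again, this is my illustrative sketch, not an experiment from the paper.)

```python
# Outcomes under the toy payoffs above (comments show the returned values).
print(run_arena(prudent_bot, prudent_bot))               # (2, 2): mutual cooperation
print(run_arena(prudent_bot, cooperate_bot))             # (3, 0): the naive bot is exploited
print(run_arena(prudent_bot, defect_bot, batna=(1, 1)))  # (1, 1): falls back to the BATNA
print(run_arena(fair_bot, prudent_bot))                  # (2, 2): different bots, still (C, C)
```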