Injecting some keywords: this field of study is known as program equilibrium. Previous LW post on the subject, with links.
Edit: Can you explain how you decided on the parts of the payoff matrix involving “other”? These seem quite important as they affect the viability of strategies based on either convincing your opponent not to halt or not halting yourself.
The payoffs for “other” were designed so that neither failing to halt, nor convincing the other player not to halt, should ever be a worthwhile strategy. If you don’t halt, it gives you the same payoff as if you had cooperated, and gives the other player the same payoff as if you had defected. That way, not halting should be strictly dominated by defecting, since you are better off if you defect, and the other player should react the same way to each threat. And tricking the other player into not halting is also a bad idea, since the payoff you get from it is the same as if they defected.
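As a rough sketch of the rule described above (the concrete payoff numbers T=5, R=3, P=1, S=0 are my assumption; the thread doesn't give values):

```python
# Standard PD payoffs -- assumed values, not taken from the tournament spec.
T, R, P, S = 5, 3, 1, 0

PD = {  # (my move, their move) -> my payoff
    ("C", "C"): R, ("C", "D"): S,
    ("D", "C"): T, ("D", "D"): P,
}

def payoff(me, them):
    """My payoff when I play `me` and the opponent plays `them`.
    'O' marks a player that failed to halt."""
    me_eff = "C" if me == "O" else me        # a non-halter is scored as if it cooperated...
    them_eff = "D" if them == "O" else them  # ...and is treated as a defector by the other player
    return PD[(me_eff, them_eff)]
```

With these (assumed) numbers the two claims check out: defecting strictly dominates not halting (`payoff("D", x) > payoff("O", x)` for every opponent move `x`), and an opponent who fails to halt pays you exactly what a defector would (`payoff(m, "O") == payoff(m, "D")`).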
Your game world implements an “enthusiastic consent” policy
(Defect, non-halt) is actually better than (Defect, Defect) for you, since it gives you a relative advantage over competitors in the tournament.
True, but I still don’t expect it will be a big problem. If there are a lot of submissions, the effect will be small, and if it is paying enough attention to your source code for it to be possible to trick it into not halting, then it is probably looking for a way to achieve mutual cooperation, so tricking it is still not a good strategy.
If you trust all sufficiently smart players to try to induce (Defect, non-halt) if (Defect, Defect) is otherwise inevitable, the effect adds up over a hopefully significant portion of the submissions.
Handy introductory article: Computation and the Prisoner’s Dilemma.
TIL that ethnic hatred and tribalism is a Nash (folk) equilibrium.
Make sure the equality comparison only depends on things that affect functionality—i.e. it will declare any functionally equivalent programs equal even if they use different variable names or something.
(Yes, I know that’s reducible to the halting problem; in practice, you’ll use a computable, polynomial-time approximation for it that will inevitably have to throw out equivalent programs that are too complex or otherwise too ‘clever’.)
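One crude under-approximation of that kind of comparison — catching only renamed variables, nowhere near full functional equivalence — could canonicalize identifiers before comparing syntax trees. A hypothetical sketch using Python's `ast` module:

```python
import ast

class Canonicalize(ast.NodeTransformer):
    """Rename every variable and argument to a canonical name in order of
    first appearance, so alpha-equivalent programs compare equal."""
    def __init__(self):
        self.names = {}

    def _canon(self, name):
        if name not in self.names:
            self.names[name] = f"v{len(self.names)}"
        return self.names[name]

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id=self._canon(node.id), ctx=node.ctx), node)

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        return node

def alpha_equal(src_a, src_b):
    """True iff the two programs are identical up to variable renaming.
    Only a tiny slice of functional equality -- any structural difference,
    however semantically irrelevant, makes this return False."""
    dump = lambda src: ast.dump(Canonicalize().visit(ast.parse(src)))
    return dump(src_a) == dump(src_b)
```

Anything cleverer — reordered statements, different but equivalent control flow — falls through to "not equal," which is exactly the kind of conservative failure the parenthetical predicts.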
Patrick discusses this issue here in some depth.
It’s quite likely that the optimal behaviour should be different in case the program is functionally equal but not exactly equal.
If you’re playing yourself, then you want to cooperate.
If you’re playing someone else, then you’d want to cooperate if and only if that someone else is smart enough to check whether you’ll cooperate; but if its decision doesn’t depend on yours, then you should defect.
You only need to evaluate the equivalence of the first two lines of the program, by the way. It cooperates with those who can’t not cooperate with it if it goes through that branch of the logic, and does something else to everyone else.
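The shape of the program being described is the classic "clique" strategy. A minimal sketch (function and variable names are mine, and the real submission would be in Scheme):

```python
def clique_bot(my_source: str, opponent_source: str) -> str:
    # Branch 1: the mutual-cooperation clique. Any opponent whose source
    # passes this same test will take this same branch against us, so it
    # only matches programs that "can't not cooperate" with us here.
    if opponent_source == my_source:  # or a normalized/functional comparison
        return "C"
    # Branch 2: everyone else -- defect, or substitute any other fallback.
    return "D"
```

Hence the point above: only the test-and-cooperate branch needs to be checked for equivalence; what a candidate program does in its "everyone else" branch never fires against a fellow clique member.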
Can you write that in Scheme so I can submit this? Thanks
So as not to duplicate efforts: I have emailed Moshe Tennenholtz and Michael Wooldridge with invitations to play.