The problem is twofold: somehow assign everyone a strategy so that the overall outcome is “good and fair”, then somehow force everyone to play the assigned strategies.
That’s not how I see the PD at all; each agent is only interested in maximizing individually, but they have common knowledge of each other’s source code and hence realize that their actions will be in some sense correlated; informally, Hofstadterian superrationality is a fine way of looking at it. The problem is how this extends to asymmetrical problems in which the Nash equilibrium is not a Pareto optimum.
There is no global “good and fair” pure external Archimedean viewpoint. Just a flock of agents only trying to maximize their own utilities, who are all using the same base decision algorithm, and who all know it, and who understand the correlation this implies.
Think bootstraps, not skyhooks.
My understanding is that cousin_it is suggesting just such a base decision algorithm, which works like this:
compute a global outcome that is “good and fair” for everyone who is using this decision algorithm
choose the option that implements the above outcome
(anyone who doesn’t follow this algorithm will be left out of the “good and fair” computation and presumably will fail to maximize its utility as a result)
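The two steps above can be sketched in code. This is only an illustrative sketch: the thread never fixes the “good and fair” formula, so the Nash bargaining product over security levels stands in for it here, and all names and payoffs are hypothetical.

```python
# Hypothetical sketch of the base decision algorithm described above.
# "Good and fair" is stood in for by the outcome maximizing the product
# of gains over each player's security level -- one candidate formula,
# not necessarily cousin_it's.

PD = {  # (row action, col action) -> (row payoff, col payoff)
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def security_level(game, player):
    """Payoff a player can guarantee unilaterally (maximin over pure actions)."""
    own_actions = {k[player] for k in game}
    return max(
        min(u[player] for k, u in game.items() if k[player] == a)
        for a in own_actions
    )

def fair_outcome(game):
    """Among outcomes where nobody falls below their security level,
    pick the one maximizing the product of gains over those levels."""
    s = (security_level(game, 0), security_level(game, 1))
    candidates = [
        (k, (u[0] - s[0]) * (u[1] - s[1]))
        for k, u in game.items()
        if u[0] >= s[0] and u[1] >= s[1]
    ]
    return max(candidates, key=lambda kv: kv[1])[0]

# Every agent running this algorithm computes the same joint outcome and
# plays its own component of it:
print(fair_outcome(PD))  # -> ('C', 'C'): mutual cooperation
```

Anyone not running the algorithm is simply absent from the computation; whether being left out actually costs them is exactly the question the ETA below raises.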
We can all probably see a whole bunch of difficulties here, both technical and philosophical. But Eliezer, it’s not clear from reading your comment what your objection is exactly.
ETA: I just noticed that cousin_it’s proposed “good and fair” formula doesn’t actually ensure my parenthetical point above (that anyone who doesn’t follow the decision algorithm will fail to maximize its utility). To see this, suppose that in the PD one of the players can choose a third option, which is not Pareto-optimal but unilaterally gives it a higher payoff than the one assigned by cousin_it’s formula.
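Concrete (hypothetical) numbers make the failure easy to check: give player 1 a third action X that guarantees a payoff of 4 no matter what player 2 does, while the cooperative outcome the formula assigns pays only 3.

```python
# Hypothetical numbers for the counterexample above: player 1 gains a
# third action "X" guaranteeing a payoff of 4, while the "good and fair"
# formula assigns player 1 only the cooperative payoff of 3.

EXT = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
    ("X", "C"): (4, 0), ("X", "D"): (4, 0),  # Pareto-dominated by ("D", "C")
}

fair_payoff_1 = 3  # what the formula assigns player 1 at ("C", "C")
guaranteed_by_X = min(u[0] for k, u in EXT.items() if k[0] == "X")

# Being "left out" of the fair computation does not hurt player 1 here:
print(guaranteed_by_X > fair_payoff_1)  # -> True: defecting to X pays
```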
cousin_it, if you’re reading this, please see http://lesswrong.com/lw/102/indexical_uncertainty_and_the_axiom_of/sk8, where Vladimir Nesov proposed a notion that turns out to coincide with the concept of the core in cooperative game theory. This is necessary to ensure that the “good and fair” solution will be self-enforcing.
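For reference, the core condition itself is simple to check: an allocation is in the core iff it splits v(N) exactly and no coalition could do better on its own than the sum it is allocated. Here is a minimal sketch with a hypothetical 3-player characteristic function (the numbers are mine, not from the linked thread).

```python
# Minimal core check for a transferable-utility cooperative game.
# Hypothetical game: singletons earn 0, any pair earns 2/3, the grand
# coalition earns 1.

from itertools import combinations

players = (1, 2, 3)
v = {frozenset(s): 0.0 for n in range(1, 4)
     for s in combinations(players, n)}
v[frozenset({1, 2})] = v[frozenset({1, 3})] = v[frozenset({2, 3})] = 2 / 3
v[frozenset(players)] = 1.0

def in_core(x, tol=1e-9):
    """x: dict player -> payoff. True iff x is in the core of (players, v)."""
    if abs(sum(x.values()) - v[frozenset(players)]) > tol:
        return False  # must split v(N) exactly
    # no coalition S may be able to secure more than it is allocated
    return all(sum(x[i] for i in s) >= v[s] - tol for s in v)

print(in_core({1: 1/3, 2: 1/3, 3: 1/3}))  # -> True
print(in_core({1: 0.5, 2: 0.5, 3: 0.0}))  # -> False: {2, 3} would defect
```

In this particular game the pair constraints pin the core down to the equal split, which is the self-enforcing flavor the comment above is after.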
If one of the PD players has a third option of “get two bucks guaranteed and screw everyone else”—if the game structure doesn’t allow other players to punish him—then no algorithm at all can punish him. Or did you mean something else?
Yep, I know what the core is, and it does seem relevant. But seeing as my solution is definitely wrong for stability reasons, I’m currently trying to think of any stable solution (continuous under small changes in game payoffs), and failing so far. Will think about the core later.
The “good and fair” solution needs to offer him a Pareto improvement over the outcome that he can reach by himself.
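With the hypothetical numbers from the ETA example (a third action guaranteeing player 1 a payoff of 4 while leaving player 2 with 0), this constraint rules out the (3, 3) cooperative outcome; no pure outcome Pareto-improves on the threat point, but a correlated mixture can.

```python
# Hedged illustration with hypothetical numbers: player 1 can secure
# (4, 0) alone, so a self-enforcing "good and fair" solution must give
# him at least 4. Mixing the outcomes ("C","C") = (3, 3) and
# ("D","C") = (5, 0) achieves that while leaving player 2 better off.

threat = (4, 0)  # what player 1 reaches by himself (player 2 left with 0)

p = 0.5  # probability weight on ("C", "C")
mixed = (p * 3 + (1 - p) * 5, p * 3 + (1 - p) * 0)

# (4.0, 1.5): player 1 matches his guarantee, player 2 strictly gains,
# so neither side prefers to walk away from the joint solution.
print(mixed)  # -> (4.0, 1.5)
print(mixed[0] >= threat[0] and mixed[1] >= threat[1])  # -> True
```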
Wei Dai, thanks. I gave some thought to your comments and they seem to constitute a proof that any “purely geometric” construction (that depends only on the Pareto set) fails your criterion. Amending the post.
Sorry, I was being stupid. You’re right.