Yes. But I don’t see this as a very grave objection, because equilibria in non-zero-sum games are typically vulnerable to manipulation by self-handicapping. A great book on this topic is “Strategy of Conflict” by Thomas Schelling, can’t recommend it enough.
Of course, if I invented a system that was resistant to manipulation, there’d be much rejoicing. But this goal seems still far away if at all possible.
Of course, if I invented a system that was resistant to manipulation, there’d be much rejoicing. But this goal seems still far away if at all possible.
It’s not hard to fix a system so that it is resistant to manipulation: given a set of players, compute how the players would want to modify their utility functions under the original system, then use the original system on the modified utility functions to pick an outcome, then make that the solution under the new system. In the new system, players no longer has any incentive to modify their utility functions.
Of course if you do this, the new system might lose some desirable properties of the original system. But since you’re assuming that the players are self-modifying AIs, that just means your system never really had those properties to begin with.
How a player would want to modify their utility function depends on how other players modify theirs. Schelling introduces “strategic moves” in games: selectively reducing your payoffs under some outcomes. (This is a formalization of unilateral commitments, e.g. burning your bridges.) A simultaneous-move game of strategic moves derived from any simple game quickly becomes a pretty tricky beast. I’ll have to look at it closer; thanks for reminding.
Yes. But I don’t see this as a very grave objection, because equilibria in non-zero-sum games are typically vulnerable to manipulation by self-handicapping. A great book on this topic is “Strategy of Conflict” by Thomas Schelling, can’t recommend it enough.
Of course, if I invented a system that was resistant to manipulation, there’d be much rejoicing. But this goal seems still far away if at all possible.
It’s not hard to fix a system so that it is resistant to manipulation: given a set of players, compute how the players would want to modify their utility functions under the original system, then use the original system on the modified utility functions to pick an outcome, then make that the solution under the new system. In the new system, players no longer has any incentive to modify their utility functions.
Of course if you do this, the new system might lose some desirable properties of the original system. But since you’re assuming that the players are self-modifying AIs, that just means your system never really had those properties to begin with.
Interesting.
How a player would want to modify their utility function depends on how other players modify theirs. Schelling introduces “strategic moves” in games: selectively reducing your payoffs under some outcomes. (This is a formalization of unilateral commitments, e.g. burning your bridges.) A simultaneous-move game of strategic moves derived from any simple game quickly becomes a pretty tricky beast. I’ll have to look at it closer; thanks for reminding.