electroswing comments on Explaining “Hell is Game Theory Folk Theorems”

electroswing 6 May 2023 13:30 UTC
3 points
0
I’m not sure what the author intended, but my best guess is they wanted to say “punishment is bad because there exist really bad equilibria which use punishment, by folk theorems”. Some evidence from the post (emphasis mine):
Rowan: “If we succeed in making aligned AGI, we should punish those who committed cosmic crimes that decreased the chance of a positive singularity sufficiently.”
Neal: “Punishment seems like a bad idea. It’s pessimizing another agent’s utility function. You could get a pretty bad equilibrium if you’re saying agents should be intentionally harming each others’ interests, even in restricted cases.”

[...]

Rowan: “Well, I’ll ponder this. You may have convinced me of the futility of punishment, and the desirability of mercy, with your… hell simulation. That’s… wholesome in its own way, even if it’s horrifying, and ethically questionable.”
Folk theorems guarantee the existence of equilibria with both good (31) and bad (99) payoffs for the players, and both kinds are sustained by the threat of punishment. For this reason I view them as neutral: they say lots of equilibria exist, but not which ones are going to happen.
I guess if you are super concerned about bad equilibria, then you could take a stance against punishment, because then it would be harder/impossible for the everyone-plays-99 equilibrium to form. This could have been the original point of the post but I am not sure.
- Seth Herd 6 May 2023 18:01 UTC
  3 points
  1
  That’s right, I think that was the original point. But this example seems to be a bad one for making that point, because it’s punishing pro-social behavior. If you could show how punishing antisocial, defecting behavior had bad consequences, that would be interesting.
  - Martin Randall 6 May 2023 23:40 UTC
    3 points
    0
    The solution to the infinitely iterated prisoner’s dilemma given in the opening post is an example of potentially bad consequences from punishing antisocial, defecting behavior. A single defection at any point in the infinitely long game causes both players to get the worst outcome in the limit.
    
    If both players are perfectly rational and error-free then this is not fatal. However, a better strategy with reduced punishment and some mercy gets better outcomes in scenarios where those assumptions don’t hold.
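
    A minimal sketch of this point, under assumptions that are mine and not from the opening post: textbook prisoner's dilemma payoffs (3 for mutual cooperation, 1 for mutual defection, 5 and 0 for a one-sided defection), a finite game standing in for the infinite one, and win-stay, lose-shift ("Pavlov") standing in for "a strategy with reduced punishment and some mercy". The function names and the `error_round` parameter are hypothetical. Both players use the same strategy, and player A makes one accidental defection.

    ```python
    # Row player's payoff for (my move, opponent's move); assumed textbook values.
    PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


    def grim_trigger(mine, theirs):
        """Cooperate until the opponent has ever defected, then defect forever."""
        return "D" if "D" in theirs else "C"


    def pavlov(mine, theirs):
        """Win-stay, lose-shift: cooperate iff both players made the same move last round."""
        if not mine:
            return "C"  # open with cooperation
        return "C" if mine[-1] == theirs[-1] else "D"


    def play(strategy, rounds, error_round=None):
        """Both players follow `strategy`; A's move is forced to "D" at `error_round`.

        Returns the total payoffs (A, B) over all rounds.
        """
        a_hist, b_hist = [], []
        a_total = b_total = 0
        for t in range(rounds):
            a = strategy(a_hist, b_hist)
            b = strategy(b_hist, a_hist)
            if t == error_round:
                a = "D"  # the single accidental defection
            a_hist.append(a)
            b_hist.append(b)
            a_total += PAYOFF[(a, b)]
            b_total += PAYOFF[(b, a)]
        return a_total, b_total
    ```

    With these assumptions, `play(grim_trigger, 1000, error_round=10)` leaves both players with 1023 points, close to the all-defect total of 1000, while `play(pavlov, 1000, error_round=10)` gives 3000 and 2995, close to the error-free total of 3000 each. Pavlov punishes the slip with a single round of mutual defection and then returns to cooperation, whereas the unforgiving strategy never recovers.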
    - Seth Herd 7 May 2023 0:10 UTC
      3 points
      0
      I agree, that’s a way better example because that type of punishment sounds like a potentially good strategy on the face of it.