The only thing I learned from that original post is that, if you use mathematically precise axioms of behavior, then you can derive weird conclusions from game theory. This part is obvious and seems rather uninteresting.
The strong claim of the original post, namely the hell scenario, comes from back-porting the conclusions of this mathematical rigor onto our intuitions about a non-rigorous scenario the post suggests.
But you cannot do this unless you've confirmed that there is a proper correspondence between your axioms and the scenario.
For example, the scenario suggests that humans are imprisoned, but the axioms require unwilling, unchanging, inflexible robots. It assumes that each agent's entire utility derives from its personal suffering from the temperature, so these suffering entities cannot be empathetic humans. And each robot is supposed to assume that if it changes its strategy, none of the others will do the same.
And so on. Once we've gone through the implications of all these axioms, we're still left with a scenario that seems bad for those inside it. But it doesn't deserve to be called hell, nor to invoke our intuitions about how unpleasant it would be to experience personally. If humans were in those cages, that strategy profile absolutely would not be stable. If the math says otherwise, then the unrealistic axioms are at fault.
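The unilateral-deviation assumption is exactly what makes the miserable profile "stable" on paper: a profile is a Nash equilibrium if no *single* player can gain by changing strategy alone, which says nothing about what coordinating agents would do. A minimal sketch of that gap, using a hypothetical three-player game with made-up payoffs (none of these numbers come from the original post):

```python
# Hypothetical payoffs, purely for illustration: 'hot' is the miserable
# status quo; a lone 'cool' deviator is punished, but if everyone switches
# to 'cool' together, everyone is better off.
N = 3  # assumed number of players

def payoff(profile, i):
    """Payoff to player i given a tuple of actions."""
    if profile[i] == 'hot':
        return -99           # everyone suffers at the status quo
    # player i chose 'cool'
    if all(a == 'cool' for a in profile):
        return 10            # coordinated deviation: everyone gains
    return -1000             # lone deviator is punished

def is_nash(profile):
    """Stable under *unilateral* deviation: no single player can gain alone."""
    for i in range(N):
        for alt in ('hot', 'cool'):
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoff(deviated, i) > payoff(profile, i):
                return False
    return True

all_hot = ('hot',) * N
all_cool = ('cool',) * N

print(is_nash(all_hot))   # True: the miserable profile is a Nash equilibrium
print(all(payoff(all_cool, i) > payoff(all_hot, i)
          for i in range(N)))  # True: joint deviation helps every player
```

Note that all-'cool' is also an equilibrium in this toy game; the point is only that Nash stability of the bad profile tells us nothing about its stability for humans who can communicate and deviate together.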