Interesting. There’s a paradox involving a game in which players successively take a single coin from a large pile of coins. At any time a player may choose instead to take two coins, at which point the game ends and all further coins are lost. You can prove by induction that if both players are perfectly selfish, they will take two coins on their first move, no matter how large the pile is. People find this paradox impossible to swallow because they model perfect selfishness on the most selfish person they can imagine, not on a mathematically perfect selfishness machine. It’s nice to have an “intuition pump” that illustrates what genuine selfishness looks like.
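
To make the induction argument concrete, here is a minimal sketch in Python. It rests on assumptions the description above does not spell out: two players strictly alternate turns, a player may take two coins only when at least two remain, and each player cares only about their own coin count. The function name `backward_induction` and the tuple layout are hypothetical, chosen for illustration.

```python
def backward_induction(pile):
    """Solve the coin game from the end of the pile backwards.

    value[n] is (payoff to the player about to move,
                 payoff to the other player,
                 the mover's best action) when n coins remain.
    Assumes strict alternation and purely selfish players.
    """
    value = [(0, 0, None)]  # empty pile: nothing left for anyone
    for n in range(1, pile + 1):
        # After taking one coin, the opponent becomes the mover at n - 1.
        next_mover, next_other, _ = value[n - 1]
        take_one = (1 + next_other, next_mover, "take 1")
        # Taking two ends the game; all remaining coins are lost.
        take_two = (2, 0, "take 2")
        if n >= 2 and take_two[0] > take_one[0]:
            value.append(take_two)
        else:
            value.append(take_one)
    return value


values = backward_induction(100)
first_move = values[100]
```

Under these assumptions, `first_move` is `(2, 0, "take 2")`: the first player grabs two coins immediately, and the same action is optimal at every pile size of two or more. The reason is visible in the loop: taking one coin hands the opponent a position where they take two, so the mover ends up with one coin instead of two.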

Hmm. We could also put that one in terms of a human or FAI competing against a paperclip maximizer, right? The two players would successively save one human life or create one paperclip (respectively), up to some finite limit on the sum of both quantities.

If both were TDT agents (and each knew that the other was a TDT agent), would they successfully cooperate for the most part?

In the original version of this game, is it turn-based, or are both players considered to be acting simultaneously in each round? If it is simultaneous, then it seems to me that the paperclip-maximizing TDT and the human[e] TDT would just create one paperclip at a time and save one life at a time until the “pile” is exhausted. I'm not quite sure what would happen if the game is turn-based, but if the pile size is even, I’d expect about the same thing to happen, and if it is odd, they’d probably be able to coordinate successfully (without necessarily communicating), maybe by flipping a coin when two pile-units remain and then acting so as to ensure that the expected distribution is equal.
