All that matters is that the way you make decisions about prisoner’s dilemmas is identical; all other characteristics of the players can differ. And that’s exactly what the premise of “both players are rational” does; it ensures that their decision-making processes are replicas of each other.
That’s not quite all that matters; it also has to be common knowledge among the players that they are similar enough to make their decisions for the same reasons. So it’s not enough to just say “I’m a functional decision theorist, therefore I will cooperate with anyone else who claims to follow FDT”, because then someone can just come along, claim to follow FDT to trick you into cooperating unconditionally, and then defect in order to get more utility for themselves.
Alternatively, your opponent might be the one confused about whether they are an FDT agent or not, and will just play cooperate unconditionally as long as you tell them a convincing-enough lie. Deontological issues aside, if you can convince your opponent to cooperate regardless of your own decision, then any decision theory (or just common sense) will tell you that you should play defect.
The point is, you might think you’re an FDT agent, or want to be one, but unless you’re actually good enough at modeling your counterparty in detail (including modeling their model of you), and you are reasonably confident that your counterparty possesses the same skills, such that your decisions really do depend on each other’s private mental states, one of you might actually be closer to a rock with “cooperate” written on it, and actual FDT agents (or anyone clever enough to notice) will correctly defect against you.
Among humans, making your own decision process legible enough, and correlated enough with your opponent’s, is non-trivial at best; an AI that can modify and exhibit its own source code probably has an easier time reaching mutual cooperation, at least with other AIs. I wrote a bit more about this here.
I agree with everything you say there. Is this intended as disagreement with a specific claim I made? I’m just a little confused what you’re trying to convey.
If you agree with everything he said, then you don’t think rational agents cooperate on this dilemma in any plausible real-world scenario, right? Even superintelligent agents aren’t going to have full and certain knowledge of each other.
No? Like I explained in the post, cooperation doesn’t require certainty, just that the expected value of cooperation is higher than that of defection. With the standard payoffs, rational agents cooperate as long as they assign greater than 75% credence to the other player making the same decision as they do.
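The expected-value comparison behind that 75% figure can be made concrete. As a sketch, assume payoffs R (mutual cooperation), S (you cooperate, they defect), T (you defect, they cooperate), and P (mutual defection) of (2, 0, 3, 1); these particular numbers are an assumption on my part, chosen because they reproduce the 75% threshold:

```python
def cooperation_threshold(R, S, T, P):
    """Smallest credence p (that the other player decides as you do)
    at which cooperating beats defecting in expected value.

    EV(cooperate) = p*R + (1-p)*S
    EV(defect)    = p*P + (1-p)*T
    Setting these equal and solving for p:
        p*(R - S) + S = p*(P - T) + T
        p = (T - S) / (R - S + T - P)
    """
    return (T - S) / (R - S + T - P)

# Assumed payoffs: R=2, S=0, T=3, P=1
print(cooperation_threshold(R=2, S=0, T=3, P=1))  # 0.75
```

With the classic (3, 0, 5, 1) payoffs the same formula gives a threshold of 5/7 ≈ 71%, so the exact cutoff depends on the payoff matrix, but the qualitative point stands: cooperation only needs sufficiently high credence, not certainty.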
Not really disagreeing with anything specific, just pointing out what I think is a common failure mode where people first learn of decision theories better than CDT, say “aha, now I’ll cooperate in the prisoner’s dilemma!”, and then get defected on. There’s still some additional cognitive work required to actually implement a decision theory yourself, which is distinct from both understanding that decision theory and wanting to implement it. Not claiming you yourself don’t already understand all this, but I think it’s important as a disclaimer in any piece intended to introduce people previously unfamiliar with decision theories.
Ah, I see. That’s what I was trying to get at with the probabilistic case of “you should still cooperate as long as there’s at least a 75% chance the other person reasons the same way you do”, and the real-world examples at the end, but I’ll try to make that more explicit. Becoming cooperate-bot is definitely not rational!