Hi all,
(First comment here. Please tell me if I do something stupid.)
So, I’ve been trying to follow along at home and figure out how to formulate a theory that would allow us to formalize and justify the intuition that we should cooperate with Clippy “if that is the only way to get Clippy to cooperate with us” (even in a non-iterated PD). I’ve run into problems with both the formalizing and the justifying part (sigh), but at least I’ve learned some lessons along the way that were not obvious to me from the posts I’ve read here so far. (How’s that for a flexible disclaimer.)
Starting with the easy part: The situation is that both players are physical processes, and, based on their previous interactions, each has some Bayesian beliefs about what physical process the other player is. This breaks the implicit decision-theoretic premise that your payoff depends only on the action you choose, not on the process you use to arrive at that choice. That in turn renders undefined the conclusion that the process you should use is to choose the action that maximizes the expected payoff (because the expected payoff may depend on the process you use), which undermines the justification for saying that you should not play a strictly dominated strategy.
However, it seems to me that we can salvage the classical notions if, instead of asking “what should the physical process do?” we ask, “what physical process should you want to be?” I.e., we create a new game, in which the strategies are possible physical decision-making processes, and then we use classical decision/game theory to ask which strategy we should choose. (It seems to me that this is essentially what Eliezer has in mind.)
Formulating the problem like this helps me realize that for every strategy (physical process) X available to one player, one strategy available to the other player is, “cooperate iff the first player’s strategy is X”; call this strategy require(X). This means that even if we assume that both players are rational (in the new game, in the classic sense) and this is common knowledge, any strategy X might still be adopted: X is the best response to require(X), which is the best response to require(require(X)), which is the best response to… and so on. This is why I don’t see an easy way to justify that “cooperate iff it’s the only way to get the other player to cooperate” is the “right” thing to do (although I still hope that there is a way to justify this). (One possible angle of attack: Would the problem go away if we supposed a maximum size for the processes, so that there would be only a finite number, so that there wouldn’t be a require(X) for every X?)
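To make require(X) concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: strategies are modeled as functions that receive the opponent's strategy object, "is X" means object identity, the payoff numbers are the usual textbook PD values, and the names `always_cooperate`, `always_defect`, `payoff` and `best_responses` are made up. It only checks best responses within a small hand-picked candidate list, not over all possible processes.

```python
# Row player's payoff for (my move, their move); illustrative PD values.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def always_cooperate(opponent):
    return "C"


def always_defect(opponent):
    return "D"


def require(x):
    """Build the strategy 'cooperate iff the opponent's strategy is x'."""
    def strategy(opponent):
        return "C" if opponent is x else "D"
    return strategy


def payoff(me, other):
    """Payoff to `me` when each strategy is shown the other one."""
    return PAYOFF[(me(other), other(me))]


def best_responses(candidates, opponent):
    """Candidates that score highest against `opponent`."""
    scores = {s: payoff(s, opponent) for s in candidates}
    top = max(scores.values())
    return [s for s in candidates if scores[s] == top]


req_coop = require(always_cooperate)
req_defect = require(always_defect)
candidates = [always_cooperate, always_defect, req_coop, req_defect]

# Against require(X), X is the only candidate that gets cooperated with.
best_vs_req_coop = best_responses(candidates, req_coop)
best_vs_req_defect = best_responses(candidates, req_defect)
```

In this toy setting `best_vs_req_coop` contains only `always_cooperate` and `best_vs_req_defect` contains only `always_defect`, which is the point of the paragraph above: whichever X you pick, there is an opponent strategy against which X does best.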
The other lesson is about how to even formalize “I cooperate iff it’s the only way to get the other player to cooperate with me.” Once we have chosen our physical process, and the other side has chosen its, it’s already determined whether the other player will cooperate with us. But before we have chosen our physical process, what is the “me” that the informal description refers to?
It seems to me that the “right” way to formalize a constraint like that is as follows:
1. Initialize S, the set of processes we might choose, to the set of all possible processes.
2. Remove from S all processes that do not match the constraints, if the “me” in the constraint is any process in S. (We cooperate with Clippy only if it’s the only way to get Clippy to cooperate with us; thus, if Clippy cooperates with every process in S, then we want to defect against Clippy; thus, remove all processes from S that cooperate with unconditional cooperators.)
3. Repeat, until S converges. (“A transfinite number of times” if necessary; I don’t want to get into that here...)
4. Choose any process from S. (If S ends up empty, your constraints are contradictory.)
So far, so good; but I don’t yet even begin to see how to show that the S generated by the constraint is not empty, or how to construct a member of it.
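For what it's worth, the four steps above can at least be run on a tiny finite universe. The sketch below is a toy under loud assumptions: there are only four hypothetical processes, each one decides purely from the opponent's name (so there is no self-reference to worry about), "Clippy" may be any of the four, and the only constraint applied is the example from step 2 (drop every process that cooperates with something that cooperates with all of S). It says nothing about whether S is non-empty in the general case.

```python
C, D = "C", "D"

# Hypothetical toy processes: each maps an opponent's name to a move.
PROCESSES = {
    "cooperate_bot": lambda opp: C,
    "defect_bot": lambda opp: D,
    "exploit_pushovers": lambda opp: D if opp == "cooperate_bot" else C,
    "mirror_only": lambda opp: C if opp == "mirror_only" else D,
}


def refine(processes):
    """Iterate step 2 until the surviving set S stops changing."""
    survivors = set(processes)  # step 1: start with every process
    while True:
        # Opponents that cooperate with every process still in S.
        pushovers = {
            q for q, play in processes.items()
            if all(play(p) == C for p in survivors)
        }
        # Step 2: keep only processes that defect against all of them.
        keep = {
            p for p in survivors
            if all(processes[p](q) == D for q in pushovers)
        }
        if keep == survivors:  # step 3: S has converged
            return survivors
        survivors = keep


S = refine(PROCESSES)
```

Here the set shrinks in two rounds: `cooperate_bot` goes first, and once it is gone `exploit_pushovers` cooperates with everything left in S, so it goes too. Any member of the resulting `S` would do for step 4; an empty result would mean the constraint is contradictory on this universe.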
(Argh, I’m afraid I’ve already done something stupid by allowing this comment to get so long. Sorry :-/)
In this game require(X) is not a valid strategy because you don’t have access to the strategy your opponent uses, only to the decisions you’ve seen it make. In particular, without additional assumptions we can’t assume any correlation between a player’s moves.