I don’t understand why you want the AIs to defect against each other rather than cooperating with each other.
Are you attached to this particular failure of causal decision theory for some reason? What’s wrong with TDT agents cooperating in the Prisoner’s Dilemma and everyone living happily ever after?
I don’t understand why you want the AIs to defect against each other rather than cooperating with each other.
Come on, of course I don’t want that. I’m saying that it’s the inevitable outcome under the rules of the game I specified. It’s just as if I said “I don’t want two human players to defect in a one-shot PD, but that is what’s going to happen.”
ETA: Also, it may help if you think of the outcome as the human players defecting against each other, with the AIs just carrying out their strategies. The human players are the real players in this game.
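To make that concrete, here is a toy dominance check in Python. The payoff numbers (temptation 5, mutual cooperation 3, mutual defection 1, sucker 0) are just the standard illustrative values, not part of the game I specified:

```python
# Toy illustration of the dominance argument in a one-shot
# Prisoner's Dilemma, with the usual hypothetical payoffs
# T=5 > R=3 > P=1 > S=0.

PAYOFF = {  # (my move, their move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

for their_move in ("C", "D"):
    best_reply = max("CD", key=lambda my_move: PAYOFF[(my_move, their_move)])
    print(f"Against {their_move}, my best reply is {best_reply}")

# Prints "D" both times: defection strictly dominates, so a player
# reasoning causally defects no matter what the other player does.
```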
Are you attached to this particular failure of causal decision theory for some reason?
No, I can’t think of a reason why I would be.
What’s wrong with TDT agents cooperating in the Prisoner’s Dilemma and everyone living happily ever after?
There’s nothing wrong with that, and it may yet happen, if it turns out that the technology for proving source code can be created. But if you can’t prove that your source code is some specific string, and the only thing you have to go on is that you and the other AI must both use the same decision theory due to convergence, then that isn’t enough to get cooperation.
Sorry if I’m repeating myself, but I’m hoping one of my explanations will get the point across...
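One more attempt, then, as a toy Python sketch in the style of the program-equilibrium literature. The agent names and the source-comparison mechanism here are my own illustration; the sketch assumes that “proving your source code” amounts to letting the other program verify your exact program text:

```python
import inspect

def clique_agent(opponent_source: str) -> str:
    """Cooperate iff the opponent's verified source equals my own."""
    my_source = inspect.getsource(clique_agent)
    return "C" if opponent_source == my_source else "D"

def defect_bot(opponent_source: str) -> str:
    """Always defect, whatever the opponent's source says."""
    return "D"

def play(agent_a, agent_b):
    """One-shot PD in which each program sees the other's proven source."""
    return (agent_a(inspect.getsource(agent_b)),
            agent_b(inspect.getsource(agent_a)))

print(play(clique_agent, clique_agent))  # ('C', 'C'): proof of identical source
print(play(clique_agent, defect_bot))    # ('D', 'D'): mismatch, so defect
```

If you can’t prove your source code, the equality check above is unavailable: knowing only that both AIs converged on the same decision theory gives you no string to compare against.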
Come on, of course I don’t want that. I’m saying that it’s the inevitable outcome under the rules of the game I specified. It’s just as if I said “I don’t want two human players to defect in a one-shot PD, but that is what’s going to happen.”
I don’t believe that is true. It’s perfectly conceivable that two human players would cooperate.
Yes, I see the possibility now as well, although I still don’t think it’s very likely. I wrote more about it in http://lesswrong.com/lw/15m/towards_a_new_decision_theory/11lx