If it succeeds in being the first to do so, the other side then has no choice but to accept.
This presumes that the other side obeys standard causal decision theory; in fact, it’s an illustration of why causal decision theory is vulnerable to exploitation if precommitment is available, and suggests that two selfish rational CDT agents who each have precommitment options will generally wind up sabotaging each other.
This is a reason to reject CDT as the basis for instrumental rationality, even if you’re not worried that Omega is lurking around the corner.
You can reject CDT, but what are you going to replace it with? Until Eliezer publishes his decision theory and I have a chance to review it, I’m sticking with CDT.
I thought cousin_it’s result was really interesting because it seems to show that agents using standard CDT can nevertheless convert any game into a cooperative game, as long as they have some way to prove their source code to each other. My comment was made in that context, pointing out that the mechanism for proving source code needs to have a subtle property, which I termed “consensual”.
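To make the source-proving mechanism concrete, here is a minimal sketch (all names hypothetical, not cousin_it's actual construction) of how mutual source inspection can turn a one-shot Prisoner's Dilemma into a cooperative game: each agent cooperates exactly when the other agent's source matches its own.

```python
# Hypothetical sketch: agents that can verify each other's source code.
# "Cooperate iff your source equals mine" makes mutual cooperation stable
# between copies, while still defending against defectors.

def clique_bot(my_source: str, opponent_source: str) -> str:
    """Cooperate exactly when the opponent is provably running the same program."""
    return "C" if opponent_source == my_source else "D"

# Standard Prisoner's Dilemma payoffs, indexed by (my_move, their_move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(src_a: str, src_b: str):
    """Run both agents against each other's (verified) source; return payoffs."""
    move_a = clique_bot(src_a, src_b)
    move_b = clique_bot(src_b, src_a)
    return PAYOFF[(move_a, move_b)], PAYOFF[(move_b, move_a)]

SRC = "clique_bot_source"  # stand-in for the actual program text

print(play(SRC, SRC))           # two copies cooperate: (3, 3)
print(play(SRC, "other_bot"))   # source mismatch: mutual defection, (1, 1)
```

The whole scheme rests on the verification step being trustworthy in both directions, which is exactly where the "consensual" property matters: each side must be unable to prove a source it isn't actually running.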
One obvious “upgrade” to any decision theory that has such problems is to discard all of your knowledge (data, observations) before making any decisions (save for some structural knowledge, to leave the decision algorithm nontrivial). For each decision that you would make (using a given decision algorithm) while knowing X, you instead make a conditional decision (using the same decision algorithm) that says “If X, then A, else B”, and only then recall whether X is actually true. This, for example, mends the particular failure of not being able to precommit: you remember that you are on the losing branch only after you’ve already made the decision to take a certain disadvantageous action if you are on the losing branch.
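A toy numerical illustration of this upgrade (the payoffs are hypothetical, chosen only to make the comparison vivid): Omega flips a fair coin, and on the winning branch pays you $100 if and only if you would have paid $10 on the losing branch.

```python
# Toy model of "decide the conditional policy before recalling which branch
# you are on". Payoffs are illustrative assumptions, not from the source.

def expected_value(pays_on_losing_branch: bool) -> float:
    """Expected value of the conditional policy, evaluated before
    recalling which branch of the fair coin flip you are on."""
    win = 100 if pays_on_losing_branch else 0    # reward for having the policy
    lose = -10 if pays_on_losing_branch else 0   # cost paid on the losing branch
    return 0.5 * win + 0.5 * lose

# Deciding AFTER learning "I am on the losing branch", a plain CDT agent
# compares -10 to 0 and refuses to pay. Deciding the policy "if losing,
# then pay" BEFORE recalling the branch, the same algorithm compares:
print(expected_value(True))    # 45.0 -> the paying policy wins
print(expected_value(False))   # 0.0
```

The agent's algorithm is unchanged; only the information available at decision time moves, which is what lets an ordinary expected-value calculation recover the effect of precommitment.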
You can claim that you are using such a decision theory and hence that I should find your precommitments credible, but if you have no way of proving this, then I shouldn’t believe you, since it is to your advantage to have me believe you are using such a decision theory without actually using it.
From your earlier writings I think you might be assuming that AIs would be intelligent enough to just know what decision algorithms others are using, without any explicit proof procedure. I think that’s an interesting possibility to consider, but not a very likely one. But maybe I’m missing something. If you wrote down any arguments in favor of this assumption, I’d be interested to see them.
That was an answer to your question about what you should replace CDT with. If you aren’t able to convince other agents that you now run on timeless CDT, you gain a somewhat smaller advantage than otherwise, but that’s a separate problem. If you know that your claims of precommitment won’t be believed, you simply don’t precommit. But sometimes, you’ll find a better solution than if you only lived in the moment.
Also note that even if you do convince other agents of the abstract fact that your decision theory is now timeless, it won’t help you very much, since it doesn’t prove that you’ll precommit in any specific situation. You only precommit in a given situation if you know that this action makes the situation better for you, which, in the case of cooperation, means that the other side must be able to tell whether you actually precommitted—and that is not at all the same as being able to tell what decision theory you use.
Since using a decision theory with precommitment is almost always an advantage, it’s easy to assume that a sufficiently intelligent agent always uses something of the sort, but that doesn’t allow you to know more about their actions—in fact, you know less, since such an agent has more options available.
But sometimes, you’ll find a better solution than if you only lived in a moment.
Yes, I see that your decision theory (is it the same as Eliezer’s?) gives better solutions in the following circumstances:
- dealing with Omega
- dealing with copies of oneself
- cooperating with a counterpart in another possible world
Do you think it gives better solutions in the case of AIs (who don’t initially think they’re copies of each other) trying to cooperate? If so, can you give a specific scenario and show how the solution is derived?