Seems simple enough to me, too, as my answer yesterday implied. The probability the Earth is that young is close enough to 0 that it doesn’t factor into my utility calculations, so Omega is asking me if I want to save a billion people. Do whatever you have to do to convince him, then save a billion people.
With this attitude, you won't be able to convince him. He'll expect you to defect, no matter what you say. It's obvious to you what you'll do, and it's obvious to him. By refusing to save a billion people, and instead choosing the meaningless alternative option, you perform an instrumental action that results in your opponent saving 2 billion people. You control the other player indirectly.
Choosing the option other than saving 1 billion people doesn’t have any terminal value, but it does have instrumental value, more of it than there is in directly saving 1 billion people.
This is not to say that you can place this kind of trust easily; with humans you may indeed need to make a tangible precommitment. Humans are by default broken: in some situations you don't expect the right actions from them, the way you don't expect the right actions from rocks. An external precommitment is a crutch that compensates for these inborn ailments.
What makes us assume this? I get why in examples where you can see each other's source code this can be the case, and I do one-box on Newcomb's Problem, where a similar situation is given, but I don't see how we can presume that there is this kind of instrumental value. All we know about this person is that he is a flat-earther, and I don't see how this corresponds to such efficient lie detection in both directions for both of us.
Obviously, if we had a tangible precommitment option that was sufficient when a billion lives were at stake, I would take it. And I agree that if the payoffs were 1 person vs. 2 billion people on both sides, this would be a risk I'd be willing to take. But I don't see how we can suppose that the correspondence between "he thinks I will choose C if he agrees to choose C, and in fact then chooses C" and "I actually intend to choose C if he agrees to choose C" is all that high. If the flat-earther in question is the person on whom they based Dr. Cal Lightman, I still don't choose C, because I'd feel that even if he believed me he'd probably choose D anyway. Do you think most humans are this good at lie detection (I know that I am not), and if so, do you have evidence for it?
What does the source code really impart? Certainty in the other process’ workings. But why would you need certainty? Is being a co-operator really so extraordinary a claim that to support it you need overwhelming evidence that leaves no other possibilities?
The problem is that there are three salient possibilities for what the other player is:
Defector, who really will defect, and will give you evidence of being a defector
Co-operator, who will really cooperate (with another who he believes to be a co-operator), and will give you evidence of being a co-operator
Deceiver, who will really defect, but will contrive evidence that he is a co-operator
Between co-operator and deceiver, all else equal, you should expect the evidence given by the co-operator to be stronger than the evidence given by the deceiver. The deceiver has to support a complex edifice of lies, separate from reality, while the co-operator can rely on the whole of reality for support of his claims. As a result, each argument a co-operator makes should on average bring you closer to believing that he really is a co-operator, as opposed to being a deceiver. This process may be too slow to shift your expectation from a prior of very strongly disbelieving in the existence of co-operators to a posterior of believing that this one really is a co-operator, and this may be a problem. But this problem is only as dire as the rarity of co-operators and the deceptive eloquence of deceivers.
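A minimal sketch of this updating dynamic, with invented likelihoods (none of these numbers come from the discussion): each argument that checks out is slightly more likely to come from a co-operator than from a deceiver, so the posterior creeps toward the truth.

```python
# Hedged illustration: iterated Bayesian updating on "co-operator vs. deceiver".
# Every number below is invented for this sketch.

belief = 0.05  # prior: strong disbelief in the existence of co-operators

# A co-operator's arguments check out a bit more often than a deceiver's,
# since the deceiver must prop up a separate edifice of lies.
p_checks_out_if_cooperator = 0.9
p_checks_out_if_deceiver = 0.7

for step in range(1, 21):
    # Bayes' rule after one more argument that checks out.
    numerator = belief * p_checks_out_if_cooperator
    belief = numerator / (numerator + (1 - belief) * p_checks_out_if_deceiver)
    print(f"after argument {step:2d}: P(co-operator) = {belief:.3f}")
```

With these made-up numbers it takes about twenty arguments to climb from 5% to roughly 89%, which illustrates both halves of the point: the evidence does accumulate toward the truth, but the process can be slow.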
We clearly disagree strongly on the probabilities here. I agree that, all things being equal, you have a better shot at convincing him than I do, but I think it is small. We both do the same thing in the Defector case. In the co-operator case, he believes you with probability P+Q and me with probability P. Assuming you know whether he trusts you in this case (we count anything else as deceivers), you save (P+Q)·2 + (1-P-Q)·1, I save P·3 + (1-P)·1, both times the percentage of co-operators, R. So you have to be at least twice as successful as I am, even if there are no deceivers on the other side. Meanwhile, there's some percentage A who are deceivers and some probability B that you'll believe a deceiver, or just A and 1 if you count anyone you don't believe as a simple Defector.
You think that R·(P+Q)·2 + R·(1-P-Q)·1 > R·P·3 + R·(1-P)·1 + A·B·1. I strongly disagree. But if you convinced me otherwise, I would change my opinion.
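For concreteness, here is a throwaway numerical version of that comparison. Every probability below is invented; the payoffs are in billions of lives, following the thread.

```python
# Hedged sketch of the expected-value inequality above.
# All probability values are made up for illustration.

R = 0.5   # fraction of co-operators on the other side
P = 0.2   # probability he believes a defector's cheap talk
Q = 0.2   # extra credibility a genuine co-operator earns
A = 0.3   # fraction of deceivers
B = 0.5   # probability you believe a deceiver

# Cooperate-if-believed strategy (left side of the inequality).
ev_cooperator = R * ((P + Q) * 2 + (1 - P - Q) * 1)

# Always-defect strategy, plus the deceiver term (right side).
ev_defector = R * (P * 3 + (1 - P) * 1) + A * B * 1

print(f"co-operator strategy: {ev_cooperator:.2f} billion expected")
print(f"defector strategy:    {ev_defector:.2f} billion expected")
```

With these particular numbers the defector side comes out ahead (0.85 vs. 0.70); the whole disagreement is over what P, Q, A, and B actually are.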
Here’s an older thread about this
In the co-operator case, he believes you with probability P+Q and me with probability P.
That may be so for one step, but my point is that the truth ultimately should win over lies. If you proceed to the next point of argument, you expect to distinguish a co-operator from a deceiver a little bit better, and as the argument continues, your ability to distinguish the possibilities should improve more and more.
The problem may be that it's not a fast enough process, but not that there is some fundamental limitation on how good the evidence may get. If you study the question thoroughly, you should be able to move a long way away from uncertainty, in the direction of truth.
By refusing to save a billion people, and instead choosing the meaningless alternative option, you perform an instrumental action that results in your opponent saving 2 billion people.
How does it do that, please? How does my action affect his?
Maybe it's not enough; maybe you need to do more than just the right thing. But if you actually plan to defect, you have no hope of convincing the other player that you won't. (See the revised last paragraph of the above comment.)
Why? My opponent is not a mind-reader.
Yes, if we can both pre-commit in a binding way, that’s great. But what if we can’t?
I feel that this is related to the intuitions on free will. When a stone is thrown your way, you can't change what you'll do: you'll either duck, or you won't. If you duck, it means that you are a stone-avoider, a system that has the property of avoiding stones, that processes data indicating the fact that a stone is flying your way and transforms it into impact-avoiding action.
A precommitment is only useful because [you+precommitment] is a system with the known characteristic of a co-operator, one that cooperates in return with other co-operators. What you need in order to arrange mutual cooperation is to signal to the other player that you are a co-operator, and to make sure that the other player is also a co-operator. Signaling the fact that you are a co-operator is easy if you attach a precommitment crutch to your natural decision-making algorithm.
Since mutual co-operators win more than mutual defectors, being a co-operator is rational, and so it's often just said that if you and your opponent are rational, you'll cooperate.
There is a stigma of being just human, but I guess some kind of co-operator certification or a global meta-commitment of reflective consistency could be arranged to both signal that you are now a co-operator and enforce actually making co-operative decisions.
Instead of answering AllanCrossman's question, you have provided a stellar example of how scholasticism turns brains to mush. Read this.
Update 2: maybe, to demonstrate my point, I should quote some hilarious examples of faulty thinking from the article I linked to. Here we go:
19 Three is not an object at all, but an essence; not a thing, but a thought; not a particular, but a universal.
28 The number three is neither an idle Platonic universal, nor a blank Lockean substratum; it is a concrete and specific energy in things, and can be detected at work in such observable processes as combustion.
32 Since the properties of three are intelligible, and intelligibles can exist only in the intellect, the properties of three exist only in the intellect.
35 We get the concept of three only through the transcendental unity of our intuitions as being successive in time.
Ring any bells?
If you think Vladimir is being opaque with his writing, and you disagree with his conclusion, that is not the same as asserting that he’s writing nonsense. Charity (and the evidence of his usual clarity) demand that you ask for clarification before accusing him of such.
Instead of answering AllanCrossman's question, you have provided a stellar example of how scholasticism turns brains to mush.
Actually, I thought that I made a relatively clear argument, and I'm surprised that it's not upvoted (the same goes for the follow-up here). Maybe someone could constructively comment on why that is. I expect that the argument is not easy to understand, and maybe I failed to see the inferential distance between my argument and the intended audience, so that people who understood the argument already consider it too obvious to be of note, and people who disagree with the conclusion didn't understand the argument… Anyway, any constructive feedback on the meta level would be appreciated.
On the concept of avoiders, see Dennett’s lecture here. Maybe someone can give a reference in textual form.
Uh...
AllanCrossman asked: what if we can’t precommit?
You answered: it’s good to be able to precommit, maybe we can still arrange it somehow.
Thus simplified, it doesn’t look like an answer. But you didn’t say it in simple words. You added philosophical fog that, when parsed and executed, completely cancels out, giving us no indication how to actually precommit.
Disagree?
My reply can be summarized as explaining why "precommitting in a binding way" is not a clear-cut necessity for this problem. If you are a cooperator, there is no need to precommit.
In your terms, being a cooperator for this specific problem is synonymous with precommitting. You're just shunting words around. All right, how do I actually be a cooperator?
No, it's not synonymous. If you precommit, you become a cooperator, but you can also be one without precommitting. If you are an AI that is written to be a cooperator, you'll be one. If you decide to act as a cooperator, you may be one. Being a cooperator is relatively easy. Being a cooperator and successfully signaling that you are one, without precommitment, is in practice much harder. And there is a related problem: if you are a cooperator, you have to recognize a signal that the other person is a cooperator too, which may be too hard if he hasn't precommitted.
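A toy sketch of "an AI that is written to be a cooperator" (my own illustration, not anything specified in the thread): an agent whose decision procedure cooperates exactly when it can verify that the other player runs a cooperating procedure, with equality of source code standing in, crudely, for that verification.

```python
# Hedged toy model: an agent is a function from the opponent's source
# code to a move. This "cooperator" cooperates iff the opponent runs
# the very same procedure; source equality is a crude stand-in for
# verifying the opponent's actual disposition.

import inspect

def cooperator(opponent_source: str) -> str:
    return "C" if opponent_source == inspect.getsource(cooperator) else "D"

def defector(opponent_source: str) -> str:
    return "D"

def play(agent_a, agent_b):
    return (agent_a(inspect.getsource(agent_b)),
            agent_b(inspect.getsource(agent_a)))

print(play(cooperator, cooperator))  # ('C', 'C')
print(play(cooperator, defector))    # ('D', 'D')
```

The hard part pointed at above is exactly what this toy elides: a human can be such a procedure without being able to exhibit its source.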
What? The implication goes both ways. If you’re a cooperator (in your terms), then you’re precommitted to cooperating (in classical terms). Maybe you misunderstand the word “precommitment”? It doesn’t necessarily imply that some natural power forces the other guy to believe you.
If you define precommitment this way, then every property becomes a precommitment to having that property, and the concept of precommitment becomes tautological. For example, is it a precommitment to always prefer good over evil (defined however you like)?
Not every property. Every immutable property. They’re very rare. Your example isn’t a precommitment because it’s not immutable.
What’s “mutable”? Changing in time? Cooperation may be a one-off encounter, with no multiple occasions to change over. You may be a cooperator for the duration of one encounter, and a rock elsewhere. Every fact is immutable, so I don’t know what you imply here.
Yes, mutable means changing in time.
Precommitment is an interaction between two different times: the time when you're doing cheap talk with the opponent, and the time when you're actually deciding in the closed room. The time you burn your ships, and the time your troops go to battle. Signaling time and play time. If a property is immutable (preferably physically immutable) between those two times, that's precommitment. Sounds synonymous with your "being a cooperator" concept.
In other words, my point is that if the signaling is about a future property of yours, then at the moment when you have to perform the promised behavior there is no need for any kind of persistence, and thus, according to your definition, precommitment is unnecessary. Likewise, the signaling doesn't need to consist in you presenting any kind of argument; it may already be known that you are (or will be) a cooperator.
For example, the agent in question may be selected from a register of cooperators, 99% of whom are known to be genuine cooperators. And the cooperators themselves might well be humans who decided to follow this counterintuitive algorithm, and who benefit from doing so when interacting with other known cooperators, without any tangible precommitment system in place and no punishment for failing to be cooperators. This example could be implemented through a reputation system.
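A minimal sketch of such a reputation mechanism (entirely illustrative; the names, statuses, and the idea that de-listing is the only sanction are my assumptions, not the thread's):

```python
# Hedged toy reputation register. Cooperators cooperate with other
# registered cooperators and defect otherwise; the only sanction in
# this scheme is losing one's registration.

reputation = {"alice": "cooperator", "bob": "cooperator", "carol": "unknown"}

def move(me: str, other: str) -> str:
    if reputation.get(me) == "cooperator" and reputation.get(other) == "cooperator":
        return "C"
    return "D"

def record(player: str, other: str, played: str) -> None:
    # Defecting against a registered cooperator costs your registration.
    if reputation.get(other) == "cooperator" and played == "D":
        reputation[player] = "defector"

print(move("alice", "bob"))    # 'C': both are registered cooperators
print(move("alice", "carol"))  # 'D': carol's status is unknown
```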
No such thing as future property. This isn’t a factual disagreement on my part, just a quibble over terms; disregard it.
Your example isn't about signaling or precommitment; it changes the game into a multiple-shot one, modifying the agent's utility function in an isolated play to take into account their reputation for future plays. Yes, it works. But it doesn't help much in true one-shot (or last-play) situations.
On the other hand, the ideal platonic PD is also quite rare in reality—not as rare as Newcomb’s, but still. You may remember us having an isomorphic argument about Newcomb’s some time ago, with roles reversed—you defending the ideal platonic Newcomb’s Problem, and me questioning its assumptions :-)
Me, I feel no moral problem defecting in the pure one-shot PD. Some situations are just bad to be in, and the best way out is bad too. Especially situations where something terribly important to you is controlled by a cold, uncaring alien entity, and the problem has been carefully constructed to prohibit you from manipulating it (Eliezer's "true PD").
No such thing as future property.
In what sense do you mean no such thing? Clearly, there are future properties. My cat has a property of being dead in the future.
Yes, it was just an example of how to set up cooperation without precommitment. It's clear that signaling that you are a one-off cooperator is a very hard problem, if you are only human and there are no Omegas flying around.
My cat has a property of being dead in the future.
Not with probability one, it doesn’t.
This doesn’t place the future in a privileged position. Even though I’m certain I saw my cat 10 minutes ago, it wasn’t alive a week ago with probability one, either.
Sorry. I deleted my comment to acknowledge my stupidity in making it. By now it’s clear that we don’t disagree substantively.
My answer to this would be that people have dispositions to behavior, and these dispositions color everything we do. If one might profit by showing courage, a coward will not do as well as a courageous man.
Of course, the relative success of such people at faking in appropriate situations is perhaps an empirical question.
ETA: this makes less sense as a direct response since you edited your comment. However, I think the difference is that “being a cooperator” regards a disposition that is part of the sort of person you are (though I think the above comment uses it more narrowly as a disposition that might only affect this one action), while a precommitment… well, I’m not sure actual people really do have those, if they’re immutable.
He is no fool either.
I don’t understand.
You need to make it clear how my intention to defect or my intention to cooperate influences the other guy’s actions, even if what I say to him is identical in both cases. Assume I’m a good liar.
Um… are you asserting that deception between humans is impossible?