I think this is also a rather contrived scenario, because if the UDT agents could change their own code (silently), cooperation would immediately break down. So it relies on the CDT agents being able to silently run code that differs from the most common (and thus expected) code, while the UDT agents cannot.
I’m not sure why you say “if the UDT agents could change their own code (silently) cooperation would immediately break down”, because in my view a UDT agent would reason that if it changed its code (to something like CDT, for example), that would logically imply other UDT agents changing their code in the same way, so the expected utility of changing its code would be evaluated as lower than that of not changing it. So it would remain a UDT agent and still cooperate with other UDT agents, or whenever the probability of the other agent being UDT is high enough.
To me this example is about a CDT agent not wanting to become UDT-like if it found itself in a situation with many other UDT agents, which just seems puzzling if your previous perspective was that UDT is a clear advancement in decision theory and everyone should adopt UDT or become more UDT-like.
I think, if you had several UDT agents with the same source code, and then one UDT agent with slightly different source code, you might see the unique agent defect.
I think the CDT agent has an advantage here because it is capable of making distinct decisions from the rest of the population—not because it is CDT.
The general hope is that slight differences in source code (or even large differences, as long as the agents are all using UDT or something close to it) wouldn’t be enough to make a UDT agent defect against another UDT agent (i.e., the logical correlation between their decisions would be high enough). Otherwise, “UDT agents cooperate with each other in one-shot PD” would be false, or would have few practical implications, since why would all UDT agents have exactly the same source code?
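One way to see why this hope matters: if cooperation required byte-identical source code, it would be maximally brittle. A toy sketch (the agent name and the exact-match rule here are illustrative, not any standard formalization of UDT):

```python
# Toy one-shot PD agents that receive both players' source code.
# "clique_bot" cooperates only on an exact source match -- the most
# brittle possible notion of logical correlation between agents.

def clique_bot(my_src: str, their_src: str) -> str:
    """Return 'C' (cooperate) iff the opponent's code is byte-identical."""
    return "C" if their_src == my_src else "D"

# Two byte-identical agents cooperate with each other...
src_a = "clique_bot version 1"
src_b = "clique_bot version 1"
print(clique_bot(src_a, src_b))  # C

# ...but any difference at all (a comment, a version string) breaks
# cooperation, even though both agents reason the same way.
src_c = "clique_bot version 2"
print(clique_bot(src_a, src_c))  # D
```

If “UDT agents cooperate in one-shot PD” depended on this kind of exact match, the claim would indeed say almost nothing about realistic populations of agents.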
There are at least two potential sources of cooperation: symmetry and mutual source code knowledge. Symmetry, I expect, is fragile both to small changes in source code and to asymmetries between the situations of the different parties, while mutual source code knowledge doesn’t require that sort of symmetry at all (though it does require the knowledge).
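The distinction can be made concrete with a toy sketch, assuming agents can read each other’s source (neither bot is a real implementation of UDT, and the “inspection” rule is a deliberately simplistic stand-in for genuine program analysis):

```python
# Two toy cooperation mechanisms in a source-code-visible one-shot PD.
# Both are illustrative only.

def mirror_bot(my_src: str, their_src: str) -> str:
    # Cooperation from symmetry: requires exact sameness, so it is
    # fragile to any difference in code (or, in a richer model,
    # to asymmetries in the agents' situations).
    return "C" if their_src == my_src else "D"

def inspector_bot(my_src: str, their_src: str) -> str:
    # Cooperation from source code knowledge: cooperate with any
    # opponent whose code it can verify is conditionally cooperative,
    # regardless of other differences. (Here "verification" is just a
    # marker check, standing in for actual analysis of the code.)
    return "C" if "COOPERATES_WITH_COOPERATORS" in their_src else "D"

src_x = "# COOPERATES_WITH_COOPERATORS\ninspector, build 1"
src_y = "# COOPERATES_WITH_COOPERATORS\ninspector, build 2"

print(inspector_bot(src_x, src_y))  # C: knowledge survives asymmetry
print(mirror_bot(src_x, src_y))     # D: symmetry is broken
```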
Edit: for some reason my intuition expects cooperation from similarity to be less fragile in the Newcomb’s problem/code knowledge case (similarity to a simulation) than when the similarity is just plain similarity to another, non-simulation agent. I need to think about why, and about whether this has any connection to what would actually happen.
I mean, that’s a thing you might hope to be true. I’m not sure if it actually is true.
I did not realize that the UDT agents were assumed to behave identically; I was thinking that the cooperation was maintained, not by symmetry, but by mutual source code knowledge.
If it’s symmetry, well, if you can sneak a different agent into a clique without getting singled out, that’s an advantage. Again not a problem with UDT as such.
Edit: of course they do behave identically because they did have identical code (which was the source of the knowledge). (Though I don’t expect agents in the same decision theory class to be identical in the typical case).