It’s pretty clear that if your opponent in a symmetric game with the same information is running the exact same deterministic algorithm as you with the exact same computational resources, you’ll have to make the same move.
I think we’re returning to an argument I’ve had before, on free will. Yes, if you are running a deterministic algorithm, this is true. But the assumption that you are running a deterministic algorithm violates one of the prerequisite assumptions needed to even have this conversation: That you can choose an algorithm.
The way you’re describing TDT, it seems to assume that an agent doesn’t actually have a choice at the time of choosing an action.
An agent can make a precommitment, and that may be a good strategy. But only as dictated by game theory. Assuming that the agent does not make a choice during the game is not even game theory.
You are under the annoyingly common misconception that choice between A and B means that whether A or B is chosen is indeterminate (i.e., either incoherent free-will magic or identifying yourself more strongly with your randomness source than with your fixed qualities). What it actually means is that whether A or B is chosen depends on your preferences and your decision-making process.
All right; I stated that incorrectly. We can regard the actions of deterministic agents as “choices”; we do it all the time in game theory. What I was trying to get at is not the choice of A or B, but the choice to use TDT.
It sounds to me like TDT is not addressing the question, “How do I convince an agent to adopt TDT?” It assumes that you’re designing agents, and you design them to implement TDT, so that they can get better results on PD when there are many of them.
But then it’s not fundamentally different from designing agents with a utility function that includes benefits to other agents. In other words, it’s smuggling in an ethical judgement, disguising it as a mathematical truth.
If you’re a human, why would you adopt TDT? The reason must be one that is not an answer to “why would you cooperate?”, nor to “why would you tell people you have adopted TDT?”
You might be able to prove that utility functions and decision theories are equivalent; meaning that any change to a utility function could alternatively be implemented as a change to the decision theory, and vice versa. Thus, asking someone to change a decision theory may be nonsensical.
So, maybe the answer I’m looking for is that TDT is not a thing meant to be offered to humans; it’s a more parsimonious way than ethics of arriving at cooperation on the PD. That would be acceptable.
But to really be useful, TDT must be an evolutionarily stable decision theory. I’m skeptical of that. The “ethics” approach lets evolution pick and choose only those values that help the agent. The TDT approach is not so flexible; it may force the agent to cooperate in situations where cooperation does not benefit it.
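To make the stability question concrete, here is a toy sketch of how one might test it. The payoff numbers are the standard PD values, and the “TDT-like” strategy (cooperate only with agents verifiably running the same code, reduced here to a class check) is my own crude stand-in for illustration, not TDT proper; it is exactly that “same code” idealization that I doubt carries over to humans.

```python
import random

# Row player's payoffs in a one-shot PD; standard values, assumed for illustration.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

class Defector:
    name = "Defector"
    def move(self, opponent):
        return "D"

class TDTLike:
    """Crude stand-in for TDT: cooperate only when the opponent verifiably
    runs the same decision procedure (reduced here to an isinstance check)."""
    name = "TDT-like"
    def move(self, opponent):
        return "C" if isinstance(opponent, TDTLike) else "D"

def average_payoffs(population, rounds=2000):
    """Average payoff per strategy over random pairings in one generation."""
    totals = {agent.name: 0.0 for agent in population}
    counts = {agent.name: 0 for agent in population}
    for _ in range(rounds):
        a, b = random.sample(population, 2)
        ma, mb = a.move(b), b.move(a)
        totals[a.name] += PAYOFF[(ma, mb)]; counts[a.name] += 1
        totals[b.name] += PAYOFF[(mb, ma)]; counts[b.name] += 1
    return {name: totals[name] / max(counts[name], 1) for name in totals}

random.seed(0)
population = [TDTLike() for _ in range(50)] + [Defector() for _ in range(50)]
print(average_payoffs(population))  # TDT-like outscores Defector in this toy
```

In this toy the TDT-like agents are stable precisely because the class check is perfect; the question is whether any real approximation of “same algorithm” gives you that.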
It’s true that people just “make decisions”, and don’t have access to their source code. They can, however, lift decisions to the conscious level and decide to follow an algorithm.
Do you have any problem with economists convincing people that it’s in their best interest to figure out what the Nash equilibrium is and play that? They are arguing for an algorithm, and people can indeed decide to “follow the algorithm” rather than “just picking their gut choice”, can’t they? At least, when they have explicit payoffs available, etc.
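For concreteness, the kind of algorithm the economists are arguing for can be written down directly. Here is a minimal sketch that enumerates the pure-strategy Nash equilibria of a 2x2 game by checking best responses; the payoff numbers are the standard PD values, assumed as the example:

```python
import itertools

# Payoff matrix for a 2x2 game: entry [i][j] is (row payoff, column payoff).
# Standard Prisoner's Dilemma numbers, assumed for illustration.
# Index 0 = Cooperate, 1 = Defect.
PAYOFFS = [[(3, 3), (0, 5)],
           [(5, 0), (1, 1)]]

def pure_nash_equilibria(payoffs):
    """A profile (i, j) is a pure Nash equilibrium when neither player
    can gain by unilaterally switching their own move."""
    equilibria = []
    for i, j in itertools.product(range(2), range(2)):
        row_best = all(payoffs[i][j][0] >= payoffs[k][j][0] for k in range(2))
        col_best = all(payoffs[i][j][1] >= payoffs[i][k][1] for k in range(2))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria

print(pure_nash_equilibria(PAYOFFS))  # [(1, 1)] -> mutual Defect
```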
Really, TDT is just one specific example of the point that if you know your opponent’s decision procedure, you can do better than not knowing[1]. In a one-shot PD against an unknown opponent, the best I can really do is defensively play Defect; this is true even if I know what move he’ll make. On the other hand, if I don’t know his move, but I do know that he will make the same move as I do, whether because he’s copying me or because we’re both “implementing the same algorithm”, then together we can effectively force “Cooperate”.
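To spell out that contrast, a minimal sketch with the standard PD payoffs (assumed for illustration): when the opponent’s move is fixed independently of mine, Defect weakly dominates even with his move known; when his move is guaranteed to mirror mine, my choice selects between (C, C) and (D, D), and Cooperate wins.

```python
# Row player's PD payoffs; standard values, assumed for illustration.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Case 1: opponent's move is independent of mine. Whatever he plays,
# Defect is at least as good -- knowing his move doesn't change my choice.
for his_move in ("C", "D"):
    assert PAYOFF[("D", his_move)] >= PAYOFF[("C", his_move)]

# Case 2: opponent is guaranteed to mirror my move (copy / same algorithm).
# Now my choice selects between (C, C) and (D, D), and Cooperate wins.
assert PAYOFF[("C", "C")] > PAYOFF[("D", "D")]
print("independent opponent -> Defect; mirrored opponent -> Cooperate")
```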
Now, “implementing the same algorithm” is actually a somewhat vague idea. I listed several ways that even computer implementations couldn’t guarantee enough symmetry. People are far worse, of course. A conscious decision to do what the algorithm says will often end with people backing out, because it tells them to do something crazy. We can’t implement TDT, only approximate it. Are the approximations good enough to be useful? I find that idea implausible.
If you’re a human, why would you adopt TDT?
I wouldn’t, because IMHO it won’t ever be useful to me: I can’t trust that the situations are actually symmetric enough. To uploads? Maybe. To AIs who can examine and certify each other’s source code? Quite possibly.
EDITED:
[1]: Okay, it’s a bit more than that: it matters not just when playing clones, but also when you need to know how future versions of you will behave in “Omega” situations. I also don’t expect to need to deal with that, though I am a one-boxer.
What I was trying to get at is not the choice of A or B, but the choice to use TDT.
The same as any other choice: assuming that you are calculating whether you prefer to adopt TDT, and that preferring to adopt TDT results in you adopting TDT, you have the choice to adopt TDT.
If you’re a human, why would you adopt TDT? The reason must be one that is not an answer to “why would you cooperate?”, nor to “why would you tell people you have adopted TDT?”
You might be able to prove that utility functions and decision theories are equivalent;
I certainly don’t believe that. (I’m making the simplifying assumption of consequentialism, rather than some other value system tortured into being represented as a utility function.) The utility function is what assigns utilities to the various possible states of the world (in the widest possible sense); the decision theories differ in how they link the possible choices to the possible states of the world, not in the utilities of those states.
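A toy sketch of that division of labor, with the two “linking rules” below as my own illustrative assumptions rather than anyone’s canonical formulation: the utility function is held fixed, and only the rule connecting my choice to a world state changes.

```python
from typing import Callable, Dict

# A utility function: world states -> utilities. Shared by both rules below.
# States and numbers are assumptions for illustration only.
utility: Dict[str, float] = {"(C, C)": 3, "(C, D)": 0, "(D, C)": 5, "(D, D)": 1}

# Two "decision theories" as different rules linking my choice to an outcome.
def causal_link(my_move: str) -> str:
    # Treats the opponent's move as fixed; assume (pessimistically) he defects.
    return f"({my_move}, D)"

def twin_link(my_move: str) -> str:
    # Treats the opponent as running my algorithm, so his move mirrors mine.
    return f"({my_move}, {my_move})"

def decide(link: Callable[[str], str]) -> str:
    # Pick the move whose linked world state has the highest utility.
    return max(("C", "D"), key=lambda m: utility[link(m)])

print(decide(causal_link))  # 'D' -- same utilities, one linking rule
print(decide(twin_link))    # 'C' -- same utilities, the other linking rule
```

Both rules consult the identical utility table; they disagree only about which world state a given choice leads to, which is the distinction I’m drawing.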
An agent chooses to change decision theories if their preference, calculated according to their current utility function and decision theory, is to change their decision theory, and this results in them changing their decision theory. I’m not sure how far that applies to humans. For them it may be more like realizing that TDT is a closer approximation of how their decision-making process actually functions given the correct input, and that insofar as their decision making was previously approximated by another decision theory, it was distorted by an oversimplified understanding of the world.
But the assumption that you are running a deterministic algorithm violates one of the prerequisite assumptions needed to even have this conversation: That you can choose an algorithm.
At least Eliezer’s interest in this, as I understand it, is partly motivated by his work on artificial intelligence. An AI with write-access to its own code can choose algorithms (even if it is deterministic). I don’t know how much this applies to us (although if you can brainwash or hypnotise yourself, it’s not irrelevant), but it applies to them.