It’s pretty clear that if your opponent in a symmetric game with the same information is running the exact same deterministic algorithm as you with the exact same computational resources, you’ll have to make the same move. This essentially “vanishes” the off-diagonal boxes. The TDT proponents want to take advantage of this to have a “better, more winning” decision theory, one which would give both players the (C, C) payoff rather than (D, D) even in a one-shot prisoner’s dilemma. Grafting this on by itself currently buys little, as there is never this degree of symmetry.
Now, suppose we are playing a symmetric game and are 90% confident that we are playing a clone with the exact same computational resources, and 10% confident that we’re playing someone random (and that if we’re playing a clone, it’s 90% confident that we’re a clone that’s 90% confident that …). What do we do in this case?
I claim that in this case we can still take advantage of this, as long as the probability is high enough relative to the utility losses. We need to do both the calculations for “they act independently” (Assume they choose the Nash equilibrium, if we don’t know anything else about their decision-making) and “they act the same” (diagonal) and merge them. The proper thing, as always, is to choose based on expected utility, which is just the probability-weighted utility of each choice.
For the standard prisoner’s dilemma, where (C, C) = (3, 3), (C, D) = (0, 5), (D, C) = (5, 0), and (D, D) = (1, 1), we can see that no matter what your opponent chooses, it’s better to Defect than Cooperate, so the Nash equilibrium is the pure strategy D. If, on the other hand, your opponent is a computational clone, you choosing to Cooperate gets you the (C, C) box, not the (C, D) box. So, we have 0.9 × 3 + 0.1 × 0 = 2.7 for Cooperate, and 0.9 × 1 + 0.1 × 1 = 1 for Defect. Cooperation is the better choice, and remains so until p drops below 1⁄3.
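For concreteness, a minimal sketch of this calculation (the payoff table is from above; the function and variable names are mine):

```python
# Expected utility of each move when the opponent is a computational
# clone with probability p: a clone mirrors our move (diagonal box);
# a non-clone is assumed to play the Nash move, Defect.

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def expected_utility(my_move, p_clone):
    clone_payoff = PAYOFF[(my_move, my_move)]    # clone copies our move
    stranger_payoff = PAYOFF[(my_move, "D")]     # stranger plays Nash
    return p_clone * clone_payoff + (1 - p_clone) * stranger_payoff

for p in (0.9, 0.5, 1 / 3, 0.2):
    eu_c = expected_utility("C", p)
    eu_d = expected_utility("D", p)
    print(f"p={p:.2f}  EU(C)={eu_c:.2f}  EU(D)={eu_d:.2f}")
# EU(C) = 3p and EU(D) = 1, so Cooperate wins exactly when p > 1/3.
```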
I also claim this is pretty much the limit of what we can do. If the algorithm you share is non-deterministic (with different random number sources), then off-diagonal results are now possible. If the computational resources are different, then we are analyzing to a different recursive cutoff than our opponent, and so may come to different conclusions. If we have the same resources, but the problem is asymmetric, then we can’t simply say “they’ll do the same thing we will”. All the boxes must then be considered.
In some sense we also get to choose the algorithm we use. If the game is close enough to symmetric we can choose to play as if it is, and so long as that decision process is symmetric, we recover the result. If we were computers we could choose to avoid using true random number generators, and only use the same pseudorandom number generators (we do want to keep the ability to act randomly against non-clones).
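A toy illustration of the shared-PRNG point, assuming Python’s random module stands in for the generator (the seeds and names are mine):

```python
import random

# Two "clones" sharing a pseudorandom generator seeded identically stay
# on the diagonal even when playing a mixed strategy; independent
# randomness sources can land off-diagonal.
def mixed_move(rng, p_cooperate=0.5):
    return "C" if rng.random() < p_cooperate else "D"

shared_a, shared_b = random.Random(42), random.Random(42)  # same seed
indep_a, indep_b = random.Random(1), random.Random(2)      # different sources

print(all(mixed_move(shared_a) == mixed_move(shared_b) for _ in range(1000)))  # True
print(all(mixed_move(indep_a) == mixed_move(indep_b) for _ in range(1000)))    # almost surely False
```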
We need to do both the calculations for “they act independently” (Assume they choose the Nash equilibrium, if we don’t know anything else about their decision-making)

Why do we assume that?
In order to figure out our best move, we need to know something about how the other person will move. It’s generally a good idea to assume your opponent is indeed rational and utility maximizing. The Nash equilibrium strategy is the one from which neither side can do better by unilaterally switching, and that stability makes it a good model for what people will do.
In this case, the full machinery isn’t necessary. Defect strictly dominates Cooperate (it does better against every possible opponent move), so barring special considerations like TDT, it’s what anyone with a modicum of sense will do.
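To make the dominance claim concrete, a tiny check (a sketch; the helper is mine, not from the original):

```python
# Move `a` strictly dominates `b` if it pays more against every
# possible opponent move.

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def strictly_dominates(payoffs, a, b, opponent_moves):
    return all(payoffs[(a, o)] > payoffs[(b, o)] for o in opponent_moves)

print(strictly_dominates(PAYOFF, "D", "C", "CD"))  # True: always Defect
```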
Is this a fact? Where can I read about the evidence for this?

I don’t have any direct citations to current social science, no, but I’ll tell a plausible story and give some indirect citations.
Often? Yes. Always or near always? No. It depends crucially on the complexity of the game, the players’ familiarity with it, and their intelligence.
Most people playing a game iteratively update their strategies with each game, learning both which moves of theirs worked better, and what their opponents are likely to do.
If both sides constantly do these updates, they are driven towards a Nash equilibrium. (It might be better to say they are driven away from anything that is not a Nash equilibrium.)
The definition of a Nash equilibrium is a combination of strategies where neither side can do better by unilaterally altering their own move. If one side can do better by altering their own move, and realizes it, they will.
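A toy version of that update process, under the assumption that each player simply best-responds to the opponent’s previous move (the simulation setup is mine):

```python
# Each round, each player switches to their best response against the
# opponent's last move. In the PD this lands on (D, D), the Nash
# equilibrium, and stays there from any starting point.

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def best_response(their_last_move):
    return max("CD", key=lambda m: PAYOFF[(m, their_last_move)])

a, b = "C", "C"
for round_no in range(4):
    a, b = best_response(b), best_response(a)
    print(round_no, a, b)  # (D, D) from the first update onward
```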
Complexity makes it harder for people to explore the space and find the Nash equilibrium. The more familiar you are with the game, the more likely you are to have found that neighborhood and to play near it.
The smarter you are the faster this happens—you can essentially model your opponent to figure out what they’ll do, and respond. But your opponent can model you as well, so you must include that in your model of him, and so forth.
A surprisingly large number of people are essentially “level 0” modelers, who aren’t influenced at all by a model of what their opponent does until they have gained data on it. They may use the “maximin” strategy: pick the choice that maximizes your gain assuming that, for each of your possible choices, your opponent does what helps you the least—brutal pessimism. Similarly there is an optimistic “maximax” strategy: pick the choice that maximizes your gain assuming that, for each of your possible choices, your opponent chooses what is best for you. Or there is maximizing expected value over a flat distribution of the opponent’s choices. Strategies that output a number for each choice can also be combined with some weighting. There are many other possibilities, of course, but if one choice strictly dominates another (is better for every value of the opponent’s choice), they should not pick the strictly dominated one.
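For concreteness, a sketch of those three “level 0” rules applied to the PD payoffs; the helper names are mine:

```python
# Three "level 0" decision rules over a payoff table mapping
# (my_move, their_move) -> my payoff.

PD = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def maximin(payoffs, my_moves, their_moves):
    # Assume the opponent always does what hurts me most.
    return max(my_moves, key=lambda m: min(payoffs[(m, o)] for o in their_moves))

def maximax(payoffs, my_moves, their_moves):
    # Assume the opponent always does what helps me most.
    return max(my_moves, key=lambda m: max(payoffs[(m, o)] for o in their_moves))

def flat_expected_value(payoffs, my_moves, their_moves):
    # Assume the opponent picks uniformly at random.
    return max(my_moves, key=lambda m: sum(payoffs[(m, o)] for o in their_moves) / len(their_moves))

for rule in (maximin, maximax, flat_expected_value):
    print(rule.__name__, rule(PD, "CD", "CD"))  # all three pick D
```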
Another large number are “level 1” modelers—they figure the other guy will do something given by one of these “level 0” models. There are a few “level 2” modelers that model the other guy as “level 1”, and level 3, and so forth. The Nash equilibria are the stable fixed points of this process, so they are what sufficiently “high level” people will do.
(Note that this process may not converge unless it starts at a fixed point—consider rock, paper, scissors. But doing the equivalent of Cesàro summation will make it converge to the unique mixed strategy of picking one of the three at random.)
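A quick illustration of both halves of that note (a sketch, assuming plain best-response cycling):

```python
from collections import Counter

# Naive best-response dynamics on rock-paper-scissors just cycle
# R -> P -> S -> R, but the running (Cesàro) average of play converges
# to the 1/3-each mixed-strategy equilibrium.
BEATS = {"R": "P", "P": "S", "S": "R"}  # best response to each pure move

move, counts = "R", Counter()
for _ in range(3000):
    counts[move] += 1
    move = BEATS[move]  # best-respond to the previous move

total = sum(counts.values())
print({m: round(c / total, 3) for m, c in counts.items()})  # ~1/3 each
```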
In practice, you want to be exactly one level higher than your opponent. It is, of course, possible to model your opponent as a probabilistic mixture of these, though your best response is (in general) not going to be a probabilistic mixture of levels one-higher. And the best response to you modeling like that will not be a simple mixture of levels either.
So, why do I say many are level 0, and level 1? Well, consider the Guess 2⁄3 of the average game. People are restricted to numbers between 0 and 100. The person guessing closest to 2⁄3 of the mean wins (utility 1), and everyone else loses (utility 0); ties are broken randomly. What will a level 0 modeler do? The maximin strategy gives no restriction—you can always lose. The maximax strategy eliminates everything above 66 2⁄3, because that’s the maximum the average can possibly be. The equiprobable expected value strategy puts the mean at 50, and suggests 33 1⁄3 (getting more peaked the more people there are and the more they are modeled as independent). A level 1 modeler realizes that everybody else should know at least this, so will probably guess around 2⁄3 × 33 1⁄3 ≈ 22, perhaps higher, realizing that some level 0s won’t go through even the utility analysis and will pick randomly. A level 2 modeler will be 2⁄3 of this, at roughly 14 or 15. This obviously converges to 0 for a “level infinity” modeler, and this is the Nash equilibrium.
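The ladder is easy to compute (a sketch; the level indexing follows the text, starting from the flat-prior guess):

```python
# Level 0 guesses 2/3 of the flat-prior average of 50; each higher
# level guesses 2/3 of the level below, converging to the Nash
# equilibrium at 0.

guess = 50.0                 # expected average under a flat prior
for level in range(6):
    guess *= 2 / 3           # best-respond to the level below
    print(f"level {level}: guess {guess:.1f}")
# level 0: 33.3, level 1: 22.2, level 2: 14.8, ... -> 0
```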
But of course, very few people pick near 0, so it is not a good idea to pick 0. What is it rational to pick? From the link, a newspaper ran this with a prize, and the winner was at 21.6 (so the average was around 32.4).
http://museumofmoney.org/exhibitions/games/numberpop.html references a study with college students, won with a guess of 24 (an average of 36!), and a financial newspaper contest won with 13 (average 19.5). When I’ve seen histograms, they tend to have a spike at 33 1⁄3, indicating that many players are straightforwardly level 0. Curiously, there also tends to be a spike at 66 2⁄3, indicating that a fair number don’t quite understand the game.
In this case, with an unfamiliar game, playing the Nash equilibrium is not optimal, and people don’t do it. Levels 1-3 seem to be what wins in this case, with the majority effectively playing at levels 0-2. But I can guarantee that, played multiple times with the same people, the winning number will go down to 0 quite rapidly. Tautologically, people are more familiar with the games they play more often, and will in practice be effectively higher-level—not because they explicitly model higher levels, but because the “level 0” models are not random, but incorporate how people have played before (rather than how they should play now).
For the specific case of the Prisoner’s Dilemma, all of the “level 0” strategies pick Defect, which is the Nash equilibrium. Even so, http://en.wikipedia.org/wiki/Prisoner’s_dilemma#cite_note-2 claims that 40% cooperate. I would expect that this comes from some innate valuing of fairness, so that the rewards they get are not actually their utilities for those outcomes, but this is not clear.
EDITED: links fixed, and a few clarifications and grammar rewrites.
It’s generally a good idea to assume your opponent is indeed rational and utility maximizing.
Why? I’m not just asking rhetorical questions—your opponent may be non-rational in one of a trillion very realistic, very common ways: maybe he’ll cooperate because of his personal moral/religious views, maybe he’ll defect because he doesn’t want to think of himself as a ‘sucker’. Or maybe he is rational but has made a mistake somewhere along the way of his musings on PD.
If all your formulation says is that you’re playing against “someone random”, at a minimum this means a randomly chosen human, most of which have a terribly flawed rational process. At a maximum it means any randomly-chosen or randomly-generated entity capable of picking an option—you could be playing against Eliza or Paul the Octopus for all you know.
Also, if you are going to assume that he is rational and not mistaken, why assume that he is just rational enough to do the obvious, zero-depth payoff analysis, but not rational enough to have a model any more sophisticated than that? (indeed, why should TDT be discarded as a “special consideration”?)
Why? … your opponent may be non-rational in one of a trillion very realistic, very common ways …
This is a fair point. For games people are not familiar with, have not played to death, this is absolutely true. For the games people have played a lot of (or that their genes and memes evolved under, contributing towards their moves), anything but Nash-equilibrium play must get outcompeted. Playing poker against a novice, there should be options much better than Nash. Against a pro? Not so much. Against a hustler (who isn’t actually cheating)? Well, they’re optimized to take advantage of novices by leading them into bigger bets, so you can probably take advantage of this by not scaring them off by playing too well too soon.
maybe he’ll cooperate because of his personal moral/religious views, maybe he’ll defect because he doesn’t want to think of himself as a ‘sucker’.
These are, of course, part of the utility function—and if you don’t know the utility function, modeling is a bitch.
most of which have a terribly flawed rational process.
Most people do not play by reasoning it out to any great depth at a conscious level. They play by gut instinct, set by genes (and memes) and sharpened by experience. For games where the genes are relevant, this is going to push towards Nash. Experience is also going to push towards Nash.
It’s pretty clear that if your opponent in a symmetric game with the same information is running the exact same deterministic algorithm as you with the exact same computational resources, you’ll have to make the same move.
I think we’re returning to an argument I’ve had before, on free will. Yes, if you are running a deterministic algorithm, this is true. But the assumption that you are running a deterministic algorithm violates one of the prerequisite assumptions needed to even have this conversation: That you can choose an algorithm.
The way you’re describing TDT, it seems to assume that an agent doesn’t actually have a choice at the time of choosing an action.
An agent can make a precommitment, and that may be a good strategy. But only as dictated by game theory. Assuming that the agent does not make a choice during the game is not even game theory.
You are under the annoyingly common misconception that choice between A and B means that whether A or B is chosen is indeterminate (i.e. either incoherent free-will magic or identifying yourself more strongly with your randomness source than with your fixed qualities). What it actually means is that whether A or B is chosen depends on your preferences and your decision-making process.
All right; I stated that incorrectly. We can regard the actions of deterministic agents as “choices”; we do it all the time in game theory. What I was trying to get at is not the choice of A or B, but the choice to use TDT.
It sounds to me like TDT is not addressing the question, “How do I convince an agent to adopt TDT?” It assumes that you’re designing agents, and you design them to implement TDT, so that they can get better results on PD when there are many of them.
But then it’s not fundamentally different from designing agents with a utility function that includes benefits to other agents. In other words, it’s smuggling in an ethical judgement, disguising it as a mathematical truth.
If you’re a human, why would you adopt TDT? The reason must be one that is not an answer to “why would you cooperate?”, nor to “why would you tell people you have adopted TDT?”
You might be able to prove that utility functions and decision theories are equivalent; meaning that any change to a utility function could alternatively be implemented as a change to the decision theory, and vice-versa. Thus, asking someone to change a decision theory may be nonsensical.
So, maybe the answer I’m looking for is that TDT is not a thing meant to be offered to humans; it’s a more-parsimonious way than ethics of arriving at cooperation on PD. That would be acceptable.
But to really be useful, TDT must be an evolutionarily-stable decision theory. I’m skeptical of that. The “ethics” approach lets evolution pick and choose only those values that help the agent. The TDT approach is not so flexible; it may force the agent to cooperate in situations where it is not beneficial to the agent.
It’s true that people just “make decisions”, and don’t have access to their source code. They can, however, lift decisions to the conscious level and decide to follow an algorithm.
Do you have any problem with economists convincing people that it’s in their best interest to figure out what the Nash equilibrium is and play that? They are arguing for an algorithm, and people can indeed decide to “follow the algorithm” rather than “just picking their gut choice”, can’t they? At least, when they have explicit payoffs available, etc.
Really, TDT is just one specific example of the fact that if you know your opponent’s decision procedure, you can do better than not knowing[1]. In a one-shot PD against an unknown opponent, the best I can really do is defensively play Defect. This is true even if I know what move he’ll make. On the other hand, if I don’t know his move, but know that he will make the same move as I do, whether because he’s copying me or because we’re both “implementing the same algorithm”, we can together effectively force “cooperate”.
Now, “implementing the same algorithm” is actually a somewhat vague idea. I listed several ways that even computer implementations couldn’t guarantee enough symmetry. People are far worse, of course. A conscious decision to do what the algorithm says will often have people backing out because it’s saying to do something crazy. We can’t implement TDT, only approximate it. Are the approximations good enough to be useful? I find that idea implausible.
If you’re a human, why would you adopt TDT?
I wouldn’t, because IMHO it won’t ever be useful to me. I can’t trust that the situations are actually symmetric enough. To uploads? Maybe. To AIs who can examine and certify each other’s source code? Quite possibly.
EDITED: [1]: Okay, it’s a bit more than that: it matters not just when playing clones, but also when you need to know how future versions of you will behave in “Omega” situations. I also don’t expect to need to deal with that, though I am a one-boxer.
What I was trying to get at is not the choice of A or B, but the choice to use TDT.
The same as any other choice: assuming that you calculate whether you prefer to adopt TDT, and that preferring to adopt TDT results in you adopting TDT, you have the choice to adopt TDT.
If you’re a human, why would you adopt TDT? The reason must be one that is not an answer to “why would you cooperate?”, nor to “why would you tell people you have adopted TDT?”
You might be able to prove that utility functions and decision theories are equivalent;
I certainly don’t believe that. (I’m making the simplifying assumption of consequentialism, rather than some other value system tortured into being represented as a utility function.) The utility function is what assigns utilities to the various possible states of the world (in the widest possible sense); the decision theories differ in how they link the possible choices to the possible states of the world, not in the utilities of those states.
An agent chooses to change decision theories if their preference, calculated according to their current utility function and decision theory, is to change their decision theory, and this results in them changing their decision theory. I’m not sure how far that applies to humans. For them it may be more like realizing that TDT is a closer approximation of how their decision-making process actually functions given the correct input, and that insofar as their decision making was previously approximated by another decision theory, it was distorted by an oversimplified understanding of the world.
But the assumption that you are running a deterministic algorithm violates one of the prerequisite assumptions needed to even have this conversation: That you can choose an algorithm.
At least Eliezer’s interest in this, as I understand it, is partly motivated by his work on artificial intelligence. An AI with write-access to its own code can choose algorithms (even if it is deterministic). I don’t know how much this applies to us (although if you can brainwash or hypnotise yourself, it’s not irrelevant), but it applies to them.