As explained in the consequentialism post, we’ve handicapped TDT by giving our agents shortsighted utility functions.
This is a perfect illustration of the ‘consequentialism isn’t nearsighted’ moral, but shortsightedness just isn’t a complete answer here. Sure, telling the agents to MAX(number of descendants in the long term) is sufficient to give them a combination of input->behaviour pairs that will make them win the meta-game, but shortsightedness isn’t their only mistake, and giving it exclusive emphasis distracts somewhat from the general problem.
From the perspective of the meta-game, the utility function given to the TDTs is not just shortsighted, it is also naive. That is, when we come to the “Problem:”, we are not really looking at the absolute number of descendants the TDTs had; we are looking at the ratio TDTs : !TDTs. Given two different outcomes, one in which the agents dominated the population and produced X offspring, and another in which they produced X+1 offspring but ended up a minority, the reasoning we have done here (Exercises 2 to 4 and the Problem) would call the X+1 outcome the ‘loser’, even though it produced more descendants, and even though that may well be the best the agents could possibly do (according to MAX(descendants)) in certain instances of these ‘chaotic’ situations.
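A tiny numeric sketch of that reversal (the outcome labels and numbers here are invented purely for illustration): scoring the same two outcomes by raw descendant count and by share of the next generation can rank them oppositely.

```python
# Two hypothetical outcomes for "our" agents:
#   A: 10 of our descendants in a next generation of 16 (we dominate)
#   B: 11 of our descendants in a next generation of 30 (we are a minority)
outcomes = {
    "A": {"ours": 10, "total": 16},
    "B": {"ours": 11, "total": 30},
}

# Naive scoring: MAX(descendants) — raw count of our offspring.
by_count = max(outcomes, key=lambda k: outcomes[k]["ours"])

# Meta-game scoring: proportion of the next generation that is ours.
by_share = max(outcomes, key=lambda k: outcomes[k]["ours"] / outcomes[k]["total"])

print(by_count)  # B wins on raw descendants (11 > 10)
print(by_share)  # A wins the meta-game (10/16 > 11/30)
```

The two criteria disagree on exactly the case described above: B has more offspring in absolute terms, yet A is the meta-game winner.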
The non-naive utility function is simply “maximise the proportion of copies of yourself in the next generation”.
It so happens that in the specific meta-game we’re considering, we only have to give the TDTs a utility function that is either not shortsighted or not naive. Both will happen to win this specific overall meta-game, because they prescribe the same actions. But there are simple variants of the game that require that naivety and shortsightedness both be eliminated, neither hack being sufficient alone. We should focus on the underlying problem: Lost Purpose, that is, any difference between the utility function given to the agent and what it actually means to ‘win’.