Re: “First, foremost, fundamentally, above all else: Rational agents should WIN.”
In an attempt to summarise the objections, there seem to be two fairly fundamental problems:

1. Rational agents try; they cannot necessarily win. Winning is an outcome, not an action.
2. “Winning” is a poor synonym for “increasing utility”: sometimes agents should minimise their losses.

“Rationalists maximise expected utility” would be a less controversial formulation.
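For concreteness, “maximise expected utility” cashes out as: pick the action whose probability-weighted payoff is highest. A minimal sketch in Python (the decision problem and all of its numbers are invented purely for illustration):

```python
# Minimal expected-utility maximisation over a toy decision problem.
# Actions map to {outcome: probability}; utilities map outcomes to payoffs.

def expected_utility(action, outcome_probs, utility):
    return sum(p * utility[o] for o, p in outcome_probs[action].items())

def best_action(outcome_probs, utility):
    # Choose the action with the highest probability-weighted payoff.
    return max(outcome_probs, key=lambda a: expected_utility(a, outcome_probs, utility))

outcome_probs = {
    "take_umbrella": {"dry": 1.0},
    "leave_umbrella": {"dry": 0.7, "soaked": 0.3},
}
utility = {"dry": 10, "soaked": -20}

# EU(take) = 10; EU(leave) = 0.7*10 + 0.3*(-20) = 1
print(best_action(outcome_probs, utility))  # take_umbrella
```

The controversies in this thread are about which decision theory supplies the probabilities and counterfactuals, not about this arithmetic.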
I agree with your two problems, but the problem with your alternative, and so many others presented here, is that it doesn’t speak so strongly to the distinction which EY means to draw: between wanting to be seen to have followed the forms for maximising expected utility, and actually seeking to maximise expected utility.
Also, of course, one who at each moment makes the decision that maximises expected future utility defects against Clippy in both Prisoner’s Dilemma and Parfit’s Hitchhiker scenarios, and arguably two-boxes against Omega, and by EY’s definition that counts as “not winning” because of the negative consequences of Clippy/Omega knowing that that’s what we do.
Re: “it doesn’t so strongly speak to the distinction which EY means to draw”
I wasn’t trying to do that. It seems like a non-trivial concept. Is it important to try and capture that distinction in a slogan?
Re: “one who at each moment makes the decision that maximises expected future utility defects”
Expected utility maximising agents don’t have commitment mechanisms, and can’t be trusted to make promises? I am sceptical. In my view, you can express practically any agent as an expected utility maximiser. It seems easy enough to imagine commitment mechanisms. I don’t see where the problem is.
In the Least Convenient Possible World, I imagine nobody has a commitment mechanism in the Prisoner’s Dilemma.
You can’t claim commitment mechanisms are not possible when in fact they evidently are. “Always cooperate” is an example of a strategy which is committed to cooperate in the prisoner’s dilemma.
“Commitment mechanism” typically means some way to impose a cost on a party for breaking the commitment; otherwise it is, in the game theorist’s parlance, “cheap talk”. In the one-shot PD there is, by definition, no commitment mechanism, and it is in this LCPW that Eliezer’s decision theories are frequently tested.
You’re talking about the repeated PD with “always cooperate,” rather than the one-shot version, which was the scenario in which we found ourselves with Clippy. Please understand—I’m not saying EU-maxing agents do not have commitment mechanisms in general, just that the PD was formulated expressly to show the breakdown of cooperation under certain circumstances.
Regardless, always-cooperate definitely does not maximize expected utility in the vast majority of environments; indeed, it is not part of any stable equilibrium in a finitely repeated PD. But more to the point, AC is “committed” only in the sense that, given no further opportunities to make decisions, it will appear to produce committed behavior. It is unstable precisely because it presupposes no further decision points, whereas the RPD in which it is played has one every round.
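To illustrate the instability claim: a quick simulation (Python; the standard PD payoffs 5/3/1/0 are my choice, not from this thread) showing that in a finitely repeated PD, unconditional defection strictly exploits always-cooperate.

```python
# Finitely repeated Prisoner's Dilemma: always-cooperate (AC) is exploitable.
# Row player's payoffs: mutual C -> 3, mutual D -> 1, D vs C -> 5, C vs D -> 0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated PD; each strategy sees the opponent's past moves."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

always_cooperate = lambda opp_history: "C"
always_defect = lambda opp_history: "D"

print(play(always_cooperate, always_cooperate))  # (30, 30)
print(play(always_defect, always_cooperate))     # (50, 0)
```

A population of AC players does fine against itself, but a single defector strictly outscores it, which is why AC cannot anchor a stable equilibrium.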
You and I are using different definitions of “commitment mechanism”, then.
The idea I am talking about is demonstrating to the other party that you are a nice, cooperative agent. For example by showing the other agent your source code. That concept has nothing to do with crime and punishment.
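A minimal sketch of that “show your source” commitment (Python; representing an agent’s “source” as a plain string is my simplification): an agent that cooperates exactly with verbatim copies of itself.

```python
# "Commitment by transparency": each agent's (string) source is visible to
# its opponent, and the agent cooperates iff the opponent is an exact copy.
CLIQUE_BOT = "cooperate iff opponent_source == my_source"

def move(my_source, opponent_source):
    # Cooperate with verbatim copies of myself; defect against everyone else.
    return "C" if opponent_source == my_source else "D"

print(move(CLIQUE_BOT, CLIQUE_BOT))       # C: mutual inspection yields cooperation
print(move(CLIQUE_BOT, "defect always"))  # D
```

No punishment is involved: the opponent can simply verify, before choosing, what this agent will do.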
The type of commitment mechanism I am talking about is one that convincingly demonstrates that you are committed to a particular course of action under some specified circumstances. That includes commitment via threat of retribution—but also includes some other things.
AC’s stability is tangential to my point. If you want to complain that AC is unstable, perhaps consider TFT instead. That is exactly the same as AC on the first round.
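The comparison is easy to check mechanically. A sketch in Python of standard tit-for-tat next to always-cooperate: identical on the first round, divergent after a defection.

```python
# Tit-for-tat: cooperate on the first round, then echo the opponent's last move.
def tit_for_tat(opp_history):
    return "C" if not opp_history else opp_history[-1]

def always_cooperate(opp_history):
    return "C"

# Round one: no history yet, so the two strategies behave identically.
assert tit_for_tat([]) == always_cooperate([]) == "C"

# After a defection they diverge: TFT retaliates, AC does not.
print(tit_for_tat(["D"]))        # D
print(always_cooperate(["D"]))   # C
```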
Re: “it doesn’t so strongly speak to the distinction which EY means to draw”
I wasn’t trying to do that. It seems like a non-trivial concept. Is it important to try to capture that idea in a slogan?
Re: “one who at each moment makes the decision that maximises expected future utility defects”
Expected utility maximising agents don’t have commitment mechanisms, and can’t be trusted to make promises? I am sceptical. In my view, you can express practically any agent as an expected utility maximiser. It seems easy enough to imagine commitment mechanisms. I don’t see where the problem is.
I think I’m misunderstanding you here because this looks like a contradiction. Why does making the decision that maximizes expected utility necessarily have negative consequences? It sounds like you’re working under a decision theory that involves preference reversals.
I’m talking about the difference between CDT, which stiffs the lift-giver in Parfit’s Hitchhiker and so never gets a lift, and other decision theories.
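That difference can be made concrete with a toy model (Python; the payoff numbers are mine, purely illustrative): an accurate driver gives the lift only to an agent that would actually pay once in town, so the agent that can bind itself to pay ends up strictly ahead.

```python
# Parfit's Hitchhiker, toy version: the driver accurately predicts whether
# the hitchhiker will pay after being rescued, and gives the lift only then.
# Illustrative numbers: rescue is worth 1000, paying costs 100, dying scores 0.

def outcome(would_pay_in_town):
    """Accurate predictor: the lift is given iff the agent would pay."""
    if would_pay_in_town:
        return 1000 - 100   # rescued, then pays
    return 0                # left in the desert

# CDT, deliberating *in town*, treats the lift as a sunk fact: paying only
# subtracts 100, so it would refuse -- and the driver foresees that refusal.
cdt_would_pay = False
committed_would_pay = True

print(outcome(cdt_would_pay))        # 0
print(outcome(committed_would_pay))  # 900
```

The agent that can credibly commit to paying nets 900; the CDT agent, reasoning impeccably at each moment, never gets the lift at all.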
Oh, I see. I thought you were saying an optimal decision theory stiffed the lift-giver.
I hope I’ve become clearer in the four years since I wrote that!
. . . did not notice the date-stamp. Good thing thread necros are allowed here.
But, alas, less catchy.
As contagious memes go, “rationalists should win” seems to be rather pathogenic to me. A proposed rationalist slogan shouldn’t need so many footnotes. For the sake of minds everywhere, I think it would be best to try to kill it off in its early stages.
I much prefer “rationalists should win” because it’s simple, accessible language. Makes this article more powerful than it would otherwise be. Everyone gets winning; how many people find terms like expected utility maximisation meaningful on a gut level?
Charlie Sheen rationality.