Maximizing expected utility can paradoxically be shown to minimize actual utility, however. Consider a game in which you place an initial bet of $1 on a 6-sided die coming up anything but 1 (i.e., 2-6); the bet pays even money if you win and costs you your stake if you lose. The twist, however, is that upon winning (so you now have $2 in front of you) you must either bet the entire sum, your stake plus its winnings, or leave the game permanently. Theoretically, since the odds are in your favor, you should always keep going. Always. But this means you will eventually lose it all. Even if you say "just one more and I'll stop", it will be mathematically optimal to keep repeating this reasoning. This "optimal" strategy does worse than any arbitrary stopping rule.
You aren’t analyzing this game correctly. At the beginning of the game, you’re deciding between possible strategies for playing the game, and you should be evaluating the expected value of each of these strategies.
The strategy where you keep going until you lose has expected value −1. There is also a sequence of strategies, indexed by a positive integer n, where you quit at the latest after the nth bet; their expected final wealths are (5/3)^n, a geometric progression. Consequently, there isn't an optimal strategy for this game: there are infinitely many strategies and their expected values get arbitrarily high.
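These expected values are easy to check numerically. The sketch below (my own illustration, in exact rational arithmetic) computes the expected net winnings of the "quit at the latest after the nth bet" strategy: you survive n rounds with probability (5/6)^n and walk away with $2^n, and otherwise you lose your $1 stake.

```python
from fractions import Fraction

def expected_net(n):
    """Expected net winnings of 'quit at the latest after the nth bet':
    with probability (5/6)^n all n rolls come up 2-6 and you leave
    with 2^n dollars; otherwise you lose your $1 stake."""
    p_survive = Fraction(5, 6) ** n
    return p_survive * (2 ** n) - 1   # = (5/3)^n - 1, unbounded in n

print([str(expected_net(n)) for n in (1, 2, 3)])  # ['2/3', '16/9', '98/27']
```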
In addition, the sequence of strategies I described tends to the first strategy in the limit as n tends to infinity, in some sense, but their expected values don’t respect this limit, which is what leads to the apparent paradox that you noted. In more mathematical language, what you’re seeing here is a failure of the ability to exchange limits and integrals (where the integrals are expected values). Less mathematically, you can’t evaluate the expected value of a sequence of infinitely many decisions by adding up the expected value of each individual decision. In practice, you will never be able to make infinitely many decisions, so this doesn’t really matter.
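In symbols (writing W_n for the bettor's final wealth under the quit-after-n strategy, notation mine): almost every infinite sequence of rolls eventually contains a 1, so the W_n converge almost surely to 0, yet their expected values diverge:

```latex
W_n \xrightarrow{\text{a.s.}} 0,
\qquad
\lim_{n\to\infty} \mathbb{E}[W_n]
  = \lim_{n\to\infty} \left(\tfrac{5}{3}\right)^n
  = \infty
  \;\neq\; 0
  = \mathbb{E}\Bigl[\lim_{n\to\infty} W_n\Bigr].
```

This is exactly the illegal exchange of limit and expectation described above.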
This issue is closely related to the puzzle where the Devil gives you money and takes it away infinitely many times. I don’t remember what it’s called.
Indeed they don't, but the point is that while stopping at N+1 always dominates stopping at N, this reasoning leads one to keep continuing and lose everything. As such, the only winning move is to do exactly the opposite: decide in advance on some arbitrary stopping point (or decide nondeterministically, say by coin flip). Attempting to maximize expected utility is the only strategy that won't work. This game, the prisoner's dilemma, and Newcomblike problems are all cases where choosing in a way that does better than the alternative in every individual case can still do worse overall.
The point isn’t that the strategy that is supposed to maximize expected utility is a bad idea. The point is that you’re computing its expected utility incorrectly because you’re switching a limit and an integral that you can’t switch. This is a completely different issue from the prisoner’s dilemma; it is entirely an issue of infinities and has nothing to do with the practical issue of being a decision-maker with bounded resources making finitely many decisions.
It isn't a matter of switching a limit and an integral, or anything involving infinity, really. You could just consider the single round you're currently on: your options are to continue or stop. To come out of the game with any money, one must at some point say "forget maximizing expected utility, I'm not risking losing what I've acquired." By stopping, you lose expected utility compared to continuing exactly one more time. My point is that it is not always the case that "you must maximize expected utility"; in some cases it may be wrong or impossible to do so.
All you’ve shown is that maximizing expected utility infinitely many times does not maximize the expected utility you get at the end of the infinitely many decisions you’ve made. This is entirely a matter of switching a limit and an integral, and it is irrelevant to practical decision-making.
1 This argument only works if the bet is denominated in utils rather than in dollars. Otherwise, someone who gets diminishing marginal utility from dollars for very large sums—that would include most people—will eventually decide to stop. (If I have utility = log(dollars) and initial assets of $1M then I will stop after 25 wins, if I did the calculations right.)
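That stopping point can be checked with a short calculation. The sketch below (my own illustration, not from the comment) compares, for a log-utility bettor with $1M in assets who has just won n times in a row, the utility of stopping against the expected utility of one more all-in roll:

```python
import math

W = 1_000_000       # initial assets in dollars
aside = W - 1       # assets not at stake after placing the $1 bet

def prefers_to_stop(n):
    """After n wins the pot is 2**n. With utility = log(dollars),
    is pocketing the pot better than betting it all one more time?"""
    pot = 2 ** n
    stop = math.log(aside + pot)
    cont = (5/6) * math.log(aside + 2 * pot) + (1/6) * math.log(aside)
    return stop >= cont

first_stop = next(n for n in range(1, 100) if prefers_to_stop(n))
print(first_stop)  # → 25
```

So the figure of 25 wins checks out: at n = 24 one more roll still raises expected log wealth, and at n = 25 it no longer does.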
1a It is not at all clear that a bet denominated in utils is even actually possible. Especially not one which, with high probability, ends up involving an astronomically large quantity of utility.
2 Even someone who doesn’t generally get diminishing marginal utility from dollars—say, an altruist who will use all those dollars for saving other people’s lives, and who cares equally about all—will find marginal utility decreasing for large enough sums, because (a) eventually the cheap problems are solved and saving the next life starts costing more, and (b) if you give me 10^15 dollars and I try to spend it all (on myself or others) then the resulting inflation will make them worth less.
3 Given that “you will eventually lose it all”, a strategy of continuing to bet does not in fact maximize expected utility.
4 The expected utility from a given choice at a given stage in the game depends on what you’d then do with the remainder of the game. For instance, if I know that my future strategy after winning this roll is going to be “keep betting for ever” then I know that my expected utility if I keep playing is zero, so I’ll choose not to do that.
5 So at most what we have (even if we assume we’ve dealt somehow with issues of diminishing marginal utility etc.) is a game where there’s an infinite “increasing” sequence of strategies but no limiting strategy that’s better than all of them. But that’s no surprise. Here’s another game with the same property: You name a positive integer N and Omega gives you $N. For any fixed N, it is best not to choose N because larger numbers are better. “Therefore” you can’t name any particular number, so you refuse to play and get nothing. If you don’t find this paradoxical—and I confess that I don’t—then I don’t think you need find the die-rolling game any worse. (Choosing N in this game <--> deciding to play for N turns in the die-rolling game.)
[EDITED to stop the LW software turning my numbered points into differently numbered and weirdly formatted points.]
[EDITED again to acknowledge that after writing all that I read on and found that others had already said more or less the same things as me. D’oh. Anyway, since apparently Qiaochu_Yuan wasn’t successful in convincing srn247, perhaps my slightly different presentation will be of some help.]
Is it just me or is this essentially the same as the Lifespan Dilemma?
At the very least, in both cases, you find that you get high expected utilities by choosing very low probabilities of getting anything at all.
If your preferences can always be modelled with a utility function, does that mean that no matter how you make decisions, there’s some adaptation of this paradox that will lead you to accept a near certainty of death?
It is essentially the same, and it does show that trying to maximize expected utility can lead to such negative outcomes. Unfortunately, there doesn't seem to be a simple alternative to maximizing expected utility that doesn't leave you open to being money-pumped. The Kelly criterion is an excellent example of a decision-making strategy that doesn't maximize expected value yet outperforms strategies that do, so at least it's known that it can be done.
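For what it's worth, the contrast is easy to demonstrate in a variant of the die game where you may bet any fraction of your wealth each round (the original game forces all-in, which is exactly the pathological fraction). A quick simulation, my own sketch:

```python
import random

def median_final_wealth(fraction, rounds=500, trials=1000, seed=0):
    """Median final wealth after repeatedly betting `fraction` of current
    wealth on an even-money bet that wins with probability 5/6."""
    rng = random.Random(seed)
    finals = []
    for _ in range(trials):
        w = 1.0
        for _ in range(rounds):
            stake = w * fraction
            w += stake if rng.random() < 5/6 else -stake
        finals.append(w)
    finals.sort()
    return finals[len(finals) // 2]

kelly = 2 * (5/6) - 1                  # Kelly fraction for even money: 2p - 1 = 2/3
ruin = median_final_wealth(1.0)        # all-in each round: almost surely ruined
growth = median_final_wealth(kelly)    # Kelly fraction: huge typical wealth
```

Betting everything each round maximizes the expected value of each individual bet yet ends at zero in essentially every run, while the Kelly fraction maximizes the long-run growth rate instead and wins in practice.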
This is the St. Petersburg paradox, discussed here from time to time.
It isn't really very much like the St. Petersburg paradox. The St. Petersburg game runs for a random length of time and you don't choose whether to continue; the only choice you make is at the beginning of the game, when you decide how much to pay.
Or is it equivalent in some subtle way?