Why bet Kelly?

Suppose you know what outcomes are better than what other outcomes, but not by how much. Given two possible actions you could take, $A$ and $B$, if you know what probability distributions over outcomes each of them results in, this doesn’t necessarily help you pick which is better. Being able to compare outcomes isn’t enough for you to be able to compare probability distributions over outcomes.

But what if I told you that the worst outcome that could possibly occur if you pick $A$ is better than the best outcome that could possibly occur if you pick $B$? Now the choice is very easy. There are no trade-offs, and you don’t need to do a risk analysis. $A$ is just better.

Now what if it’s not quite true that every possible outcome of $A$ is better than every possible outcome of $B$, but it’s almost true, in the sense that there’s some threshold such that the probability that $A$ is better than that threshold is almost 1, and the probability that $B$ is better than that threshold is almost 0? For instance, let’s say the 0.01th-percentile outcome if you pick $A$ is better than the 99.99th-percentile outcome if you pick $B$. Then it’s not quite as clear-cut. Maybe most of the time $A$ is only slightly better than $B$, but the tiny possibility of $A$ being worse than usual involves it being enormously worse than usual, or the tiny possibility of $B$ being better than usual involves it being enormously better than usual. But this is still a pretty big hint that $A$ is better in expectation. $A$ is better unless your expected utility calculations are radically altered by negligible-probability tail events.
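To make the percentile condition concrete, here is a minimal sketch that checks it on samples. The names `quantile` and `almost_dominates`, the nearest-rank quantile rule, and the two Gaussian sample sets are all assumptions made for illustration. Note that the check says nothing about how extreme the tails are, which is exactly the caveat above.

```python
import random

def quantile(sorted_xs, q):
    """Empirical q-quantile (0 <= q <= 1) of an already sorted list, nearest-rank style."""
    idx = min(len(sorted_xs) - 1, max(0, int(q * len(sorted_xs))))
    return sorted_xs[idx]

def almost_dominates(samples_a, samples_b, tail=1e-4):
    """True if the `tail` quantile of A beats the `1 - tail` quantile of B.

    With tail=1e-4 this is the "0.01th percentile of A vs 99.99th percentile of B" test.
    It only compares thresholds; it does not bound what happens inside the tails.
    """
    a = sorted(samples_a)
    b = sorted(samples_b)
    return quantile(a, tail) > quantile(b, 1.0 - tail)

# Illustrative data (an assumption): outcomes of A cluster around 10, outcomes of B around 0.
rng = random.Random(0)
samples_a = [rng.gauss(10.0, 1.0) for _ in range(20_000)]
samples_b = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

a_beats_b = almost_dominates(samples_a, samples_b)
b_beats_a = almost_dominates(samples_b, samples_a)
```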

When to act like your utility is linear

Let’s suppose from now on that the outcomes you’re comparing are how much money you have. Of course, how rich you are is far from a perfect metric for measuring the extent to which you’re achieving your goals, but that’s okay; money is still a thing that people generally want more of, and we can assume that everything important that’s not downstream of wealth is being held constant.

Suppose you’re given a long sequence of opportunities to make bets. On each step, you can choose among some set of options for what bet to place, and there will be a maximum stake you’ll be able to bet each time, because whoever or whatever you’re betting against will only accept small bets. Assume that only the stakes that will be accepted, and not your own financial resources, constrain what bets you can take (perhaps you have unlimited access to credit, or perhaps the stakes are low enough relative to your starting wealth that you are vanishingly unlikely to go broke), and that there is no tendency for the stakes that will be accepted to change over time.

In this case, the optimal strategy is to pick whatever option maximizes the expected value of your wealth on each step. This is because of the central limit theorem: After a large number of steps, the total wealth gained will, with very high probability, be very close to the number of steps times the expected profit per step, so in the long run, whatever strategy maximizes expected profit per step wins. Specifically, if the optimal strategy gets you expected profit $\mu$ per step, and you instead opt for a strategy that gets you expected profit $\mu' < \mu$ per step, then, after $n$ steps, you lose out on $(\mu - \mu')n$ profit on average, with a standard deviation proportional to $\sqrt{n}$. Thus the number of standard deviations away from average it takes for your alternative strategy to do better is proportional to $\sqrt{n}$. The probability of this approaches $0$ as $n \to \infty$.
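A rough numerical sketch of this argument, under assumptions not stated in the post: steps are independent and identically distributed, the per-step difference in profit between the two strategies has a finite standard deviation, and the normal approximation from the central limit theorem is taken at face value. The function and parameter names (`prob_alternative_ahead`, `mu_gap`, `sigma_diff`) are hypothetical.

```python
import math

def prob_alternative_ahead(mu_gap, sigma_diff, n):
    """Normal approximation to P(alternative strategy is ahead after n steps).

    mu_gap:     expected profit per step of the optimal strategy minus the alternative's (> 0)
    sigma_diff: standard deviation of the per-step profit difference (> 0)
    The cumulative difference has mean n * mu_gap and standard deviation
    sigma_diff * sqrt(n), so the alternative must be z = mu_gap * sqrt(n) / sigma_diff
    standard deviations better than average to come out ahead.
    """
    z = mu_gap * math.sqrt(n) / sigma_diff
    # Standard normal lower tail P(Z < -z), written with erfc for accuracy at large z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Illustrative numbers (assumptions): gap of 0.1 per step, standard deviation 1 per step.
for n in (1, 100, 10_000):
    print(n, prob_alternative_ahead(0.1, 1.0, n))
```

The required number of standard deviations grows like the square root of the number of steps, so the printed probabilities shrink toward zero as the number of steps increases.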

When to act like your utility is logarithmic (the Kelly criterion)

Now suppose you’re given a long sequence of opportunities to make bets, where there is no limit to how much you can bet at each step except that you can’t put at risk more money than you have (you have no access to credit). Now the range of betting opportunities available to you at each step is proportional to your wealth. So your choices for probability distributions over what factor your wealth gets multiplied by remain constant when your wealth changes. That is, your choices for probability distributions over what gets added to your log wealth remain constant when your wealth changes.
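In symbols (the notation here is introduced only for illustration): write $W_0$ for your starting wealth and $X_i$ for the factor your wealth gets multiplied by on step $i$. Then multiplying wealth corresponds to adding log wealth:

$$W_n = W_0 \prod_{i=1}^{n} X_i \quad\Longrightarrow\quad \log W_n = \log W_0 + \sum_{i=1}^{n} \log X_i.$$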

So, on a log scale, we’re in the situation described in the previous section, where we’re adding the results of a large number of gambles, where the gambles available don’t depend on time or your current wealth. Thus making whatever bet maximizes your expected log wealth on each step is optimal when the number of steps is large, for the same reasons as in the previous section.
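As a small worked example, consider one specific kind of bet that the post does not spell out: you stake a fraction `f` of your wealth, win `b` times the stake with probability `p`, and lose the stake otherwise. The sketch below finds the fraction that maximizes expected log wealth per step by brute-force search and compares it with the standard closed-form Kelly fraction for this bet, `p - (1 - p) / b`. All function names are hypothetical.

```python
import math

def expected_log_growth(f, p, b):
    """Expected change in log wealth per step when staking fraction f (0 <= f < 1)."""
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)

def kelly_fraction_closed_form(p, b):
    """Closed-form maximizer for this binary bet; negative means the bet has no edge."""
    return p - (1.0 - p) / b

def kelly_fraction_grid(p, b, steps=10_000):
    """Brute-force maximizer over f in {0, 1/steps, ..., (steps-1)/steps}."""
    best_f, best_g = 0.0, 0.0  # f = 0 (no bet) gives zero growth
    for i in range(steps):
        f = i / steps
        g = expected_log_growth(f, p, b)
        if g > best_g:
            best_f, best_g = f, g
    return best_f

# Illustrative numbers (assumptions): a 60% chance to win an even-money bet.
p, b = 0.6, 1.0
f_closed = kelly_fraction_closed_form(p, b)
f_grid = kelly_fraction_grid(p, b)
print(f_closed, f_grid, expected_log_growth(f_closed, p, b))
```

Staking much more than this fraction makes expected log growth negative even though each bet has positive expected value.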

Some unimportant caveats

In each of the scenarios above, when comparing the optimal strategy to any given other strategy, the optimal strategy gets the better result with probability approaching $1$, but that probability never actually equals $1$ after any finite number of steps. So it is possible for negligible-probability tail events to affect which strategy is actually better in expectation.

For example, if your utility function is linear, and you’re given the opportunity to bet all your money on a gamble of positive expected value, then you should keep doing it over and over again. This seems better than Kelly betting because, although you almost always end up broke, the expected value of going all in every time is enormous because of the negligible-probability tail event where you win every bet. Trying to take this expected value calculation seriously becomes increasingly difficult as the number of bets increases and the probability of winning all of them goes to $0$, because you don’t actually get linear utility from money.
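The arithmetic behind this can be sketched for one assumed case: an even-money bet won with probability `p`, with everything staked `n` times in a row. The function name and the numbers are illustrative only.

```python
def all_in_stats(p, n, start=1.0):
    """Going all in on an even-money bet n times in a row.

    Returns (probability of not being broke, expected final wealth).
    You keep any money only if you win every bet, in which case wealth doubles n times.
    """
    prob_not_broke = p ** n
    wealth_if_all_wins = start * 2.0 ** n
    expected_wealth = prob_not_broke * wealth_if_all_wins  # equals start * (2p)**n
    return prob_not_broke, expected_wealth

# Illustrative numbers (assumptions): 60% win chance, 100 consecutive all-in bets.
prob_not_broke, expected_wealth = all_in_stats(0.6, 100)
print(prob_not_broke, expected_wealth)
```

With these numbers the expected wealth is tens of millions of times the starting stake, while the chance of ending with anything at all is far below one in a billion billion.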

After a moderate number of steps, when the probability of these tail events is merely small rather than negligible, it’s not crazy to be moved by them; there are unlikely events that are nonetheless worth planning around because they are tremendously more important than other much more likely events that we also care about to a nontrivial degree. But as the number of steps becomes very large and the probability of the alternative strategy ending up better becomes vanishingly small, letting negligible-probability tail events jerk you around becomes increasingly ridiculous; no one actually pays Pascal’s mugger. So I think the assumption that negligible-probability tail events don’t have a large effect on expected value calculations is reasonable. This is the case, for instance, if your utility function is bounded, and the derivative of utility with respect to money is non-negligible at likely outcomes of the optimal strategy.

Another issue is that, under the assumptions of either of the previous sections, while for any given alternative strategy, the asymptotically optimal strategy will eventually be better in expectation than the alternative strategy (according to a utility function that doesn’t get jerked around by negligible-probability tail events), it is not the case that there is a sufficient number of steps after which the asymptotically optimal strategy is the best in expectation among all strategies (unless your utility function actually is linear, or logarithmic, respectively, like the asymptotically optimal strategy acts like it is on each step). I don’t think this is important because as the number of steps approaches infinity, the optimal-in-expectation strategy will approach the asymptotically optimal strategy, so if the number of steps is large, you can just follow the asymptotically optimal strategy and not worry too much about the negligible amounts of expected value you lose by not slightly adjusting your strategy.

Some important caveats

The two models presented above have assumptions, and assumptions don’t always hold in every real-world situation. Maybe the number of steps you have to make decisions on just isn’t that large, leaving more room for your actual utility function to be important. Maybe the constraints you face on what stakes you can take bets at don’t match the assumptions in either of the above models. Or maybe you’re a dumb human instead of a perfect Bayesian, and your expected profit calculations are systematically biased, and you won’t update fast enough for this not to cause problems. Because of these sorts of considerations, fractional Kelly betting is often recommended in situations where the Kelly criterion might look like it applies.
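Here is a hedged sketch of why a biased estimate can favor fractional Kelly. It assumes an even-money bet whose true win probability is 0.55 while the bettor believes it is 0.65; these numbers and the function names are made up for illustration. It compares the true expected log growth per step of betting the full Kelly fraction computed from the biased estimate against betting half of it.

```python
import math

def kelly_fraction(p, b=1.0):
    """Kelly fraction for a bet paying b-to-1 with win probability p (0 if there is no edge)."""
    return max(0.0, p - (1.0 - p) / b)

def true_growth_rate(f, p_true, b=1.0):
    """Expected change in log wealth per step under the true win probability."""
    return p_true * math.log(1.0 + f * b) + (1.0 - p_true) * math.log(1.0 - f)

p_true = 0.55       # actual win probability (assumption)
p_estimated = 0.65  # systematically overconfident estimate (assumption)

full_kelly = kelly_fraction(p_estimated)  # fraction staked by someone trusting the estimate
half_kelly = 0.5 * full_kelly             # fractional Kelly on the same estimate
ideal = kelly_fraction(p_true)            # what a perfectly calibrated bettor would stake

growth_full = true_growth_rate(full_kelly, p_true)
growth_half = true_growth_rate(half_kelly, p_true)
growth_ideal = true_growth_rate(ideal, p_true)
print(growth_full, growth_half, growth_ideal)
```

In this example, full Kelly on the overconfident estimate has negative true growth, while half Kelly still grows, though more slowly than the calibrated bet would.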