Your `Situation` type is single-shot (a point some other comments also make): one action and one probability distribution. EE’s claim is specifically about sequential, multiplicative settings, which your type signature doesn’t yet express.
If you extend the formalization to a repeated game with multiplicative compounding, the answer becomes concrete:
```
// Repeated multiplicative gamble:
// e.g. each round, 50% chance of +50%, 50% chance of −40%.
// Agent chooses what fraction of wealth to bet each round.

coolness = (strategy, T_rounds) → median(wealth after T_rounds under strategy)
// or: almost_sure_growth_rate(strategy)
// or: probability(outperform(strategy, alternative, T))

recommendation = kelly_criterion
// bet the fraction that maximizes E[log(wealth_ratio)] each round

ev_maximization = always_bet_everything
// because betting everything maximizes E[wealth_ratio] each round
```
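For this specific gamble the Kelly fraction can be computed directly by maximizing E[log(wealth_ratio)]. A minimal Python sketch (the +50%/−40% payoffs are from the example; the function and variable names are mine):

```python
import math

def log_growth(f, win=0.5, loss=-0.4, p=0.5):
    """Expected log wealth ratio per round when betting fraction f."""
    return p * math.log(1 + f * win) + (1 - p) * math.log(1 + f * loss)

# Grid search over betting fractions in [0, 1].
fractions = [i / 10000 for i in range(10001)]
f_star = max(fractions, key=log_growth)

print(f_star)  # → 0.25
```

Setting the derivative of `log_growth` to zero gives the same answer analytically: f* = 0.25 for these payoffs, so the Kelly bettor stakes a quarter of their bankroll each round.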
And then `coolness(recommendation) > coolness(ev_maximization)` is a mathematical theorem. The expected-value maximizer bets everything every round and almost surely goes broke. The Kelly bettor bets a fraction and almost surely achieves the maximum long-run growth rate. After enough rounds, the Kelly bettor is richer than the EV maximizer with probability approaching 1.
The single metric by which EV maximization “wins” is mean wealth across a hypothetical ensemble of parallel agents, a mean dominated by vanishingly unlikely astronomical outcomes that almost no individual agent will ever experience.
This is EE’s core claim: expected value is the wrong `coolness` function for a single agent in a multiplicative sequential environment, and the right `coolness` function is the time-average growth rate, which the ergodic mapping formalizes and generalizes beyond the multiplicative case.
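For the multiplicative case the time-average growth rate is just the expected log of the per-round wealth ratio, so the two strategies can be compared in two lines (same payoffs as the example above):

```python
import math

# Per-round log growth rate: 0.5*log(win multiplier) + 0.5*log(loss multiplier)
g_allin = 0.5 * math.log(1.5) + 0.5 * math.log(0.6)    # bet everything
g_kelly = 0.5 * math.log(1.125) + 0.5 * math.log(0.9)  # bet 1/4 (Kelly)

print(g_allin)  # negative: wealth shrinks almost surely over time
print(g_kelly)  # positive: wealth grows almost surely over time
```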
(Thanks for your response! I’m pessimistic that this conversational subtree will lead to great insights for either of us, but am jotting down my scattered thoughts in case they’re of interest.)
It seems to me that you can represent a sequential setting as a one-shot `Situation` whose `Action` type is a function from “observations so far” (in your +50%/−40% example, `List<"win"|"lose">`) to “action I take in that sub-situation” (in your example, the fraction of my bankroll I bet on the next flip).
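The transform above can be sketched directly: an action becomes a policy mapping the observation history to a bet fraction, and a deterministic rollout turns (policy, outcome sequence) into final wealth. A minimal Python sketch (all names are mine, chosen for illustration; both example policies happen to ignore their history):

```python
from typing import Callable, List

Outcome = str                              # "win" | "lose"
Policy = Callable[[List[Outcome]], float]  # history -> fraction of bankroll to bet

def rollout(policy: Policy, outcomes: List[Outcome], wealth: float = 1.0) -> float:
    """Final wealth after playing `policy` against a fixed outcome sequence."""
    history: List[Outcome] = []
    for o in outcomes:
        f = policy(history)
        wealth *= (1 + 0.5 * f) if o == "win" else (1 - 0.4 * f)
        history.append(o)
    return wealth

all_in: Policy = lambda history: 1.0
kelly: Policy = lambda history: 0.25

print(rollout(all_in, ["win", "lose"]))  # 1.5 * 0.6 ≈ 0.9
```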
(...maybe you can’t do this transform if you violate dynamic consistency? But violating dynamic consistency seems Actually Crazy in a way that merely violating consequentialism isn’t. Something is deeply broken if you simultaneously think “I’m going to do X if Y happens” and “I know that after Y happens I won’t do X.”)
In your “fair coin, +50%/-40%” example: if I have the opportunity to play this game for 10 rounds, and my life savings are $100k, then I agree “always bet everything” seems like a bad plan, and optimizing for my median net worth at the end seems pretty reasonable.
...but if $100k is just the contents of my wallet, and I have $X in illiquid assets that can’t participate in this game… and $X is much larger than ($100k)×(1.5^10)… then optimizing for my median net worth at the end no longer seems reasonable.
My best guess at your resolution to this is something like “in the second case, 10 rounds isn’t necessarily sequential enough, you need a number of iterations that depends on X”; but me having an extra $100T at home doesn’t affect my preference ordering of outcomes, and it seems to me that a median-utility-based decision theory should be insensitive to the magnitude of [my illiquid savings] vs [my bankroll].
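The tension can be made concrete over the example's 10 rounds: the exact distribution over final bankrolls is binomial, and the median ranks Kelly (f = 1/4) above all-in regardless of any constant illiquid $X added to both, while the mean ranks them the other way. A sketch (exact computation, no simulation; the $100k stake is from the discussion above):

```python
from math import comb

T, stake = 10, 100_000

def distribution(win_mult, lose_mult):
    """Exact (final wealth, probability) pairs over k wins out of T fair flips."""
    return [(stake * win_mult**k * lose_mult**(T - k), comb(T, k) / 2**T)
            for k in range(T + 1)]

def median(dist):
    cum = 0.0
    for wealth, p in sorted(dist):
        cum += p
        if cum >= 0.5:
            return wealth

def mean(dist):
    return sum(w * p for w, p in dist)

all_in = distribution(1.5, 0.6)    # bet everything each round
kelly = distribution(1.125, 0.9)   # bet 1/4 each round

print(median(all_in), median(kelly))  # median prefers Kelly...
print(mean(all_in), mean(kelly))      # ...mean prefers all-in
```

Adding a constant $X to every outcome shifts both medians and both means by exactly $X, so neither criterion's ranking moves, which is the insensitivity being objected to.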
No response required!