Ah, yeah, I agree with your story.
Before the data comes in, the conspiracy theorist may not have a lot of predictions, or may have a lot of wrong predictions.
After the data comes in, though, the conspiracy theorist will have all sorts of stories about why the data fits perfectly with their theory.
My intention in what you quote was to consider the conspiracy theory in its fullness, after it’s been all fleshed out. This is usually the version of conspiracy theories I see.
That second version of the theory will have a very high likelihood (it makes the data look very probable), but a very low prior probability. And when someone finds a conspiracy theory like that convincing, part of what’s going on may be that they confuse likelihood and probability. “It all makes sense! All the details fit!”
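In symbols, the distinction being confused:

$$P(\text{theory}\mid\text{data}) \;\propto\; \underbrace{P(\text{data}\mid\text{theory})}_{\text{likelihood: high}}\;\times\;\underbrace{P(\text{theory})}_{\text{prior: tiny}}$$

“All the details fit” is only ever evidence about the likelihood factor; the fleshed-out conspiracy theory scores well there and terribly on the prior.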
Whereas the original conspiracy theorist is making a very different kind of mistake.
Thanks for writing this!
Just to be pedantic, I wanted to mention: if we take Fractional Kelly as the average-with-market-beliefs thing, it’s actually full Kelly in terms of our final probability estimate, having updated on the market :)
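(Spelling out the algebra, assuming the simple setup where a contract paying 1 costs the market-implied probability q: full Kelly stakes f(p) = (p − q)/(1 − q) of your bankroll, so full Kelly on the belief averaged with the market,

$$f\big(\alpha p + (1-\alpha)q\big) \;=\; \frac{\alpha p + (1-\alpha)q - q}{1-q} \;=\; \alpha\cdot\frac{p-q}{1-q} \;=\; \alpha f(p),$$

is exactly the α-fractional Kelly bet.)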
Concerning your first argument, that uncertainty leads to fractional Kelly—is the idea:
We have a probability estimate ^p, which comes from estimating the true frequency p,
Our uncertainty follows a Beta distribution,
We have to commit to a fractional Kelly strategy based on our ^p and never update that strategy ever again
?
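To make sure I’m asking about the right setup, here’s a minimal simulation sketch of my reading of it (the Beta(7, 3) prior, the even-money odds, and the horizon are all made-up parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def growth_rate(alpha_frac, n_rounds=1000, n_trials=2000):
    """Average per-round log growth when we commit to a fixed
    fractional-Kelly bet sized from the prior-mean estimate p_hat,
    while the true frequency p is drawn once per trial from the
    Beta prior and never re-estimated."""
    a, b = 7, 3                        # made-up Beta(7, 3) prior over p
    p_hat = a / (a + b)                # point estimate: the prior mean, 0.7
    f = alpha_frac * (2 * p_hat - 1)   # Kelly fraction for an even-money bet
    p_true = rng.beta(a, b, size=n_trials)
    wins = rng.random((n_trials, n_rounds)) < p_true[:, None]
    log_growth = np.where(wins, np.log1p(f), np.log1p(-f)).sum(axis=1)
    return log_growth.mean() / n_rounds

for alpha_frac in (0.25, 0.5, 0.75, 1.0):
    print(alpha_frac, growth_rate(alpha_frac))
```

On this reading I’d expect α = 1 to come out best in expected log growth, since per-round expected log growth is linear in p, which is part of why I’m confused (more on the graph below).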
So the graph shows what happens if we take our uncertainty and keep it as-is, never updating on the data, as we continue to bet?
Or is it that we keep updating (and hence reduce our uncertainty), but nonetheless, keep our Kelly fraction fixed (so we don’t converge to full Kelly even as we become increasingly certain)?
Also, I don’t understand the graph. (The third graph in your post.) You say that it shows growth rate vs Kelly fraction, yet it’s labeled “expected utility”. I don’t know what “expected utility” means here, since the expected utility should grow unboundedly as we increase the number of iterations.
Or maybe the graph is of a single step of Kelly investment, showing expected log returns? But then wouldn’t Kelly be optimal, given that Kelly maximizes log-wealth in expectation, and in this scenario the estimate ^p is going to be right on average, when we sample from the prior?
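For reference, the single-step calculation I have in mind, for an even-money bet at stake fraction f:

$$G(f) \;=\; p\log(1+f) + (1-p)\log(1-f), \qquad G'(f) = 0 \;\Rightarrow\; f^* = 2p - 1.$$

And since G(f) is linear in p, taking the expectation over any prior on p just plugs in its mean, so full Kelly at $\hat{p} = \mathbb{E}[p]$ should still maximize expected log growth per step, if I’m thinking about this right.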
Anyway, I’m puzzled about this one. What exactly is the take-away? Let’s say a more-or-less Bayesian person (with uncertainty about their utilities and probabilities) buys the various arguments for Kelly, so they say, “In practice, my utility is more or less logarithmic in cash, at least in so far as it pertains to situations where I have repeated opportunities to invest/bet”.
Then they read your argument about parameter uncertainty.
BAYESIAN: Wait, what? I agree I’ll have parameter uncertainty. But we’ve already established that my utility is roughly logarithmic in money. My point estimate (my posterior) for this gamble paying off is p. The optimal bet under these assumptions is Kelly. So what are you saying? Perhaps you’re arguing that my best-estimate probability isn’t really p.
OTHER: No, p is really your best-estimate probability. I’m pointing to your model uncertainty.
BAYESIAN: Perhaps you’re saying that my utility isn’t really logarithmic? That I should be more risk-averse in this situation?
OTHER: No, my argument doesn’t involve anything like that.
BAYESIAN: So what am I missing? Log utility, probability p, therefore Kelly.
OTHER: Look, one of the ways we can argue for Kelly is by studying the iterated investment game, right? We look at the behavior of different strategies in the long term in that game. And we intuitively find that strategies which don’t maximize growth (EG the expected-money-maximizer) look pretty dumb. So we conclude that our values must be closer to the growth-maximizer, ie Kelly, strategy.
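(To make OTHER’s appeal concrete, a quick made-up simulation of that game, comparing median outcomes for a Kelly bettor versus a near-all-in, expected-money-maximizing bettor:)

```python
import numpy as np

rng = np.random.default_rng(1)
p, n_rounds, n_trials = 0.6, 100, 10_000   # made-up even-money bet

def median_final_wealth(f):
    """Median wealth after n_rounds of betting a fixed fraction f each round."""
    wins = rng.random((n_trials, n_rounds)) < p
    log_w = np.where(wins, np.log1p(f), np.log1p(-f)).sum(axis=1)
    return np.median(np.exp(log_w))

print(median_final_wealth(2 * p - 1))  # Kelly (f = 0.2): median grows
print(median_final_wealth(0.999))      # near all-in: median wealth ~ 0,
                                       # despite maximal *expected* wealth
```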
BAYESIAN: Right; that’s part of what convinced me that my values must be roughly logarithmic in money.
OTHER: So all I’m trying to do is examine the same game. But this time, rather than assuming we know the frequency of success from the beginning, I’m assuming we’re uncertain about that frequency.
BAYESIAN: Right… look, when I accepted the original Kelly argument, I wasn’t really imagining this circumstance where we face the exact same bet over and over. Rather, I was imagining I face lots of different situations. So long as my probabilities are calibrated, the long-run frequency argument works out the same way. Kelly looks optimal. So what’s your beef with me going “full Kelly” on those estimates?
OTHER: In those terms, I’m examining the case where probabilities aren’t calibrated.
BAYESIAN: That’s not so hard to fix, though. I can make a calibration graph of my long-term performance. I can try to adjust my probability estimates based on that. If my 70% probability events tend to come back true 60% of the time, I adjust for that in the future. I’ve done this. You’ve done this.
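(A toy version of the adjustment BAYESIAN is describing, with a made-up track record:)

```python
import numpy as np

# Made-up history: stated probabilities and whether the event happened.
stated = np.array([0.7, 0.7, 0.7, 0.7, 0.7, 0.9, 0.9, 0.9, 0.9, 0.9])
happened = np.array([1, 1, 1, 0, 0, 1, 1, 1, 1, 0])

# Empirical calibration map: observed frequency at each stated level.
calibration = {p: happened[stated == p].mean() for p in np.unique(stated)}
print(calibration)  # here: 70% claims came true 60% of the time, 90% -> 80%,
                    # so future 70% estimates get adjusted down toward 60%.
```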
OTHER: Do you really think your estimates are calibrated, now?
BAYESIAN: Not precisely, but I could put more work into it if I wanted to. Is this your crux? Would you be happy for me to go full Kelly if I could show you a perfect x=y line on my calibration graph? Are you saying you can calculate the α value for my fractional Kelly strategy from my calibration graph?
OTHER: … maybe? I’d have to think about how to do the calculation. But look, even if you’re perfectly calibrated in terms of past data, you might be caught off guard by a sudden change in the state of affairs.
BAYESIAN: Hm. So let’s grant that there’s uncertainty in my calibration graph. Are you saying it’s not my current point-estimate of my calibration that matters, but rather, my uncertainty about my calibration?
OTHER: I fear we’re getting overly meta. I do think α should be lower the more uncertain you are about your calibration, in addition to being lower the lower your point-estimate calibration is. But let’s get a bit more concrete. Look at the graph. I’m showing that you can expect better returns with lower α in this scenario. Is that not compelling?
BAYESIAN (who at this point regresses to just being Abram again): See, that’s my problem. I don’t understand the graph. I’m kind of stuck thinking that it represents someone with their hands tied behind their back, like they can’t perform a Bayes update to improve their estimate ^p, or they can’t change their α after the start, or something.