It’s been a while since I reviewed Ole Peters, but I stand by what I said: by his own admission, the game he is playing is looking for ergodic observables. An ergodic observable is defined as a quantity whose expectation is constant across time and whose time-average converges (with probability one) to that expectation. This is very clear in, e.g., this paper.
The ergodic observable in the case of Kelly-like situations is the ratio of wealth from one round to the next.
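Here is a minimal sketch of what that definition asks for, assuming a hypothetical repeated even-money bet in which a fixed fraction of current wealth is staked each round. The names `bet_ratios`, `fraction`, and `p_win` are mine, for illustration only. It checks that the time average of the wealth ratio along one long run lands near its (constant) expectation, and shows that wealth itself fails the first condition because its expectation changes with time.

```python
import random

def bet_ratios(fraction, p_win, rounds, seed=0):
    """Per-round wealth ratios R_t for a repeated even-money bet.

    Hypothetical setup (not from the post): each round we stake `fraction`
    of current wealth and win with probability `p_win`.
    """
    rng = random.Random(seed)
    return [1 + fraction if rng.random() < p_win else 1 - fraction
            for _ in range(rounds)]

fraction, p_win, rounds = 0.2, 0.6, 200_000
ratios = bet_ratios(fraction, p_win, rounds)

# Ensemble expectation of the ratio: the same in every round.
expected_ratio = p_win * (1 + fraction) + (1 - p_win) * (1 - fraction)

# Time average of the ratio along one long trajectory.
time_avg_ratio = sum(ratios) / rounds

# Wealth fails the first condition: E[W_t] / W_0 = E[R]^t changes with t.
expected_wealth = [expected_ratio ** t for t in (1, 10, 100)]

print(f"E[R] = {expected_ratio:.4f}, time average of R = {time_avg_ratio:.4f}")
print("E[W_t]/W_0 at t = 1, 10, 100:", [round(w, 3) for w in expected_wealth])
```

This only illustrates the definition for one fixed-fraction betting scheme; it says nothing about whether the ratio is the right thing to maximize.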
The concern I wrote about in this post is that it seems a bit ad-hoc to rummage around until we find an ergodic observable to maximize. I’m not sure how concerning this critique should really be. I still think Ole Peters has done something great, namely, articulating a real Frequentist alternative to Bayesian decision theory.
It incorporates classic Frequentist ideas: you have to interpret individual experiments as part of an infinite sequence in order for probabilities and expectations to be meaningful; and, the relevant probabilities/expectations have to converge.
So it inherits the same problems: how do you interpret one decision problem as part of an infinite sequence where expectations converge?
If you want my more detailed take written around the time I was reading up on these things, see here. Note that I make a long comment underneath my long comment where I revise some of my opinions.
You say that “We can’t time-average our profits [...] So we look at the ratio of our money from one round to the next.” But that’s not what Peters does! He looks at maximizing total wealth, in the limit as time goes to infinity.
In particular, we want to maximize $U = \lim_{T \to \infty} U_0 \cdot \prod_{t=0}^{T} R_t$, where $U$ is wealth after all the bets and $R_t$ is 1 plus the percent-increase from bet $t$.
Taken literally, this doesn’t make mathematical sense, because the wealth does not necessarily converge to anything (indeed, it does not, so long as the amount risked in investment does not go to zero).
Since this intuitive idea doesn’t make literal mathematical sense, we then have to do some interpretation. You jump from the ill-defined maximization of a limit to this:
You want to know what choice to make for any given decision, so you want to maximize your rate of return for each individual bet, which is $\left(\prod_{t=0}^{T} R_t\right)^{1/T}$.
But this is precisely the ad-hoc decision I am worried about! Choosing to maximize rate of return (rather than, say, simple return) is tantamount to choosing to maximize log money instead of money!
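To make that equivalence concrete: the rate of return is a geometric mean, and a geometric mean is the exponential of the average of $\log R_t$, so maximizing it over a long run amounts to maximizing $E[\log R]$. Below is a rough sketch under the assumption of a hypothetical even-money bet with win probability `p_win` and stake fraction `f` (both names are mine, not the post’s); for simplicity it takes the n-th root of n ratios. It compares which stake each objective picks on a grid.

```python
import math

def expected_ratio(f, p_win):
    """E[R]: expected one-round wealth ratio when staking fraction f."""
    return p_win * (1 + f) + (1 - p_win) * (1 - f)

def expected_log_ratio(f, p_win):
    """E[log R]: expected one-round change in log wealth."""
    return p_win * math.log(1 + f) + (1 - p_win) * math.log(1 - f)

def geometric_mean(ratios):
    """Rate of return: n-th root of the product of n ratios."""
    return math.prod(ratios) ** (1 / len(ratios))

p_win = 0.6  # illustrative value, not from the post
grid = [i / 100 for i in range(100)]  # stake fractions 0.00 .. 0.99

# Stake chosen when maximizing simple expected return vs. expected log return.
best_simple = max(grid, key=lambda f: expected_ratio(f, p_win))
best_log = max(grid, key=lambda f: expected_log_ratio(f, p_win))

# Identity check: geometric mean == exp(mean of logs), on an arbitrary sequence.
sample = [1.2, 0.8, 1.2, 1.2, 0.8]
via_logs = math.exp(sum(math.log(r) for r in sample) / len(sample))
assert math.isclose(geometric_mean(sample), via_logs)

print("maximizes E[R]:    ", best_simple)
print("maximizes E[log R]:", best_log)
```

With these numbers, maximizing simple return picks the largest stake on the grid, while maximizing log return picks 0.2, which is the Kelly fraction $2p - 1$ for an even-money bet. The two objectives recommend very different bets, which is why the choice between them carries the whole argument.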
So the argument can only be as strong as this step: how well can we justify the selection of rate of return (i.e., 1 + percentage increase in wealth, i.e., the ratio of wealth from one round to the next)?
Ole Peters’ answer for this is his theory of ergodic observables. You know that you’ve found the observable to maximize when it is ergodic (for your chosen infinite-sequence version of the decision problem).
One worry I have is that choice of ergodic observables may not be unique. I don’t have an example where there are multiple choices, but I also haven’t seen Ole Peters prove uniqueness. (But maybe I’ve read too shallowly.)
Another worry I have is that there may be no ergodic observable.
Another worry I have is that there will be many ways to interpret a decision problem as part of an infinite sequence of decision problems (akin to the classic reference class problem). How do you integrate these together?
I’m not claiming any of these worries are decisive.