It does, however, require infinite compute. (I’ve also gotten the impression that every approximation of Solomonoff induction is very compute-intensive as well, unless you allow for very loose definitions of approximation.)
Perhaps more relevant to the quoted sentence is that it takes advantage of a very cleverly non-flat prior.
Any prior over an unbounded space needs to be non-flat in this way, or you’d never be able to learn anything. To put it more precisely: if you are assigning nonzero probability to every hypothesis in the space, then for any arbitrarily small ε > 0 there will exist some description length L(ε) below which at least 1-ε of the probability mass resides. Granted, you could have priors that are flat below some large value of L, though I think this perspective shows that they would be a bit strange/unnatural.
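To make the L(ε) claim concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `geometric_prior` puts total mass 2^-n on the hypotheses of description length n (a toy stand-in, not the actual Solomonoff prior), `flat_prior` is flat up to a hypothetical cutoff, and `length_bound` finds the smallest L holding at least 1-ε of the mass.

```python
from fractions import Fraction
from itertools import count


def geometric_prior(n):
    """Toy prior (assumed): total mass 2**-n on hypotheses of length n >= 1."""
    return Fraction(1, 2 ** n)


def flat_prior(cutoff):
    """Toy prior (assumed): equal mass on lengths 1..cutoff, zero beyond."""
    def mass(n):
        return Fraction(1, cutoff) if n <= cutoff else Fraction(0)
    return mass


def length_bound(mass, eps):
    """Smallest L such that lengths 1..L hold at least 1 - eps of the mass."""
    total = Fraction(0)
    for n in count(1):
        total += mass(n)
        if total >= 1 - eps:
            return n


eps = Fraction(1, 100)
print(length_bound(geometric_prior, eps))     # 7
print(length_bound(flat_prior(1000), eps))    # 990
print(length_bound(flat_prior(10_000), eps))  # 9900
```

In this toy setup the non-flat prior reaches 99% of its mass by length 7, while for the flat priors L(ε) just tracks the cutoff, which is one way of seeing why a prior that is flat up to some large L looks a bit arbitrary.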