Let’s imagine a Solomonoff-style inference problem, i.e. our inference-engine receives a string of bits and tries to predict the next bit.
Here’s an anti-Occamian environment for this inference problem. At timestep t, the environment takes the t bits which the inference-engine has seen, and feeds them into a Solomonoff inductor. The Solomonoff inductor does its usual thing: roughly speaking, it finds the shortest program which would output those t bits, then outputs the next output bit of that shortest program. But our environment is anti-Occamian, so it sets bit t+1 to the opposite of the Solomonoff inductor’s output.
That’s an anti-Occamian environment. (Note that it is uncomputable, but still entirely well-defined mathematically.)
One example of an “anti-Occamian hypothesis” would be the hypothesis that the environment works as just described—i.e. at every step, the environment does the opposite of what Occam’s razor (as operationalized by a Solomonoff inductor) would predict.
An agent which believed the anti-Occamian hypothesis would, for instance, expect that any hypothesis which did a very good job predicting past data would almost surely fail at the next timestep. Of course, in a world like ours, the agent would be wrong about this most of the time… but that would only make the agent more confident. After all, it’s always been wrong before, so (roughly speaking) on the anti-Occamian hypothesis it’s very likely to be right this time!
Again, note that this example is uncomputable but entirely well-defined mathematically. As with Solomonoff inductors, we could in principle construct limited-compute analogues.
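To make the limited-compute analogue concrete, here is a toy sketch in Python. The true Solomonoff inductor is uncomputable, so a crude bounded predictor (majority vote over the history so far; a made-up stand-in, not part of the original construction) plays its role, and the environment emits the opposite of whatever that predictor says:

```python
# Toy limited-compute analogue of the anti-Occamian environment.
# bounded_predictor is an illustrative stand-in for a Solomonoff inductor.

def bounded_predictor(history):
    """Stand-in predictor: guess the majority bit of the history so far
    (ties, including the empty history, broken toward 0)."""
    if not history:
        return 0
    return 1 if sum(history) * 2 > len(history) else 0

def anti_occamian_environment(steps):
    """At each step, feed the history to the predictor and emit the
    opposite bit, so the predictor is wrong by construction."""
    history = []
    for _ in range(steps):
        predicted = bounded_predictor(history)
        history.append(1 - predicted)  # do the opposite
    return history

bits = anti_occamian_environment(10)
hits = sum(bounded_predictor(bits[:t]) == bits[t] for t in range(len(bits)))
print(bits, hits)  # hits == 0: the predictor never scores against this environment
```

Against this particular predictor the environment just alternates bits, but the point generalizes: whatever bounded inductor you plug in, its accuracy against its own anti-environment is exactly zero.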
The standard example of a system which (maybe) behaves in this anti-inductive way in the real world is a financial market.
This does bring in something like the Berry paradox: after all, the hypothesis that the environment follows such a rule is itself a very short hypothesis!
The resolution is that, despite being describable by a short rule, the environment does not correspond to any computable program. A Solomonoff inductor already requires a Turing oracle; computing the predictions of this hypothesis would require an oracle one level further up the hierarchy.
A super-Solomonoff inductor equipped with that higher-level oracle could of course compute the result, and would assign the hypothesis high weight in the described environment. But that inductor in turn has its own short-but-uncomputable hypotheses, and so on up the hierarchy.
Does this apply at all to anything more probabilistic than just reversing the outcome of a single most likely hypothesis and the next bit(s) it outputs? An Occamian prior doesn’t just mean “this is the shortest hypothesis; therefore it is true,” it means hypotheses are weighted by their simplicity. It’s possible for an Occamian prior to think the shortest hypothesis is most likely wrong, if there are several slightly longer hypotheses that have more probability in total.
Of course, the generalization would be that the environment inverts whatever next bit some Occamian agent thinks is most probable.
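That probabilistic generalization can be sketched too. In this toy (the hypothesis class of weighted Bernoulli coins and its prior are illustrative assumptions, not anything from the construction above), the environment emits whichever next bit a prior-weighted mixture, updated on the history, currently considers less probable:

```python
# Sketch of the probabilistic generalization: invert the bit the
# *mixture* favors, not just the single most likely hypothesis.
# Hypothesis class and prior weights are made-up for illustration.

WEIGHTED_COINS = [(0.5, 0.5), (0.125, 0.1), (0.125, 0.3),
                  (0.125, 0.7), (0.125, 0.9)]  # (prior weight, P(bit = 1))

def mixture_p1(history):
    """Posterior-weighted probability that the next bit is 1."""
    posts = []
    for w, theta in WEIGHTED_COINS:
        like = w
        for b in history:
            like *= theta if b == 1 else 1 - theta
        posts.append((like, theta))
    total = sum(l for l, _ in posts)
    return sum(l * theta for l, theta in posts) / total

def anti_mixture_environment(steps):
    """Emit whichever bit the mixture currently considers less likely."""
    history = []
    for _ in range(steps):
        history.append(0 if mixture_p1(history) >= 0.5 else 1)
    return history
```

By construction, every bit this environment emits had mixture probability at most 1/2 at the moment it was emitted, so the mixture's favored guess is always wrong, even when no single hypothesis dominates.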