Blunt Razor

In this post, I would like to share a simple observation of mine which is about deriving some crude form of Occam’s razor from a small set of assumptions.

Assumptions:

Subjective Bayesian probability is used to model an agent’s degree of belief in hypotheses.
The set of hypotheses $H$ that we assign beliefs is countably infinite.
Complexity of a hypothesis is represented with a positive integer where bigger integers represent more complex hypotheses.
For any $h \in H$ there exists finitely many hypotheses in $H$ that have the same complexity as $h$ .

When we accept these assumptions it is easy to prove the following statement.

For any $h \in H$ there are only finitely many hypotheses in $H$ that are more complex and more probable or less complex and less probable than $h$ .

Proof.

The set of all hypotheses which are more probable than $h$ is finite, since otherwise probabilities cannot sum to 1. The set of all hypotheses that are simpler than $h$ is also finite by assumptions 3 and 4. Intersection of a finite set with another set is a finite set, and union of two finite sets is a finite set. $□$

Note that this result is independent from the choice of function that assigns beliefs to hypotheses. In addition, while it assumes some properties about the notion of simplicity, it does not require a full formalization of it.

Edit 1:

Assumption 3 can be thought of as two distinct assumptions combined together:

The one-sided infinite nature of complexity, that is, there is always a more complex hypothesis while there exists “the simplest level”.
The finite gradation of complexity, that is, for any hypothesis $h$ , there exists a hypothesis $h^{'}$ such that there is no hypothesis which is more complex than $h$ and less complex than $h^{'}$ .

As a remark, the notion of Kolmogorov complexity (the length of the shortest computer program that generates an object) satisfies these properties.

Edit 2:

I no longer think this is a valuable observation, since it seems to be more about the shortcomings of mathematical tools we use than something fundamental.

The sentence “The set of all hypotheses which are more probable than $h$ is finite” is true only when we insist on using a complete event space (powerset of sample space) and a probability measure assigning real numbers to events. The proof is no longer valid if we introduce infinitesimals for instance.