I think we all agree that some form of Occam’s razor makes sense
Yes.
and that it can become a quantifiable criterion if you pose it in terms of “how many bits does it take to specify this hypothesis as a predictive algorithm?”
That’s a much stronger claim:
It’s unclear whether this is the proper way of formalizing Occam’s razor.
Even if it were, it is an uncomputable approach, and, as far as I know, there aren't even reasonable methods to approximate it in the general case.
It’s uncomputable if you want to compare all possible hypotheses according to this criterion, but not if you just want to distinguish between hypotheses that are already available and fully specified. Also, it doesn’t have to be the one true way to formalize the razor in order to have some validity.
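To illustrate the computable case: one crude but fully computable proxy for description length is the compressed size of a hypothesis's source. The sketch below is my own toy illustration (the zlib proxy and the two sample "hypotheses" are stand-ins, not anything from the discussion), but it shows that comparing two already-specified hypotheses by a description-length criterion is a perfectly ordinary computation:

```python
import zlib

def description_length(hypothesis_source: str) -> int:
    """Crude, computable stand-in for description length: the
    zlib-compressed size of the fully specified hypothesis's source.
    (Only an upper-bound proxy for Kolmogorov complexity, of course.)"""
    return len(zlib.compress(hypothesis_source.encode("utf-8")))

def prefer_by_razor(h1: str, h2: str) -> str:
    """Return whichever fully specified hypothesis has the shorter
    (compressed) description; ties go to the first argument."""
    return h1 if description_length(h1) <= description_length(h2) else h2

# Two toy "hypotheses" written out as predictive rules:
simple  = "predict(x) = x + 1"
baroque = "predict(x) = (x + 2) - 1 if x % 2 == 0 else (x - 1) + 2  # same function, wordier spec"
```

Nothing here touches the hard problem, which is the step from an informal hypothesis to a canonical (let alone shortest) program for it.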
“…not if you just want to distinguish between hypotheses that are already available and fully specified.”
This assumes that you can computably map each hypothesis to a single program (or a set of programs where you can computably identify the shortest element).
For arbitrary hypotheses, this is impossible by Rice’s theorem.
If you write your hypotheses as propositions in a formal language, then by restricting the language you can get decidability at the expense of expressive power. The typical examples are the static type systems of many popular programming languages (though notably not C++, whose template machinery makes type checking undecidable in general).
Even then, you run into complexity issues: for instance, type checking is NP-complete in C#, and I conjecture that the problem of finding the shortest C# program that satisfies some type-checkable property is NP-hard. (The obvious brute-force way of doing it is to enumerate programs in order of increasing length and check each one until you find one that passes.)
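For what it's worth, that brute-force search is trivial to write down for any decidable predicate; the problem is the exponential enumeration, which is exactly why a hard underlying check makes it hopeless. A toy sketch (the alphabet and predicate are stand-ins for a real language and its type checker):

```python
from itertools import product

def shortest_passing(alphabet, passes, max_len=8):
    """Enumerate strings over `alphabet` in order of increasing length
    and return the first one satisfying the decidable predicate
    `passes`, or None if none exists up to `max_len`. The search space
    grows as len(alphabet) ** length, so even a polynomial-time check
    leaves the overall search exponential in the answer's length."""
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            program = "".join(chars)
            if passes(program):
                return program
    return None

# Toy usage: shortest string over {a, b} containing exactly two a's.
shortest_passing("ab", lambda s: s.count("a") == 2)  # -> "aa"
```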