I think your general conclusion is correct: “does subscription to the universal prior imply assigning probability 2^-m to any hypothesis of length/complexity m? .. ahh no”. But perhaps obviously so?
The key is that the universal prior deals with programs that are complete predictors of an entire sequence of observations. They are not just simple statements.
If you want to compare it to simple statements, you need to quantify their predictive power. The universal prior is a secondary classification: a way of ranking a set of algorithms/hypotheses that are all 100% accurate and fully specific. It's like a secondary weighting procedure you use when you have many equally perfect choices given your current data. (This is my understanding, as refreshed by your description.)
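To make that concrete, here is a toy sketch of the "secondary weighting" idea. It is only an illustration under my own assumptions, not real Solomonoff induction: the three "programs", their names, and their bit lengths are made up, and each one is a complete predictor that outputs bit i of the sequence. Programs that disagree with the data are dropped, and the survivors are weighted by 2^-length and normalized.

```python
from fractions import Fraction

# Hypothetical toy "programs": (name, assumed length in bits, predictor).
# Each predictor returns the bit at position i of the whole sequence.
PROGRAMS = [
    ("all_zeros", 3, lambda i: 0),
    ("alternating", 4, lambda i: i % 2),
    ("zero_until_5", 7, lambda i: 1 if i == 5 else 0),
]

def posterior(programs, observed):
    """Weight every program that reproduces `observed` exactly by 2^-length."""
    consistent = [
        (name, length)
        for name, length, predict in programs
        if all(predict(i) == bit for i, bit in enumerate(observed))
    ]
    weights = {name: Fraction(1, 2 ** length) for name, length in consistent}
    total = sum(weights.values())
    # Normalize so the surviving programs' weights sum to 1.
    return {name: w / total for name, w in weights.items()}

result = posterior(PROGRAMS, [0, 0, 0])
print(result)
```

With the observations 0, 0, 0, "alternating" is eliminated outright, and the shorter of the two remaining perfect predictors gets most of the weight.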
That clearly can’t apply to arbitrary statements, because most statements do not perfectly predict the whole sequence.
For #1, what are A, B, C, D supposed to be? Complete world-predictor programs, or simple statements? If A&B&C&D is a complete predictor, and so is A|B|C|D, then the universal prior will not choose either: it will choose the simplest of A, B, C, D.
Otherwise, if A&B&C&D is required for a full predictor, then A|B|C|D cannot be a full predictor. So there is no case where A&B&C&D has less predictive power than A|B|C|D. The former is more specific and thus has strictly more predictive power, though of course it is less intrinsically probable.
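A quick way to see the specificity vs. probability trade-off is to treat A, B, C, D as plain true/false statements and count possible worlds. This sketch assumes, purely for illustration, that the four statements are independent and that all 16 truth assignments are equally likely; none of that comes from the original problem.

```python
from fractions import Fraction
from itertools import product

# Every truth assignment to (A, B, C, D); assumed equally likely.
WORLDS = list(product([False, True], repeat=4))

def probability(statement):
    """Fraction of worlds in which the statement holds."""
    hits = sum(1 for world in WORLDS if statement(world))
    return Fraction(hits, len(WORLDS))

def conjunction(world):
    return all(world)  # A & B & C & D

def disjunction(world):
    return any(world)  # A | B | C | D

p_conj = probability(conjunction)  # 1/16
p_disj = probability(disjunction)  # 15/16

# The conjunction rules out far more worlds (more specific),
# which is exactly why it is less probable a priori.
ruled_out_conj = 1 - p_conj
ruled_out_disj = 1 - p_disj
print(p_conj, p_disj, ruled_out_conj, ruled_out_disj)
```

Every world that satisfies the conjunction also satisfies the disjunction, so the conjunction can never be the more probable of the two, whatever prior you put on the worlds.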