Not sure. I must’ve gone crazy for a minute there, thinking something like “being able to influence 3^^^^3 people is a huge conjunction of statements, thus low probability”—but of course the universal prior doesn’t work like that. Struck that part.
It still seems relevant to me: as in the “if my brother’s wife’s first son’s best friend flips a coin, it will fall heads” example, the prior probability actually comes from opening up the statement and looking inside, in a way that would also differentiate just fine between stubbing 3 toes and stubbing 3^3^3 toes.
Question: can we construct a low-complexity event whose universal prior is much lower than its complexity implies, i.e. an event that describes a relatively small set of programs, each of which has high complexity? Clearly it can’t describe just one program, but maybe with a whole set of them it’s possible. Naturally, the programs must still be hard to locate given the event.
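A crude toy sketch (my own construction, not from the post) of the arithmetic behind this question: in Solomonoff-style weighting, the prior mass of an event is just the summed weight 2^(-len(p)) of the programs it contains, so a small set of long programs gets tiny mass regardless of how briefly the set can be described. The sketch deliberately leaves out the hard part, which is arranging for every member to have high K-complexity while the event’s description stays short:

```python
# Toy stand-in for the universal prior: each "program" is a bitstring p
# weighted 2^(-len(p)), as in Solomonoff's (unnormalized) measure M.
# The real prior runs programs on a universal machine and is uncomputable;
# here an event is just an explicit set of bitstrings.

def prior_mass(event_programs):
    # Prior of an event = total weight of the programs it contains.
    return sum(2.0 ** -len(p) for p in event_programs)

# An event containing one short program: substantial prior mass.
mass_one_short = prior_mass(["01"])  # 2^-2 = 0.25

# An event containing four length-32 programs: tiny total mass
# (4 * 2^-32 = 2^-30), even though this particular set happens to be
# easy to write down, so its members do NOT have high K-complexity;
# the open question is whether the members can all be made incompressible.
long_set = ["0" * 30 + suffix for suffix in ["00", "01", "10", "11"]]
mass_long_set = prior_mass(long_set)

print(mass_one_short, mass_long_set)
```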
See example 2 in the post. I think you can use Rice’s theorem to easily construct hypotheses with hard-to-locate predictors, but I’m not sure about the K-complexity of the resulting predictors.
The K-complexity of the program defined by that criterion is about as low as that of the criterion itself, I’m afraid, so example 2 is invalid (a notion of “complexity” that is not K-complexity shouldn’t be relevant). The universal prior for that theory is not astronomically low.
Edit: This is wrong, in particular because the criterion doesn’t present an algorithm for finding the program, and because the program must by definition have high K-complexity.
Um, what? Can you exhibit a low-complexity algorithm that predicts sensory inputs in accordance with the theory from example 2? That’s what it would mean for the universal prior to not be low. Or am I missing something?
You are right, see updated comment.
Yes, forgot about that. So just crossing the meta-levels once is enough to create a gap in complexity, even if the event has only one element.