Larger errors literally take more bits to describe. For example, in binary, 3 is 11₂ and 10 is 1010₂ (twice the bits).
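To make this concrete, here is a minimal Python sketch (the loop and formatting are illustrative, not from the original) printing the binary representation and bit length of a few error magnitudes:

```python
# Binary representations of a few error magnitudes: an error of 10
# needs twice as many bits to write down as an error of 3.
for err in (3, 10, 100, 1000):
    print(f"{err} = {err:b} in binary ({err.bit_length()} bits)")
```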
Say that you have two hypotheses, A and B, such that A is 100 bits more complicated than B but 5% closer to the true value. This means that for each sample, the error in B on average takes log₂(1.05) ≈ 0.07 bits more to describe than the error in A.
Since A starts with a 100-bit prior penalty, after about 100 / log₂(1.05) ≈ 1,420 samples A and B will be considered equally likely. After roughly log₂(100) / log₂(1.05) ≈ 94 more samples, A will be considered 100 times more likely than B.
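The arithmetic can be checked directly (a small sketch; the variable names are mine, and the numbers assume the 100-bit gap and 5% error ratio from the example above):

```python
import math

prior_gap_bits = 100.0                # A's extra complexity (prior penalty)
per_sample_bits = math.log2(1.05)     # ≈ 0.0704 bits B loses per sample

# Samples until A's accumulated per-sample advantage cancels its
# 100-bit complexity penalty (the break-even point):
break_even = prior_gap_bits / per_sample_bits
print(f"break-even after ~{break_even:.0f} samples")        # ≈ 1421

# Further samples until A leads by log2(100) ≈ 6.64 bits,
# i.e. is considered 100 times more likely than B:
extra = math.log2(100) / per_sample_bits
print(f"100x more likely after ~{extra:.0f} more samples")  # ≈ 94
```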
In general, if f(x) is some high-level summary of the important information in x, Solomonoff induction that only tries to predict x is also universal for predicting f(x) (and it even has the same or better upper bounds).