Could someone explain why this doesn’t degenerate into an entirely circular concept when we postulate a stronger compiler; or why it doesn’t become entirely dependent on the choice of the compiler?
There are many programs that output identical sequences. That’s a waste. Make it so that no two different programs have the same output.
There are many sequences that when fed into the compiler don’t result in valid programs. That’s a waste. Make it so that every binary sequence represents a valid program.
Now we have a set of sequences that we’d like to encode: S = {ε, 0, 1, 00, 01, … }, a set of sequences that are interpreted by the compiler as programs: P = {ε, 0, 1, 00, 01, … } and the compiler which is a bijection from P to S. It better not turn out to be the identity function.. And that’s with the best possible compiler. If we postulate a reasonable but much weaker compiler then the programs that encode the sequences become on average longer than the sequences themselves!
The only way out of this that I see is to weight elements of S by their frequencies in our universe and/or by how much we care about them, and then let the compiler be a function that minimizes this frequency-importance score. In fact, this compiler starts looking more and more like an encoder (?!). The difficult part then seems to me to be the choice of the optimal encoder, and not the Solomonoff induction itself.
Edit: Of course, when there’s a 1 to 1 mapping, then selecting the shortest program is trivial. So in a way, if we make the Solomonoff induction trivial then the only thing that’s left is the choice of the compiler. But why isn’t this still a problem with weaker, traditional compilers?
Could someone explain why this doesn’t degenerate into an entirely circular concept when we postulate a stronger compiler; or why it doesn’t become entirely dependent on the choice of the compiler?
There are many programs that output identical sequences. That’s a waste. Make it so that no two different programs have the same output.
There are many sequences that when fed into the compiler don’t result in valid programs. That’s a waste. Make it so that every binary sequence represents a valid program.
Now we have a set of sequences that we’d like to encode: S = {ε, 0, 1, 00, 01, … }, a set of sequences that are interpreted by the compiler as programs: P = {ε, 0, 1, 00, 01, … } and the compiler which is a bijection from P to S. It better not turn out to be the identity function.. And that’s with the best possible compiler. If we postulate a reasonable but much weaker compiler then the programs that encode the sequences become on average longer than the sequences themselves!
The only way out of this that I see is to weight elements of S by their frequencies in our universe and/or by how much we care about them, and then let the compiler be a function that minimizes this frequency-importance score. In fact, this compiler starts looking more and more like an encoder (?!). The difficult part then seems to me to be the choice of the optimal encoder, and not the Solomonoff induction itself.
Edit: Of course, when there’s a 1 to 1 mapping, then selecting the shortest program is trivial. So in a way, if we make the Solomonoff induction trivial then the only thing that’s left is the choice of the compiler. But why isn’t this still a problem with weaker, traditional compilers?