To summarise, I interpret TAG as saying something like “when SI assigns a probability of x to a program P, what does that mean; how can we cash that out in terms of reality?” And Vaniver is saying “It means that, if you sum up the probabilities assigned to all programs which implement roughly the same function, then you get the probability that this function is ‘the underlying program of reality’”.
I think there are three key issues with this response (if I’ve understood it correctly):
It skips all the hard work of figuring out which functions are roughly the same. This is a difficult unsolved (and maybe unsolvable?) problem, which is, for example, holding back progress on FDT.
It doesn’t actually address the key problem of epistemology. We’re in a world, and we’d like to know lots of things about it. Solomonoff induction, instead of giving us lots of knowledge about the world, gives us a massive Turing machine which computes the quantum wavefunction, or something, and then outputs predictions. For example, say the previous inputs were the things I’ve seen in the past, and the predictions are of what I’ll see in the future. But those predictions might tell us very few interesting things about the world! For example, they probably won’t help me derive general relativity. In some sense the massive Turing machine contains the fact that the world runs on general relativity, but accessing that fact from the Turing machine might be even harder than accessing it by studying the world directly. (Relatedly, see Deutsch’s argument (which I quote above) that even having a predictive oracle doesn’t “solve” science.)
There’s no general way to apply SI to answer a bounded question with a sensible bounded answer. Hence, when you say “you can make your stable of hypotheses infinitely large”, this is misleading: programs aren’t hypotheses, or explanations, in the normal sense of the word, for almost all of the questions we’d like to understand.
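The “sum over programs implementing the same function” move can be sketched in a toy model. This is not real Solomonoff induction (which is uncomputable); the program set, the made-up description lengths, and the finite probe set are all invented for illustration. Note how the sketch has to compare behaviour on a *finite* probe set, which is exactly the dodge that issue 1 flags: deciding whether two programs compute the same function is undecidable in general.

```python
# Toy sketch: give each "program" a prior of 2^-length, keep only programs
# consistent with the observed data, then sum the surviving weight over
# programs that behave identically on a finite probe set. Everything here
# (programs, lengths, probe inputs) is invented for illustration.

from collections import defaultdict

# (name, description-length in bits, function) -- lengths are made up
PROGRAMS = [
    ("const0",   3, lambda x: 0),
    ("identity", 4, lambda x: x),
    ("id_alt",   9, lambda x: x + 0),   # same function, longer program
    ("parity",   6, lambda x: x % 2),
    ("square",   7, lambda x: x * x),
]

OBSERVED = [(0, 0), (1, 1)]   # (input, output) pairs seen so far
PROBE = range(5)              # finite inputs used to compare behaviour

def consistent(f):
    return all(f(x) == y for x, y in OBSERVED)

# Group the surviving prior mass by behaviour on the probe set.
mass = defaultdict(float)
for name, length, f in PROGRAMS:
    if consistent(f):
        mass[tuple(f(x) for x in PROBE)] += 2.0 ** -length

total = sum(mass.values())
for behaviour, m in sorted(mass.items(), key=lambda kv: -kv[1]):
    print(behaviour, round(m / total, 3))
```

In this toy run, “identity” and “id_alt” collapse into one behaviour class whose mass is the sum of their individual weights, which is the cashing-out Vaniver describes.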
I agree with those issues.
I think the way you expressed issue 3 makes it too much of a clone of issue 1; if I tell you the bounds for the question in terms of programs, then I think there is a general way to apply SI to get a sensible bounded answer. If I tell you the bounds in terms of functions, then there would be a general way to incorporate that info into SI, if you knew how to move between functions and programs.
The framing that (I think?) separates those issues more cleanly is that we have to figure out both the ‘compression’ problem of how to consider ‘models’ as families of programs (at some level of abstraction, at least) and the ‘elaboration’ problem of how to repopulate our stable of candidates when we rule out too many of the existing ones. SI bypasses the first and gives a trivial answer to the second, but a realistic intelligence will have interesting answers to both.
And it’s also unclear, to say the least, that the criterion that an SI uses to prefer and discard hypotheses/programs actually is a probability, despite being labelled as such.
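One concrete reason for doubt here: over a prefix-free set of program codes, the weights 2^-length sum to at most 1 (Kraft’s inequality), and typically to strictly less, so the raw weights form a semimeasure rather than a probability measure until you renormalise. The code set below is invented to illustrate the point.

```python
# Kraft-style check on an invented prefix-free code set: the 2^-length
# weights sum to less than 1, so they are not yet a probability
# distribution -- the leftover mass corresponds to codes with no valid
# (e.g. halting) continuation.
codes = ["0", "10", "110", "1110"]  # prefix-free: no code is a prefix of another
assert all(not a.startswith(b) for a in codes for b in codes if a != b)
weight = sum(2.0 ** -len(c) for c in codes)
print(weight)  # -> 0.9375, strictly less than 1
```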