Attempted abstraction and generalization: If we don’t know what the ideal UTM is, we can start with some arbitrary UTM U1, and use it to predict the world for a while. After (we think) we’ve gotten most of our prediction mistakes out of the way, we can then look at our current posterior, and ask which other UTM U2 might have updated to that posterior faster, using fewer bits of observation about (our universe/the string we’re predicting). You could think of this as a way to define what the ‘correct’ UTM is. But I don’t find that definition very satisfying, because the validity of this procedure for finding a good U2 depends on how correct the posterior we’ve converged on with our previous, arbitrary, U1 is. ‘The best UTM is the one that figures out the right answer the fastest’ is true, but not very useful.
Is the thermodynamics angle gaining us any more than that for defining the ‘correct’ choice of UTM?
We used some general reasoning procedures to figure out some laws of physics and stuff about our universe. Now we’re basically asking what other general reasoning procedures might figure out stuff about our universe as fast or faster, conditional on our current understanding of our universe being correct.
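As a toy illustration of "which other prior would have reached this posterior faster", here is a minimal sketch. It is not Solomonoff induction (which is uncomputable): a three-element class of Bernoulli hypotheses stands in for the set of programs, and two hand-picked priors, hypothetically named prior_u1 and prior_u2, stand in for the weightings induced by U1 and U2. All names and numbers are assumptions made for the example.

```python
import math

def mixture_log_loss(prior, biases, bits):
    """Cumulative log-loss (in bits) of a Bayesian mixture over Bernoulli hypotheses."""
    # weights[i] = prior[i] * P_i(bits seen so far), kept unnormalized
    weights = list(prior)
    for b in bits:
        weights = [w * (p if b == 1 else 1.0 - p) for w, p in zip(weights, biases)]
    # -log2 of the mixture probability of the whole sequence
    return -math.log2(sum(weights))

# Hypothetical hypothesis class: coins with these probabilities of emitting 1.
biases = [0.1, 0.5, 0.9]

# Two priors standing in for two UTMs (illustrative values only).
prior_u1 = [0.98, 0.01, 0.01]   # puts most weight on the "wrong" hypothesis
prior_u2 = [0.01, 0.01, 0.98]   # puts most weight on the hypothesis that fits the data

# A fixed sequence that is mostly ones, so the 0.9 coin fits it best.
bits = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1] * 3

loss_u1 = mixture_log_loss(prior_u1, biases, bits)
loss_u2 = mixture_log_loss(prior_u2, biases, bits)
# Loss of a predictor that already knows the best-fitting hypothesis.
loss_best = mixture_log_loss([0.0, 0.0, 1.0], biases, bits)

print(f"U1-like prior: {loss_u1:.2f} bits")
print(f"U2-like prior: {loss_u2:.2f} bits")
print(f"best hypothesis alone: {loss_best:.2f} bits")
```

In this sketch the U2-like prior pays fewer bits than the U1-like prior, and each prior's excess loss over the best hypothesis is at most minus log2 of the weight that prior gave it. Note that choosing prior_u2 required already knowing which hypothesis fits the data, which is exactly the circularity raised above.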
I think that’s roughly correct, but it is useful...
‘The best UTM is the one that figures out the right answer the fastest’ is true, but not very useful.
Another way to frame it would be: after one has figured out the laws of physics, a good-for-these-laws-of-physics Turing machine is useful for various other things, including thermodynamics. ‘The best UTM is the one that figures out the right answer the fastest’ isn’t very useful for figuring out physics in the first place, but most of the value of understanding physics comes after it’s figured out (as we can see from regular practice today).
Also, we can make partial updates along the way. If e.g. we learn that physics is probably local but haven’t understood all of it yet, then we know that we probably want a local machine for our theory. If we e.g. learn that physics is causally acyclic, then we probably don’t want a machine with access to atomic unbounded fixed-point solvers. Etc.
I agree that this seems maybe useful for some things, but not for the “Which UTM?” question in the context of debates about Solomonoff induction specifically, and I think that’s the “Which UTM?” question we are actually kind of philosophically confused about. I don’t think we are philosophically confused about which UTM to use in the context of us already knowing some physics and wanting to incorporate that knowledge into the UTM pick; we’re confused about how to pick if we don’t have any information at all yet.
I think roughly speaking the answer is: whichever UTM you’ve been given. I aim to write a more precise answer in an upcoming paper specifically about Solomonoff induction. The gist of it is that the idea of a “better UTM” U_2 is about as absurd as that of a UTM that has hardcoded knowledge of the future: yes, such UTMs exist, but there is no way to obtain one without first looking at the data, and the best way to update on data is already given by Solomonoff induction.
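For reference, the standard background fact behind this discussion is the invariance bound, stated here as a sketch and not as the argument of the forthcoming paper. The symbols M_U (the Solomonoff prior induced by UTM U) and c (the length of a program that lets U_1 simulate U_2) are notation assumed for this note.

```latex
% For any two UTMs U_1, U_2 there is a constant c, independent of x, such that
\[
  M_{U_1}(x) \;\ge\; 2^{-c}\, M_{U_2}(x) \quad \text{for all finite strings } x,
\]
% and therefore the cumulative log-loss of U_1 exceeds that of U_2 by at most c bits:
\[
  -\log_2 M_{U_1}(x) \;\le\; -\log_2 M_{U_2}(x) + c .
\]
```

So switching from U_1 to a “better” U_2 can save at most a constant number of bits, and that constant is the length of the description of U_2 that U_1 would have had to learn from the data anyway.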