Yes! If you’re a proper Bayesian, using the speed prior on sequence prediction for infinite sequences, you end up with surprisingly good loss bounds. This is surprising because the speed prior assigns 0 probability to infinite sequences, so the truth has no prior support.
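For concreteness, here is a rough sketch of one common simplification of the speed prior (Schmidhuber's actual definition runs all programs via the FAST algorithm; this length-plus-time penalty just captures the flavor):

```latex
S(x) \;\propto\; \sum_{p \,:\, U(p) = x\ast} 2^{-|p|} \, / \, t(p, x)
```

where $U(p) = x\ast$ means program $p$ outputs a string beginning with $x$, and $t(p, x)$ is the time $p$ takes to print $x$. Any program printing an infinite sequence takes unboundedly long on ever-longer prefixes, so $S(x_{1:n}) \to 0$ as $n \to \infty$ — which is why the truth gets no prior support.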
If you use a maximum a posteriori estimate, instead of a full Bayesian mixture, and the truth has prior support, you also do fine.
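To illustrate the mixture-vs-MAP distinction in a toy setting (not the setting above — just i.i.d. bits from a finite hypothesis class, where the truth has prior support and both predictors do fine):

```python
import random

# Toy hypothesis class: candidate biases for a Bernoulli source,
# with a uniform prior over them.
hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = {h: 1 / len(hypotheses) for h in hypotheses}

def update(posterior, bit):
    """Bayes update of the posterior on one observed bit."""
    new = {h: p * (h if bit == 1 else 1 - h) for h, p in posterior.items()}
    z = sum(new.values())
    return {h: p / z for h, p in new.items()}

def mixture_prediction(posterior):
    """Full Bayesian mixture: posterior-weighted P(next bit = 1)."""
    return sum(p * h for h, p in posterior.items())

def map_prediction(posterior):
    """MAP: predict with the single highest-posterior hypothesis."""
    return max(posterior, key=posterior.get)

# Simulate data from the true bias 0.7, which has prior support.
random.seed(0)
posterior = dict(prior)
for _ in range(500):
    bit = 1 if random.random() < 0.7 else 0
    posterior = update(posterior, bit)

# When the truth has prior support, both predictors converge to it.
print(mixture_prediction(posterior))  # close to 0.7
print(map_prediction(posterior))      # the hypothesis 0.7
```

The interesting failures in the comment above arise outside this easy regime: when the prior penalizes computation time (so the truth may lose support) or when MAP is combined with such a penalty.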
But as far as I can tell, things break if you try both at once. So what I needed was a way of penalizing slow world-models while still making sure the true environment had prior support (and in particular, the possibility of the true environment running for infinitely many timesteps). Otherwise, you don’t get any sort of intelligence result.