This post comes from a theoretical perspective that may be alien to ML researchers; in particular, it makes an argument that simplicity priors do not solve the problem pointed out here, where simplicity is based on Kolmogorov complexity (which is an instantiation of the Minimum Description Length principle). The analog in machine learning would be an argument that regularization would not work.
Out of curiosity, is there an intuitive explanation as to why these are different? Is it mainly because ambitious value learning inevitably has to deal with lots of (systematic) mistakes in the data, whereas normally you’d make sure that the training data doesn’t contain (many) obvious mistakes? Or are there examples in ML where you can retroactively correct mistakes imported from a flawed training set?
(I’m not sure “training set” is the right word for the IRL context. Applied to ambitious value learning, what I mean would be the “human policy”.)
Update: Ah, it seems like the next post is all about this! :) My point about errors seems like it might be vaguely related, but the explanation in the next post feels more satisfying. It’s a different kind of problem because you’re not actually interested in predicting observable phenomena anymore, but instead are trying to infer the “latent variable” – the underlying principle(?) behind the inputs. The next post in the sequence also gives me a better sense of why people say that ML is typically “shallow” or “surface-level reasoning”.
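To make the "latent variable" point concrete for myself, here is a toy sketch under assumptions of my own: the states, reward numbers, and the two planner functions are all hypothetical, and this is an illustration rather than the construction from the post. It shows two different (planner, reward) pairs that produce exactly the same observable policy, so predicting behavior well does not pin down the reward.

```python
# Hypothetical toy example: rewards[state][action] for two states, three actions.
rewards = {"s0": [0.1, 0.9, 0.3], "s1": [0.5, 0.2, 0.4]}

# The same table with every reward negated: a very different "value system".
negated = {s: [-r for r in rs] for s, rs in rewards.items()}


def rational_planner(reward_table):
    """Pick the reward-maximizing action in each state."""
    return {s: max(range(len(rs)), key=rs.__getitem__)
            for s, rs in reward_table.items()}


def anti_rational_planner(reward_table):
    """Pick the reward-minimizing action in each state."""
    return {s: min(range(len(rs)), key=rs.__getitem__)
            for s, rs in reward_table.items()}


# Two different decompositions of the same observed behavior.
policy_a = rational_planner(rewards)
policy_b = anti_rational_planner(negated)

print(policy_a == policy_b)  # the observable policies are identical
```

Both decompositions fit the "human policy" perfectly, so no amount of extra behavioral data distinguishes them. That is different from ordinary supervised learning, where the thing being predicted is itself observable.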