Another ref to support the claims in §1 and §2: https://arxiv.org/abs/2107.12544
If you manually encode some general human biases in the form of a world-model ontology of objects, physics, agents, and goals, then RL becomes nearly as sample-efficient as human learning (in the tested Atari-style games).
The OP is about the “deep learning sample efficiency gap”. But that’s not a deep learning paper. So I don’t think it provides any evidence here.
I think it provides evidence because it implies that a lot of the much-ballyhooed human sample efficiency comes not from the learning algorithm but from the priors. If you provide informative priors, then ordinary known learning algorithms are capable of matching human performance. That is mutually reinforcing with the stylized fact that when we create a problem which disables humans’ informative priors but keeps the problem’s algorithmic difficulty fixed, their performance suddenly stops being so impressive (implying that the human learning algorithm is similar to ordinary known learning algorithms).
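To make the "priors, not the algorithm" point concrete, here is a toy sketch. It is not from the linked paper: the coin-flip task, the Beta priors, the observation sequence, and the tolerance are all assumptions chosen for illustration. The same Bayesian update rule is run twice, once with a flat prior and once with an informative prior, and we count how many observations each needs before its estimate stays close to the truth.

```python
# Toy illustration (all numbers are hypothetical): identical learning rule,
# different priors, different sample efficiency.

TRUE_BIAS = 0.8                      # assumed true probability of heads
OBSERVATIONS = [1, 1, 1, 1, 0] * 4   # fixed sequence with 80% heads (1 = heads)
TOLERANCE = 0.06                     # assumed "close enough" threshold


def posterior_mean(prior_heads, prior_tails, observations):
    """Posterior mean of a Beta(prior_heads, prior_tails) prior after Bernoulli data."""
    heads = sum(observations)
    total = prior_heads + prior_tails + len(observations)
    return (prior_heads + heads) / total


def samples_until_stable(prior_heads, prior_tails, observations, truth, tol):
    """Smallest n such that the estimate is within tol of truth for every
    prefix of length >= n. Returns len(observations) + 1 if it never settles."""
    needed = 0
    for n in range(len(observations) + 1):
        estimate = posterior_mean(prior_heads, prior_tails, observations[:n])
        if abs(estimate - truth) > tol:
            needed = n + 1  # still off at n, so at least n + 1 samples are needed
    return needed


# Flat prior Beta(1, 1): no built-in knowledge about the coin.
flat_needed = samples_until_stable(1, 1, OBSERVATIONS, TRUE_BIAS, TOLERANCE)

# Informative prior Beta(8, 2): "knowledge" that happens to match the task.
informative_needed = samples_until_stable(8, 2, OBSERVATIONS, TRUE_BIAS, TOLERANCE)

print("flat prior needs", flat_needed, "observations")
print("informative prior needs", informative_needed, "observations")
```

The update rule is identical in both runs; only the prior differs. This says nothing about Atari or deep RL specifically. It only illustrates why a gap in sample efficiency can come from the prior alone.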
Fair point. I read §1 as a more general claim than one about deep learning, so I think it supports that part directly. §2 is less clear and requires some additional argument (e.g. that LLMs approximate Bayesian inference, or something along those lines).