I’m confused, because I wasn’t that surprised when I read the paper. My take was that generative models are not-terrible at simulating agents performing some task, including tasks that require something we might call optimization. That would imply that modelling a low-fidelity RL algorithm isn’t really beyond their simulation capabilities. So independent of whether the paper actually did show models learning good RL algorithms, it feels like even if it had, I wouldn’t take that as much evidence to update my priors one way or the other.