Steven Byrnes comments on Why almost every RL agent does learned optimization

Steven Byrnes 21 Feb 2023 17:56 UTC
LW: 2 AF: 2
Thanks!
One of the justifications for my hunch is to gesture at the Bitter Lesson and to guess that a learned planning algorithm could potentially be a lot better than a planning algorithm we hard code into a system.
See Section 3 here for why I think it would be a lot worse.