E.g. TurnTrout has done a lot of self-learning from textbooks and probably has better advice [for learning RL]
I have been summoned! I’ve read a few RL textbooks… unfortunately, they’re either a) very boring, b) very old, or c) very superficial. I’ve read:
Reinforcement Learning by Sutton & Barto (my book review)
Nice book for learning the basics. Best textbook I’ve read for RL, but that’s not saying much.
Superficial, not comprehensive, somewhat outdated circa 2018; a good chunk was focused on older techniques I never/rarely read about again, like SARSA and exponential feature decay for credit assignment. The closest I remember them getting to DRL was when they discussed the challenges faced by function approximators.
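For readers who haven't met it: SARSA is the on-policy temporal-difference update that gets a lot of page count in that book. A minimal tabular sketch (the states, actions, and reward below are made up purely for illustration):

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One tabular SARSA step: nudge Q(s, a) toward r + gamma * Q(s', a')."""
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])

# Toy transition: in state 0 we took action 1, got reward 1.0,
# landed in state 1, and the policy chose action 0 there.
Q = defaultdict(float)
sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)
```

The "on-policy" part is that the bootstrap target uses the action the policy actually took next (`a_next`), rather than the max over actions as in Q-learning.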
AI: A Modern Approach 3e by Russell & Norvig (my book review)
Engaging and clear, but most of the book wasn’t about RL. Outdated, but 4e is out now and maybe it’s better.
Markov Decision Processes by Puterman
Thorough and theoretical, but formal, dry, and very boring. It was written decades ago, so obviously there's no mention of Deep RL.
Neuro-Dynamic Programming by Tsitsiklis
When I was a wee second-year grad student, I was independently recommended this book by several senior researchers. Apparently it’s a classic. It’s very dry and was written in 1996. Pass.
OpenAI’s several-page web tutorial Spinning Up with Deep RL is somehow the most useful beginning RL material I’ve seen, outside of actually taking a class. Kinda sad.
So when I ask my brain things like “how do I know about bandits?”, the result isn’t “because I read it in {textbook #23}”, but rather “because I worked on different tree search variants my first summer of grad school” or “because I took a class”. I think most of my RL knowledge has come from:
My own theoretical RL research
the fastest way for me to figure out a chunk of relevant MDP theory is often just to derive it myself
Watercooler chats with other grad students
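As an aside, the bandit setting mentioned above is small enough to sketch in full. Here is a minimal ε-greedy agent with incremental sample-average estimates (the arm means, ε, and step count are made up for illustration):

```python
import random

def epsilon_greedy_bandit(arm_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy arm selection with incremental sample-average estimates."""
    rng = random.Random(seed)
    k = len(arm_means)
    estimates = [0.0] * k   # estimated value of each arm
    counts = [0] * k        # pulls per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                            # explore
        else:
            a = max(range(k), key=lambda i: estimates[i])   # exploit
        reward = rng.gauss(arm_means[a], 1.0)               # noisy payoff
        counts[a] += 1
        estimates[a] += (reward - estimates[a]) / counts[a]  # incremental mean
    return estimates, counts

est, counts = epsilon_greedy_bandit([0.1, 0.5, 0.9])
```

With enough pulls, the best arm ends up both pulled most often and estimated most highly; this is roughly the first algorithm Sutton & Barto present.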
Sorry to say that I don’t have clear pointers to good material.
Thanks for the in-depth answer!
I share your opinion on Sutton and Barto, which is the only book from your list I've read (plus a bit of Russell and Norvig, though not the RL chapter). Notably, I spent a lot of time studying the action-value methods, only to realise later that much recent work focuses instead on policy-gradient methods (even if actor-critic methods do use action-values).
From your answer and Rohin's, I gather that we lack a good resource on Deep RL, at least of the kind useful for AI Safety researchers. It makes me even more curious about what kind of knowledge such a resource would cover.
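For contrast with the action-value methods above, the policy-gradient idea can be sketched in the simplest possible setting, a softmax "gradient bandit" in the style of Sutton & Barto's REINFORCE-like bandit algorithm (arm means, learning rate, and step count below are made up for illustration):

```python
import math
import random

def reinforce_bandit(arm_means, steps=3000, lr=0.1, seed=0):
    """Softmax policy-gradient bandit: move preferences by advantage * grad log pi."""
    rng = random.Random(seed)
    k = len(arm_means)
    prefs = [0.0] * k   # policy parameters, one preference per arm
    baseline = 0.0      # running average reward, used as a baseline
    for t in range(1, steps + 1):
        exps = [math.exp(p) for p in prefs]
        z = sum(exps)
        probs = [e / z for e in exps]                 # softmax policy
        a = rng.choices(range(k), weights=probs)[0]   # sample an arm
        r = rng.gauss(arm_means[a], 0.5)              # noisy payoff
        baseline += (r - baseline) / t
        advantage = r - baseline
        # grad of log pi(a) w.r.t. pref i is (1[i == a] - probs[i])
        for i in range(k):
            prefs[i] += lr * advantage * ((1.0 if i == a else 0.0) - probs[i])
    return prefs

prefs = reinforce_bandit([0.0, 1.0])
```

The difference in spirit: instead of estimating Q-values and acting greedily, we parameterise the policy directly and follow the gradient of expected reward; actor-critic methods then bring value estimates back in as the baseline/critic.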