That’s an excellent point. Of course one cannot introduce RL without talking about the reward signal, and I never intended to.
To me, however, the defining feature of RL is the structure of the solution space, described in this post. To you, it’s the existence of a reward signal. I’m not sure that debating this difference of opinion is the best use of our time at this point. I do hope to share my reasons in future posts, if only because they should be interesting in themselves.
As for your last point: RL is indeed a very general setting, and classical planning can easily be formulated in RL terms.
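To make that last claim a little more concrete, here is a minimal sketch of a planning problem written as an RL problem. Everything in it is an assumption for illustration: the toy corridor with five positions, the names `step`, `value_iteration` and `extract_plan`, and the choice of a reward of -1 per move with an absorbing goal. With that reward, maximizing return is the same as finding the shortest plan, and plain value iteration (standing in for whatever RL method one prefers) recovers it.

```python
from typing import Dict, List, Tuple

# Hypothetical toy planning problem: move a token along positions 0..4
# until it reaches the goal position.
GOAL = 4
STATES = list(range(5))
ACTIONS = ["left", "right"]


def step(state: int, action: str) -> Tuple[int, float]:
    """Deterministic transition model plus reward, i.e. the planning
    operators restated as an MDP. Every move costs 1; the goal is absorbing."""
    if state == GOAL:
        return state, 0.0
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(GOAL, state + 1)
    return next_state, -1.0


def value_iteration(sweeps: int = 50) -> Dict[int, float]:
    """Undiscounted value iteration; each sweep backs up every state once."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        values = {
            s: max(reward + values[nxt]
                   for nxt, reward in (step(s, a) for a in ACTIONS))
            for s in STATES
        }
    return values


def extract_plan(start: int, values: Dict[int, float]) -> List[str]:
    """Follow the greedy policy from `start`; the action sequence is the plan."""
    plan: List[str] = []
    state = start
    # The length guard only protects against unconverged values.
    while state != GOAL and len(plan) < len(STATES):
        best = max(ACTIONS,
                   key=lambda a: step(state, a)[1] + values[step(state, a)[0]])
        plan.append(best)
        state, _ = step(state, best)
    return plan


values = value_iteration()
print(extract_plan(0, values))  # ['right', 'right', 'right', 'right']
```

The value of each state comes out as minus the number of moves left to the goal, so the greedy policy is exactly the shortest plan. A real planning domain would have a far larger state space, but the translation itself (operators become transitions, the goal becomes the reward) stays the same.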