Minor naming feedback. You switched from calling something “supervised learning” to “reinforcement learning”. The first images that come to my mind when I hear “reinforcement learning” are TD-Gammon and reward signals. So, when I read “reinforcement learning”, I first think of a computer getting smarter through iterative navel-gazing, then think of a computer trying to wirehead itself, then stumble to the meaning I think you intend. I am a lay reader.