royf comments on Reinforcement Learning: A Non-Standard Introduction (Part 2)