jessicata comments on Reward/value learning for reinforcement learning

jessicata 1 Dec 2016 4:15 UTC
0 points
AF
What are the main differences from the formalism in this paper?
- Stuart_Armstrong 1 Dec 2016 9:42 UTC
  LW: 2 AF: 2
  AF
  Rewards and POMDP rather than utility and general environments.
  
  This formalism adds nothing new (it’s designed for its intended audience, but all these formalisms are pretty similar); it’s posted here only because the next posts will use it.