Stuart_Armstrong comments on Reward/value learning for reinforcement learning

Stuart_Armstrong 1 Dec 2016 9:42 UTC
Rewards and POMDPs, rather than utilities and general environments.

This formalism adds nothing new (it’s designed for its intended audience, but all these formalisms are pretty similar); it’s posted here only because the next posts will use it.