Very interesting. Naturalizing feedback (as opposed to directly accessing True Reward) seems like it could lead to a lot of desirable emergent behaviors, though I’m somewhat nervous about reliance on a handwritten model of what reliable feedback is.