TurnTrout comments on Intuitive examples of reward function learning?

TurnTrout 6 Mar 2018 17:25 UTC
3 points
I think the constraint-based problems are more intuitive. As someone who thinks about this regularly, I found that the classical examples had an abstract, alignment-theoretic texture, while the constraint-based ones seemed closer to something I’d actually be doing on a daily basis.
Which constraint-based example to use depends on the audience. If all your readers are familiar with the process of completing literature reviews, go for that; otherwise, the CEO problem seems most natural.