I'm studying the use of a discriminator in imitation learning,
because I want to find out how to help humans produce demonstrations that
the agent can imitate,
in order to help my reader understand how we might use imitation
learning to solve the reward engineering problem.
rmoehn comments on Which of these five AI alignment research projects ideas are no good?