For example, an RL agent that learns a policy that looks good to humans but isn’t. Adversarial examples that only fool a neural nets wouldn’t count.
For example, an RL agent that learns a policy that looks good to humans but isn’t. Adversarial examples that only fool a neural nets wouldn’t count.