zulupineapple comments on Biased reward-learning in CIRL