Pattern comments on [AN #64]: Using Deep RL and Reward Uncertainty to Incentivize Preference Learning