Matthew Barnett comments on What are some non-purely-sampling ways to do deep RL?