Rafael Harth comments on Benign model-free RL