William_S comments on Robustness of Model-Graded Evaluations and Automated Interpretability