Rohin Shah comments on Hypothesis: gradient descent prefers general circuits