Anomalous comments on Measuring artificial intelligence on human benchmarks is naive