lilkim2025 comments on AI benchmarking has a Y-axis problem