Aaron Staley comments on tdko’s Shortform

Aaron Staley 7 Aug 2025 21:59 UTC
5 points
1
I don’t believe there’s a strong correlation between mathematical ability and performance on agentic coding tasks (as opposed to competition coding tasks, where a stronger correlation exists).
1. Gemini 2.5 Pro was already well ahead of O3 on IMO, but had worse SWE-bench/METR scores.
2. Claude is relatively bad at math but has hovered around SOTA on agentic coding.