I think the IMO results were driven by general-purpose advances, but I agree I can’t conclusively prove it because we don’t know the details. Hopefully we will learn more as time goes by.
An informal argument: I think agentic software engineering is currently blocked on context rot, among other things. I expect the IMO systems to have improved on this, since the IMO time control is 1.5 hours per problem.
(I’m skeptical that much of the IMO improvement was due to improving how well AIs can use their context in general. This isn’t a crux for my view, but it also seems pretty likely that the AIs didn’t do more than ~100k serial tokens of reasoning on the IMO, while still aggregating over many such reasoning traces.)