aphyer comments on A brief perspective from an IMO coordinator

aphyer 24 Jul 2025 15:20 UTC
4 points
0
I looked at the Q1/4/5 answers^[1]. I think they would indeed most likely all get 7s: there’s quite a bit of verbosity, and in particular OpenAI’s Q4 answer spends a lot of time talking its way around in circles, but I believe there’s a valid proof in all of them.
Most interesting is Q1, where OpenAI produces what I think is a very human answer (the same approach I took, and the one I’d expect most human solvers to take) while Google takes a less intuitive approach but one that ends up much neater. This makes me a little bit suspicious about whether some functionally-identical problem showed up somewhere in Google’s training, but if it didn’t that is extra impressive.
1. ^
  IMO Q3 and Q6 are generally much harder: the AI didn’t solve Q6, and I haven’t gone through the Q3 answers. Q2 was a geometry one, which is weirder to look through and which I find very unpleasant.
- gjm 24 Jul 2025 17:16 UTC
  2 points
  0
  Parent
  I’m not convinced that the Google approach to Q1 actually ends up significantly neater than the more obvious one (which was mine too, and like you I’d expect most human solvers to do it that way), but I do think the lemma they prove is a very nice one!
  - aphyer 24 Jul 2025 17:43 UTC
    2 points
    0
    Parent
    I think the obvious approach is comparably neat until you get to the point of proving
    that k=2 won’t work
    at which point it’s a mess. The Google approach manages to prove that part in a much nicer way as a side effect of its general result.
    - gjm 24 Jul 2025 19:48 UTC
      2 points
      0
      Parent
      The obvious approach lets you
      reduce to the case of a size-3 triangle
      at which point
      there are WLOG only three options for your single non-sunny line, and you can pretty much draw the diagrams and say “behold!” but even writing out longhand why you can’t cover what’s left in any of those cases with two sunny lines isn’t too painful.