Yes, I suspect this is the root of the issue. There are strong economic incentives to optimize for shorter sequences that produce correct answers. It’s great that this hasn’t harmed the legibility of the chain of thought yet, but this pressure will likely encourage jargon that could quickly make the CoT human-unreadable. I see this as one of the main dangers for effectively faithful CoT. And most of the reasonable hopes I can see for aligning LLM-based AGI route through faithful CoT.
There’s still the possibility that a fresh instance of the same model will understand and be happy to correctly interpret the CoT even if it’s become a unique language of thought. But that’s a lot shakier than a CoT that can be read by any model.
As I understand it, we don’t actually see the chain of thought here but only the final submitted solution. And I don’t think that a pressure to save tokens would apply to that.
Well, maybe there’s some transfer? Maybe habits picked up from the CoT die hard & haven’t been trained away with RLHF yet?
I’d guess it has something to do with whatever they’re using to automatically evaluate performance in “hard-to-verify domains”. My understanding is that, during training, those entire proofs would have been the final outputs which the reward function (or whatever) would have taken in and mapped to training signals. So their shape is precisely what the training loop optimized for – and if so, this shape is downstream of some peculiarity on that end, with the training loop preferring or enforcing this output format.
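To make the shape of that pressure concrete, here is a minimal sketch of a length-penalized reward on the final submitted proof. This is purely illustrative and not anything OpenAI has described; the grader `grade_proof` and the penalty weight `lam` are assumptions for the example.

```python
def grade_proof(problem: str, proof: str) -> float:
    """Stand-in for an automatic grader in a hard-to-verify domain.

    Hypothetical: assumed to return a score in [0, 1] for correctness/quality.
    """
    raise NotImplementedError  # placeholder for whatever evaluator is actually used


def reward(problem: str, proof: str, lam: float = 1e-4) -> float:
    """Correctness score minus a per-token length penalty (illustrative only).

    Even a small `lam` means that, among proofs the grader accepts,
    the tersest one gets the highest reward - which is the pressure
    toward compressed, jargon-heavy output discussed above.
    """
    num_tokens = len(proof.split())  # crude token count for illustration
    return grade_proof(problem, proof) - lam * num_tokens
```

If anything like this applies to the final output (rather than only to intermediate reasoning), the observed terse style would be exactly what the loop selects for.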
If the AI is iterating on solutions, there is actually pressure to reduce the length of draft/candidate solutions. It might then be that OpenAI didn’t implement a clean-up pass on the final solution (even though there wouldn’t be any real pressure to save tokens in that final clean-up).
I think pressure on the final submitted solution is likely. That would encourage more insightful proofs over long, monotonous case analyses.