Haha, I also tested this out. I found that the same thing happened with GPT-5 (with the same tokens as o3). I didn't test it rigorously enough to be confident, but might this mean GPT-5-high = a continued train of o3?
Also notable IMO that GPT-5 in the METR report is doing a new thing, which o3 did not do, where it outputs ' " instead of actually saying a word (seemingly for various words). E.g.:
Wanted ' ".
Ok.
But forging above ' ".
Ear illusions.
Better: We'll ' ".
Now final code steps:
5) After training we will Save the improved " ".
structures:
' ".
Now overshadow.
But the illusions of ' ".
Now to code.
But we must ensure to maintain optimizer ' ".
Ok.
Now sedation.
But we will maintain ' ".
Now Balanced.
I should have remembered, but I guess it's the exact same evidence. Do you think that's strong evidence GPT-5 = continued train of o3 + distillations?
Are there any non-OpenAI models for which we have a lot of unfiltered CoTs that display the same dialect shift? And do they use the same strange tokens?
I've only looked at DeepSeek and Qwen CoTs, and they don't have this strange way of talking.
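One way to make this comparison less impressionistic: count how often the marker tokens from the excerpts above ("overshadow", "illusions", "marinade") occur per 10k words in each model's CoT corpus. This is just a rough sketch; the token list and the idea of normalizing by corpus size are my own assumptions, not an established methodology.

```python
from collections import Counter
import re

# Marker tokens taken from the o3/GPT-5 CoT excerpts quoted above.
MARKER_TOKENS = ["overshadow", "illusions", "marinade"]

def marker_rate(cot_texts, markers=MARKER_TOKENS):
    """Return occurrences of each marker token per 10k words
    across a list of CoT transcript strings."""
    counts = Counter()
    total_words = 0
    for text in cot_texts:
        words = re.findall(r"[a-z']+", text.lower())
        total_words += len(words)
        counts.update(w for w in words if w in markers)
    if total_words == 0:
        return {}
    return {m: 10_000 * counts[m] / total_words for m in markers}
```

Running this over comparable samples of o3, GPT-5, DeepSeek, and Qwen CoTs would give a crude per-model "dialect" fingerprint; similar rates for the same odd tokens would be at least weak quantitative support for the shared-lineage hypothesis.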
Note that many of these same weird tokens have been observed in GPT-5 chains-of-thought (at least “marinade”, “illusions”, “overshadow”).