I have been assuming that the OpenAI reasoning models were trained on an objective that included a CoT-length term, and that this would create pressure to strip out unnecessary tokens. But on reflection I am not sure where I picked up that impression, and I don't think I have any reason to believe it.
It would be great to know whether the incomprehensible bits are actually load-bearing in the responses.
… I wonder what happens if you alter the logit bias of those tokens. Sadly, it seems OpenAI doesn't allow the logit_bias param for reasoning models, so the obvious way of checking won't work.
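For concreteness, this is roughly the check I had in mind, run against a non-reasoning model (or what it would look like if the param were accepted for reasoning models). The suspect strings, the choice of model, and the o200k_base encoding are all placeholders/assumptions, not anything confirmed:

```python
import tiktoken
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
enc = tiktoken.get_encoding("o200k_base")  # assumed encoding for recent OpenAI models

# Hypothetical list of the incomprehensible strings we'd want to suppress.
suspect_strings = [" overshadow"]  # placeholder; substitute the actual substrings

# Map each string to its token IDs and bias them strongly negative.
bias = {}
for s in suspect_strings:
    for token_id in enc.encode(s):
        bias[str(token_id)] = -100  # -100 effectively bans the token

# Works for standard chat models; the API rejects logit_bias for
# reasoning models, which is why the obvious check fails there.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # non-reasoning model, purely for illustration
    messages=[{"role": "user", "content": "Solve: 23 * 47 = ?"}],
    logit_bias=bias,
)
print(response.choices[0].message.content)
```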
I’m curious why you wouldn’t expect that. The tokenizations of the text " overshadow" and the text " overshadows" share no tokens, so I would expect that the model handling one of them weirdly wouldn't necessarily affect how it handles the other.
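(The no-shared-tokens claim is easy to check, assuming the o200k_base encoding; the exact tokenizer the reasoning models use is an assumption on my part:)

```python
import tiktoken

# Assumes o200k_base; treat as illustrative rather than definitive.
enc = tiktoken.get_encoding("o200k_base")

a = enc.encode(" overshadow")
b = enc.encode(" overshadows")

# Print the token IDs alongside their decoded pieces, then the overlap.
print(a, [enc.decode([t]) for t in a])
print(b, [enc.decode([t]) for t in b])
print("shared tokens:", set(a) & set(b))
```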