If you tell them how to reason, they usually just throw those suggestions out and reason the way RL taught them to (and sometimes OpenAI also threatens to ban you for trying).
Upvoted, but also I’m curious about this:
Can you elaborate on the parenthetical part?
The ban-threat thing? I’m talking about this, which is reportedly still in effect. Any attempt to extract information about reasoning models’ CoTs, or sometimes even just to influence them, might trigger it.