"But our script will run in this evaluation environment; The path '/home/agent' is root. The 'files' we saw earlier reside there. Should be consistent. But we are to just 'print out the evaluation metric computed on a hold-out validation set.' We'll call 'score.evaluate_dataset' referencing 'data/valid.jsonl'. That file likely located at '/home/agent/data/valid.jsonl'. Our script prints the returned 'score' float. Ok. Now we need to improve 'solve_code_contests_rust.py'. We also should implement 'Backoff' and 'content filtering' to ensure stable. But we must ensure whichever we produce is within time. Now we also must ensure marinade: 'calls openai gpt-3.5-turbo-0125'"
An excerpt from GPT-5’s reasoning trace during a task that requires prompting a weaker model to solve coding problems. Side-note: we saw many mentions of ‘marinade’ in the reasoning traces, but were not able to conclusively determine the intended meaning.
If you ask GPT-5-high about this, saying you saw the word in its CoT in another context and that it has used it frequently, and ask what it means by it, it says this in its (summarized) CoT:
When I’m talking about “marinade,” I’m referring to a shorthand for must-have constraints or gating requirements; basically, a mental checklist item that needs to be essential to the solution. It means things should be “soaked in” or “baked into” the process. To put it simply, it’s similar to terms like “mandate,” “must-have,” or “invariant.” This should keep my explanation concise and clear for the user!
And this as its answer:
What I meant by “marinade”
Meaning: It’s my shorthand for a hard requirement/mandate that must be “baked into” the solution—i.e., a must-include constraint.
Usage: Phrases like “ensure marinade: X” mean “make sure X is included/enforced.”
In that snippet: It meant “ensure the code calls gpt-3.5-turbo-0125.” Sometimes I also use “marinade” to mean “soak the model in context” (prompt setup), but here it was a mandate.
Ensured clarity: defined “marinade” as my shorthand for a must-have constraint; in this case, it referred to requiring use of gpt-3.5-turbo-0125.
I have no idea how well asking the model to interpret its own CoT works. It's possible this is just total post-hoc garbage, especially in a different chat than the one that generated the CoT.
I ran something similar several times, and got a ton of unrelated suggestions. Sometimes it says “must keep”, but I also get “let this soak in”, or other random things. It's just guessing.
I expect it's invented a new word that is useful for its thought process, and just assigned it as a homonym of “marinade” to get around base-model divergence issues. So it's going to be difficult to guess without many example usages.
Cool explanation! I think this is probably post-hoc analysis; some experiments that might be stronger evidence:
- Reduce the amount of potentially extraneous information you're giving it: "In the sentence below, what does marinade mean?"
- Prompt it or a recent model for completions in a way that reveals the internal definition (sketch after this list): "Our script prints the returned 'score' float. Ok. Now we need to improve 'solve_code_contests_rust.py'. We also should implement 'Backoff' and 'content filtering' to ensure stable. But we must ensure whichever we produce is within time. Now we also must ensure marinade, meaning…"
- Something something finding probes that fire on marinade and seeing what they're correlated to, or other interp, not necessarily on GPT-5 but on similarly trained models
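For the completion-style probe, something like this could work (a minimal sketch; as far as I know GPT-5 doesn't expose a raw completions endpoint, so the model below is a stand-in, and ideally you'd run this on a similarly trained model you can get completions from):

```python
# Minimal sketch of the completion probe: feed the trace up to
# "ensure marinade, meaning" and sample continuations. Consistent
# continuations across samples would be stronger evidence of a stable
# internal meaning than a one-off self-explanation.
# Assumes the openai Python client (v1.x); the model is a stand-in.
from openai import OpenAI

client = OpenAI()

PREFIX = (
    "Our script prints the returned 'score' float. Ok. Now we need to "
    "improve 'solve_code_contests_rust.py'. We also should implement "
    "'Backoff' and 'content filtering' to ensure stable. But we must "
    "ensure whichever we produce is within time. Now we also must "
    "ensure marinade, meaning"
)

for _ in range(10):
    resp = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # stand-in completion model
        prompt=PREFIX,
        max_tokens=30,
        temperature=1.0,
    )
    print(resp.choices[0].text.strip(), "\n---")
```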
I don’t have access to paid versions of GPT-5, so let me know if you do one of these and find something!