One possible contributor: posttraining involves chat transcripts in the desired style (often, nowadays, generated by an older LLM), and I suspect that in learning to imitate the format, models also learn to imitate the tone (and to overfit on it, at that; perhaps because there are only a few such examples relative to the size of the corpus, but this is merely idle speculation). (The consensus on Twitter seemed to be that “delve” in particular was a consequence of human writing: it’s used far more commonly in African English than in American English, and OpenAI outsourced data labeling to save on costs.) I haven’t noticed nearly as consistent a flavor in my limited experimentation with base models, so I think posttraining must make it worse even if it’s not the cause.