There is an important distinction between base models, which are next-token predictors (and write in a very human-like manner), and the chatbots people normally use, even though both get called “LLMs”. The latter are made by taking a base model and then subjecting it to further instruction tuning and Reinforcement Learning from Human Feedback (RLHF).
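To make that distinction concrete, here's a rough sketch using the Hugging Face transformers library. The base model ("gpt2") is a real public checkpoint; the instruct checkpoint name is left as a placeholder rather than any specific commercial model.

```python
# Rough sketch of the base-model vs. chatbot distinction, assuming the
# Hugging Face transformers library is installed.
from transformers import pipeline

# A base model is pure autocomplete: it simply continues whatever text you give it.
base = pipeline("text-generation", model="gpt2")
print(base("The capital of France is", max_new_tokens=10)[0]["generated_text"])

# A chatbot is the same kind of network after instruction tuning and RLHF.
# It expects its input wrapped in a chat template (roles, turns) rather than raw text:
messages = [{"role": "user", "content": "What is the capital of France?"}]
# chat = pipeline("text-generation", model="<some-instruct-checkpoint>")  # placeholder name
# print(chat(messages, max_new_tokens=30)[0]["generated_text"])
```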
My understanding is that a lot of the stylistic quirks of the latter arise from the fact that OpenAI (and other companies), directly or through outsourcing firms, hired Third World contractors who were cheap and reasonably fluent in English. A large share of them were Nigerian, which left the models slightly biased towards Nigerian English. Nigerian English speakers are far more fond of “delve” than Western English speakers, and it's particularly rampant in their business and corporate comms, but most people wouldn't know that, because how often do they read Nigerian literature or media?
Then you have other issues stemming from RLHF itself. Certain aspects of the generic chatbot persona became overly ingrained because preference feedback was biased towards verbosity and enthusiasm. And now that the output of ChatGPT (3.5 and onwards) is all over the wider internet, new base models, to an extent, expect that that's how a chatbot talks. This likely bleeds into the chatbots built on top of those contaminated base models, since they have very strong priors that they're a chatbot in the first place.
That doesn't mean there isn't significant variance in how current models speak. Claude has a rather distinct personality and innate voice, and even among the various GPT models, o3 was notably terse and fond of dense jargon. With custom instructions, you can easily get them to talk to you in just about any style you prefer.
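A minimal sketch of that kind of steering, using the OpenAI Python SDK with a system prompt; the model name and the instruction text are just placeholders, and ChatGPT's own “custom instructions” box works in roughly the same spirit.

```python
# Minimal sketch: steering output style with a system prompt via the OpenAI SDK.
# Assumes OPENAI_API_KEY is set in the environment; model and wording are placeholders.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # This plays the same role as a "custom instructions" field.
        {"role": "system", "content": "Answer tersely, in dry plain English, no bullet points."},
        {"role": "user", "content": "Why do chatbots love the word 'delve'?"},
    ],
)
print(resp.choices[0].message.content)
```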