Eliezer’s essay “Disputing Definitions” is didactic writing, but one can also read it as a lament. He even uses the word “mournful”. He ends his essay, as I began my comment, by coining two new words, intending to head off what he calls the “Standard Dispute”. His version is tongue-in-cheek. His words are “alberzle” and “bargulum”, and there is a time machine.
His essay is excellent, but how might an essay from 2008 need updating for the LLM era? He is lamenting both what is in the training data and what is missing from it. People dispute definitions. They fail to invent new words to head off these disputes.
My claim is that many of our “Standard Disputes” have their origins in linguistic poverty. Enrich language with new words, targeting the ambiguities that we quarrel over, and the problems are solved at source. But turn to an LLM for help and it will help you to write fashionable prose. Since neologisms have never been in fashion (and are subject to mockery, see https://xkcd.com/483/), the LLM will not suggest any. Rather, it will guide you down the path of the “Standard Dispute”, leading you away from low-hanging fruit.
For a whimsical speculation, imagine that the New York Times publishes a list of one hundred new words to enrich political discussion. Inventing new words becomes all the rage. In 2026 human authors who want to join in the craze will have to invent their own. In 2027 the linguistic patterns involved will be in the training data. In 2028 egglogs (egregious neologisms) are the hallmark of AI slop. In 2029 neologisms are banned and by 2030 we are back to disputing definitions, just like we did in 2025.
I agree with the main post. My narrow point on neologisms is all that I have to add.
I see your point. At first I was thinking: I don’t think AI would have trouble differentiating between the senses of “sound” (using Eliezer’s essay as an example).
But actually it seems like you’re saying:
Suppose we live in a world before people recognized a distinction between sound (audio waves) and sound (aural sensation). In this world, AI trained on the corpus of human text would not spontaneously generate this distinction (one, it doesn’t have the knowledge, and two, it’s dissuaded from even conjecturing it, because neologisms are taboo). But we don’t even need to ‘suppose’ this world exists—we do actually live in it now, it just applies to concepts more nuanced than “sound”.
I think neologisms are interesting because on one hand, it is annoying to see terms “astroturfed” (e.g., sonder),[1] or to see an insane mismatch between a term’s sound and its meaning (e.g., “grok”, which people use to mean “to profoundly understand”, yet which sounds more like a clunky word for a caveman’s lack of understanding. Its “etymology” is quite fitting (it’s supposed to be unrelatable),[2] but it’s a shame the term caught on).
On the other hand, I think much of the pursuit of knowledge is building towards finer and finer distinctions in our experience of reality. This necessitates new words.
For whatever reason, some morphologies seem more tasteful than others, such as ‘common extensions’ (e.g., ChatGPT → ChatGPTism), or ‘combining neoclassical compounds’ (e.g., xeno- + -cide = xenocide, from Ender’s Game), or even just ‘adding standard-word qualifiers’ (e.g., your example of splitting “defensive alliance” into “chaining alliance” and “isolating alliance”). I think most of the people who find success in coining terms probably do it in these more intuitive ways, rather than with purely ‘random’ morphologies. Consider this excerpt from Nabeel Qureshi’s post, Reflections on Palantir:
One of my favorite insights from Tyler Cowen’s book ‘Talent’ is that the most talented people tend to develop their own vocabularies and memes, and these serve as entry points to a whole intellectual world constructed by that person. Tyler himself is of course a great example of this. Any MR reader can name 10+ Tylerisms instantly - ‘model this’, ‘context is that which is scarce’, ‘solve for the equilibrium’, ‘the great stagnation’ are all examples. You can find others who are great at this. Thiel is one. Elon is another (“multiplanetary species”, “preserving the light of consciousness”, etc. are all memes). Trump, Yudkowsky, gwern, SSC, Paul Graham, all of them regularly coin memes. It turns out that this is a good proxy for impact.
[1] From the Dictionary of Obscure Sorrows, whose whole project is to coin new terms for phenomena which don’t yet have names.

[2] Wikipedia: Robert A. Heinlein originally coined the term grok in his 1961 novel Stranger in a Strange Land as a Martian word that could not be defined in Earthling terms, but could be associated with various literal meanings such as “water”, “to drink”, “to relate”, “life”, or “to live”, and had a much more profound figurative meaning that is hard for terrestrial culture to understand because of its assumption of a singular reality.