The issue of meaning in large language models (LLMs)

Cross posted from New Savanna.

Noam Chomsky was trotted out to write a dubious op-ed in The New York Times about large language models and Scott Aaronson registered his displeasure at his blog, Shtetl-Optimized: The False Promise of Chomskyism. A vigorous and sometimes-to-often insightful conversation ensued. I wrote four longish comments (so far). I’m reproducing two of them below, which are about meaning in LLMs.

Meaning in LLMs (there isn’t any)
Comment #120 March 10th, 2023 at 2:26 pm

@Scott #85: Ah, that’s a relief. So:

But I think the important questions now shift, to ones like: how, exactly, does gradient descent on next-token prediction manage to converge on computational circuits that encode generative grammar, so well that GPT essentially never makes a grammatical error?

It’s not clear to me whether or not that’s important to linguistics generally, but it is certainly important for deep learning. My guess – and that’s all it is – is that if more people get working on the question, we can make good progress on answering it. It’s even possible that in, say, five years or so, people will no longer be saying LLMs are inscrutable black boxes. I’m not saying that we’ll fully understand what’s going on; only that we will understand a lot more than we do now and will be confident of making continuing progress.

Why do I believe that? I sense a stirring in the Force.

There’s that crazy ass discussion at LessWrong that Eric Saund mentioned in #113. I mean, I wish that place weren’t so darned insular and insistent on doing everything themselves, but it is what it is. I don’t know whether you’ve seen Stephen Wolfram’s long article (and accompanying video), but it has some nice visualizations of the trajectory GPT-2 takes in completing sentences. He is thinking in terms of complex dynamics – he talks of “attractors” and “attractor basins” – and seems to be thinking of getting into it himself. I found a recent dissertation in Spain that’s about the need to interpret ANNs in terms of complex dynamics, which includes a review of an older literature on the subject. I think that’s going to be part of the story.

And a strange story it is. There is a very good reason why some people say that LLMs aren’t dealing with meaning despite the fact that they produce fluent prose on all kinds of subjects. If they aren’t dealing with meaning, then how can they produce the prose?

The fact is that the materials LLMs are trained on don’t themselves have any meaning.

How could I possibly say such a silly thing? They’re trained on texts just like any other texts. Of course they have meaning.

But texts do not in fact contain meaning within themselves. If they did, you’d be able to read texts in a foreign language and understand them perfectly. No, meaning exists in the heads of people who read texts. And that’s the only place meaning exists.

Words consist of word forms, which are physical, and meanings, which are mental. Word forms take the form of sound waves, graphical objects, physical gestures, and various other forms as well. In the digital world ASCII encoding is common. I believe that for machine learning purposes we use byte-pair encoding, whatever that is. The point is, there are no meanings there, anywhere. Just some physical signal.

As a thought experiment, imagine that we transform every text string into a string of colored dots. We use a unique color for each word and are consistent across the whole collection of texts. What we have then is a bunch of one-dimensional visual objects. You can run all those colored strings through a transformer engine and end up with a model of the distribution of colored dots in dot-space. That model will be just like a language model. And can be prompted in the same way, except that you have to use strings of colored dots.

THAT’s what we have to understand.

As I’ve said, there’s no meaning in there anywhere. Just colored dots in a space of very high dimensionality.

And yet, if you replace those dots with the corresponding words...SHAZAM! You can read it. All of a sudden your brain induces meanings that were invisible when it was just strings of colored dots.
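The thought experiment can be made concrete with a toy example. What follows is only a minimal sketch, assuming a three-sentence corpus and a simple bigram counter standing in for a transformer (a large simplification). The names `corpus`, `dot_of`, `word_of`, and `bigram_counts` are made up for illustration. It shows that a model built from arbitrary "dot" labels has exactly the same statistics as one built from the words, and that swapping the words back in is a separate step that happens outside the model.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real model trains on vastly more text.
corpus = [
    "the man rode the horse",
    "the woman rode the horse",
    "the man fed the horse",
]

def bigram_counts(sequences):
    """Count how often each token is immediately followed by each other token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for left, right in zip(seq, seq[1:]):
            counts[left][right] += 1
    return counts

def most_likely_next(model, token):
    """Return the token that most often follows `token` in the model."""
    return model[token].most_common(1)[0][0]

# Word sequences, as a human reader sees them.
word_seqs = [line.split() for line in corpus]

# "Colored dots": one arbitrary but consistent label per distinct word.
vocab = sorted({w for seq in word_seqs for w in seq})
dot_of = {w: f"dot{i}" for i, w in enumerate(vocab)}
word_of = {d: w for w, d in dot_of.items()}
dot_seqs = [[dot_of[w] for w in seq] for seq in word_seqs]

word_model = bigram_counts(word_seqs)
dot_model = bigram_counts(dot_seqs)

# Prompt the dot model with a dot, then swap the word back in for the reader.
predicted_dot = most_likely_next(dot_model, dot_of["the"])
predicted_word = word_of[predicted_dot]
print(predicted_dot, "->", predicted_word)
```

The dot model never sees a word form a human could read; the lookup through `word_of` is the step where a reader's knowledge of the code comes in.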

I spend a fair amount of time thinking about that in the paper I wrote when GPT-3 came out, GPT-3: Waterloo or Rubicon? Here be Dragons, though not in those terms. The central insight comes from Sydney Lamb, a first-generation computational linguist: If you conceive of language as existing in a relational network, then the meaning of a word is a function of its position in the network. I spend a bit of time unpacking that in the paper (particularly pp. 15–19) so there’s no point trying to summarize it here.

But if you think in those terms, then something like this

king – man + woman ≈ queen

is not startling. The fact is, when I first encountered that I WAS surprised for a second or two and then I thought, yeah, that makes sense. If you had asked me whether that sort of thing was possible before I had actually seen it done, I don’t know how I would have replied. But, given how I think about these things, I might have thought it possible.
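For readers who have not seen the arithmetic done, here is a minimal sketch. The three-component vectors are invented by hand for illustration (roughly "royalty", "male", "female"); real embeddings are learned from text and have hundreds of components. The function names are hypothetical as well. As is usual in such demonstrations, the three input words are excluded from the nearest-neighbor search.

```python
import math

# Hand-made toy vectors (an assumption for illustration, not learned embeddings).
vectors = {
    "king":   (0.9, 0.9, 0.1),
    "queen":  (0.9, 0.1, 0.9),
    "man":    (0.1, 0.9, 0.1),
    "woman":  (0.1, 0.1, 0.9),
    "prince": (0.7, 0.9, 0.1),
    "apple":  (0.0, 0.1, 0.1),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("king", "man", "woman")
print(result)
```

Nothing in the code knows what a king or a queen is; the result falls out of where the word forms sit relative to one another.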

In any event, it has happened, and I’m fine with it even if I can’t offer much more than sophisticated hand-waving and tap-dancing by way of explanation. I feel the same way about ChatGPT. I can’t explain it, but it is consistent with how I have come to think about the mind and cognition. I don’t see any reason why we can’t make good progress in figuring out what LLMs are up to. We just have to put our minds to the task and do the work.

B333 asks: how does meaning get in people’s heads anyway?
Comment #135 March 10th, 2023 at 6:12 pm

@Bill Benzon 120

Ok, well if meaning isn’t in texts, but only in people’s heads, how does meaning get in people’s heads anyway? Mental events occur as physical processes in the brain, and one could well wonder how a physical process in the brain “means” or has the “content” of something external.

Language is highly patterned, and that pattern is an (imperfect) map of reality. “The man rode the horse” is a more likely sentence than “The horse rode the man” because humans actually ride horses, not vice versa. If we switched out words for colored dots those correspondences would still hold. So there is in fact an awful lot of information about reality encoded in raw text.

Meaning = intention + semanticity
Comment #152 March 11th, 2023 at 7:58 am

@B333 #135: “...how does meaning get in people’s heads anyway?” From other people’s heads in various ways, one of which is language. The key concept is in your last sentence, “encoded.” For language to work, you have to know the code. If you can neither speak nor read Mandarin, that is, if you don’t know the code, then you have no access to meanings encoded in Mandarin.

Transformer engines don’t know the code of any of the languages deployed in the texts they train on. What they do is create a proxy for meaning by locating word forms at specific positions in a high-dimensional space. Given enough dimensions, those positions encode the relationality aspect of (word) meaning.

I have come to think of meaning as consisting of an intentional component and a semantic component. The semantic component in turn consists of a relational component and an adhesion component. (I discuss those three in an appendix to the dragons paper I linked in #120.)

Take this sentence: “John is absent today.” Spoken with one intonation pattern it means just what it says. But when you use a different intonation pattern, it functions as a question. The semanticity is the same in each case. Or take this sentence: “That’s a bright idea.” With one intonation pattern it means just that. But if you use a different intonation pattern it means the idea is stupid.

Adhesion is what links a concept to the world. There are a lot of concepts about physical phenomena as apprehended by the senses. The adhesions of those concepts are thus specified by the sensory percepts. But there are a lot of concepts that are abstractly defined. You can’t see, hear, smell, taste or touch truth, beauty, love, or justice. But you can tell stories about all of them. Plato’s best-known dialog, Republic, is about justice.

And then we have salt, on the one hand, and NaCl on the other. Both are physical substances. Salt is defined by sensory impressions, with taste being the most important one. NaCl is abstractly defined in terms of a chemical theory that didn’t exist, I believe, until the 19th century. The notion of a molecule consisting of an atom of sodium and an atom of chlorine is quite abstract and took a long time and a lot of experimentation and observation to figure out. The observations had to be organized and disciplined by logic and mathematics. That’s a lot of conceptual machinery.

Note that not only are “salt” and “NaCl” defined differently, but they have different extensions in the world. NaCl is by definition a pure substance. Salt is not pure. It consists mostly of NaCl plus a variety of impurities. You pay more for salt that has just the right impurities and texture to make it artisanal.

Relationality is the relations that words have with one another. Pine, oak, maple, and palm are all kinds of trees. Trees grow and die. They can be chopped down and they can be burned. And so forth, through the whole vocabulary. These concepts have different kinds of relationships with one another – which have been well-studied in linguistics and in classical era symbolic models.

If each of those concepts is characterized by a vector with a sufficient number of components, they can be easily distinguished from one another in the vector space. And we can perform operations on them by working with vectors. Any number of techniques have been built on that insight going back to Gerard Salton’s work on document retrieval in the 1970s. Let’s say we have a collection of scientific articles. Let’s encode each abstract as a vector. One then queries the collection by issuing a natural language query which is also encoded as a vector. The query vector is then matched against the set of document vectors and the documents having the best matches are returned.
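The retrieval scheme just described can be sketched in a few lines. This is a bare-bones illustration, assuming raw word counts as the vector components and cosine similarity as the matching rule. The abstracts, document names, and function names are invented for the example, and it does not reproduce the weighting schemes used in actual retrieval systems.

```python
import math
from collections import Counter

# Hypothetical one-line "abstracts"; real collections are far larger.
abstracts = {
    "doc_a": "gradient descent training of neural language models",
    "doc_b": "sodium chloride crystal structure and chemistry",
    "doc_c": "syntax and word order in natural language",
}

def to_vector(text):
    """Encode a text as a sparse vector of word counts."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def search(query, docs, top_k=2):
    """Return up to top_k document names, best match first, dropping zero scores."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(text)), name) for name, text in docs.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]

results = search("sodium chloride chemistry", abstracts)
print(results)
```

The query and the documents are treated identically: both become vectors, and "relevance" is nothing more than closeness in the vector space.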

It turns out that if the vectors are large enough, you can produce a very convincing simulacrum of natural language. Welcome to the wonderful and potentially very useful world of contemporary LLMs.

[Caveat: from this point on I’m beginning to make this up off the top of my head. Sentence and discourse structure have been extensively studied, but I’m not attempting to do anything remotely resembling even the sketchiest of short accounts of that literature.]

Let’s go back to the idea of encoding the relational aspect of word meaning as points in a high-dimensional space. When we speak or write, we “take a walk” through that space and emit that path as a string, a one-dimensional list of tokens. The listener or reader then has to take in that one-dimensional list and map the tokens to the appropriate locations in relational semantic space. How is that possible?

Syntax is a big part of the story. The words in a sentence play different roles and so are easy to distinguish from one another. Various syntactic devices – word order, the uses of suffixes and prefixes, function words (articles and prepositions) – help us to assemble them in the right configuration so as to preserve the meaning.

Things are different above the sentence level. The proper ordering of sentences is a big part of it. If you take a perfectly coherent chunk of text and scramble the order of the sentences, it becomes unintelligible. There are more specific devices as well, such as conventions for pronominal reference.

A quantitative relationship between concepts, dimensions, and token strings

Now, it seems to me that we’d like to have a way of thinking about quantitative relationships [at this point my temperature parameter is moving higher and higher] between 1) Concepts: the number of distinct concepts in a vocabulary, 2) Dimensions: the number of dimensions in the vector space in which you embed those concepts, and 3) Token strings: the number of tokens an engine needs to train on in order to map the tokens to the proper positions (i.e. types) in the vector space so that they are distinguished from one another and stand in the proper relationships.

What do I mean by “distinct concepts” & what about Descartes’ “clear and distinct ideas”? I don’t quite know. Can the relationality of words be resolved into orthogonal dimensions in vector space? I don’t know. But Peter Gärdenfors has been working on it and I’d recommend that people working on LLMs become familiar with his work: Conceptual Spaces: The Geometry of Thought (MIT 2000), The Geometry of Meaning: Semantics Based on Conceptual Spaces (MIT 2014). If you do a search on his name you’ll come up with a bunch of more recent papers.

And of course there is more to word meaning than what you’ll find in the dictionary, which is more or less what is captured in the vector space I’ve been describing to this point. Those “core” meanings are refined, modified, and extended in discourse. That gives us the distinction between semantic and episodic knowledge (which Eric Saund mentioned in #113). The language model has to deal with that as well. That means more parameters, lots more.

I have no idea what it’s going to take to figure out those relationships. But I don’t see why we can’t make substantial progress in a couple of years. Providing, of course, that people actually work on the problem.

Addendum: What about the adhesions of abstract concepts?
Added 3.12.23

Within the semanticity component of meaning I have distinguished between adhesion and relationality: “Adhesion is what links a concept to the world” and “relationality is the relations that words have with one another.” But what about the adhesion of words that are not directly defined in relation to the physical world? Since they are defined over other words, doesn’t their adhesion reduce to relationality?

Not really. Take David Hays’s standard example: Charity is when someone does something nice for someone else without thought of reward. Any story that satisfies the terms of that definition (“when someone does...reward”) is considered an act of charity. The adhesion of the definiendum, charity, is not with any of the words, either in the definiens or in any of the stories that satisfy the definiens, but with the pattern exhibited by the words. It’s the pattern that characterizes the connection to the world, not the individual words in stories or in the defining pattern.