The world where LLMs are possible

In Artificial Addition, Eliezer used the ability to do arithmetic as a metaphor for intelligence. I really like this essay. It’s witty and enlightening. And yet I have to admit it hasn’t aged so well. Among the several confused ways of thinking about artificial addition, and by metaphor about artificial intelligence as well, he mentioned these:

  • “It’s a framing problem—what ‘twenty-one plus’ equals depends on whether it’s ‘plus three’ or ‘plus four’. If we can just get enough arithmetical facts stored to cover the common-sense truths that everyone knows, we’ll start to see real addition in the network.”

  • “But you’ll never be able to program in that many arithmetical facts by hiring experts to enter them manually. What we need is an Artificial Arithmetician that can learn the vast network of relations between numbers that humans acquire during their childhood by observing sets of apples.”

  • “No, what we really need is an Artificial Arithmetician that can understand natural language, so that instead of having to be explicitly told that twenty-one plus sixteen equals thirty-seven, it can get the knowledge by exploring the Web.”

Now we know that this approach to artificial intelligence actually works. LLMs are trained on a huge corpus of texts from the internet to learn the vast network of relations between concepts, which gives them the ability to understand natural language, and as a result they perform well across a vast array of tasks that require intelligence. Ironically, they are still not very good at calculation, though.

What can we say about it in hindsight? What mistake in reasoning led to this bad prediction? Why did past Eliezer fail to anticipate LLMs? What lesson can we learn from it?

First of all, let’s remind ourselves why past Eliezer’s reasoning made sense.

  • Understanding language is a proxy target. You can map mathematical facts to language and treat them in a roundabout way, but this is going to be less accurate then addressing them directly in the medium specifically optimized for them.

  • Knowing a collection of facts satisfying a rule isn’t the same as knowing that rule. One can get the rule from the collection of facts via induction, but this is a separate intellectual ability that you would have to build into your system (see the sketch after this list). It’s easier to figure out the rule yourself, since you already possess the ability to do induction, and then embed the rule directly.

  • Addition is plain simpler than language. If you don’t know how to make a system that can do addition, you won’t be able to make one that understands language.
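
To make the second point concrete, here is a minimal sketch (my own illustration, not something from Eliezer’s essay), assuming nothing beyond NumPy: a system that merely stores addition facts breaks outside its table, while a rule induced from those same facts generalises.

```python
# A toy sketch: memorised addition "facts" versus a rule induced from them.
import numpy as np

# The "collection of facts": every single-digit addition fact.
facts = {(a, b): a + b for a in range(10) for b in range(10)}

def add_by_lookup(a, b):
    """Knows only the stored facts; returns None outside that collection."""
    return facts.get((a, b))

def induce_rule(examples):
    """'Induction': fit add(a, b) = w1*a + w2*b + c to the facts by least squares."""
    X = np.array([[a, b, 1.0] for (a, b) in examples])
    y = np.array(list(examples.values()), dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda a, b: round(w[0] * a + w[1] * b + w[2])

add_by_rule = induce_rule(facts)

print(add_by_lookup(3, 4))    # 7    -- covered by the stored facts
print(add_by_lookup(21, 16))  # None -- no such fact was ever stored
print(add_by_rule(21, 16))    # 37   -- the induced rule generalises
```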

Now, in hindsight, I can see that this is where the metaphor breaks. Can you? I’ll let you think about it yourself for a while.

.

.

.

.

.

.

.

.

.

.

.

.

The abilities to do language, arithmetic, and induction are all parts of a vast holistic concept that we call “intelligence”. Meanwhile, language and induction are not parts of arithmetic. So, since a part is less complex than the whole, in terms of complexity we get something like this:

Arithmetic < Language < Intelligence

Arithmetic < Induction < Intelligence

And the road from language and induction to intelligence makes much more sense than the road from language and induction to arithmetic. And if all the knowledge of your civilization is encoded in language, including the rules of rationality itself, maybe this road is even one of the best.

When framed like this, and in hindsight, the mistake may look silly. It may seem as if Eliezer just used an obviously unfitting metaphor and we didn’t notice it before due to the halo effect. So the only lessons here would be the pitfalls of traductive reasoning and the dangers of trusting authority. But I don’t think that’s the case.

I suppose Eliezer thought that intelligence is simpler than a bunch of separate abilities that people put in a bundle category. Not literally as simple as arithmetic, but probably less complicated than language. That there is some core property from which all of the abilities we associate with intelligence can be derived. Some simple principle that can be expressed through the math of Bayesian reasoning. Had that been the case, the metaphor would have been completely on point.

It’s not a priori clear whether it’s easier to reduce language to intelligence or intelligence to language. We see them co-occur in nature. But which way does the causality point? It does seem that some level of intelligence is required to develop language. Are we actually carving reality along its natural joints when we categorise “language” as an element of a larger set “intelligence”?

The “intelligence to language” position isn’t unreasonable. Actually, it still may be true! I’m not claiming that we’ve received a hard proof to the contrary. But we’ve got evidence. And we need to update on it. We live in the world where LLMs are possible. Where the road from language and inductive reasoning to intelligence seems clear. So let’s investigate its premises and implications. If LLMs are possible, what else is?