Artificial Addition

Suppose that human beings had absolutely no idea how they performed arithmetic. Imagine that human beings had evolved, rather than having learned, the ability to count sheep and add sheep. People using this built-in ability have no idea how it works, the way Aristotle had no idea how his visual cortex supported his ability to see things. Peano Arithmetic as we know it has not been invented. There are philosophers working to formalize numerical intuitions, but they employ notations such as

Plus-Of(Seven, Six) = Thirteen

to formalize the intuitively obvious fact that when you add “seven” plus “six”, of course you get “thirteen”.

In this world, pocket calculators work by storing a giant lookup table of arithmetical facts, entered manually by a team of expert Artificial Arithmeticians, for starting values that range between zero and one hundred. While these calculators may be helpful in a pragmatic sense, many philosophers argue that they’re only simulating addition, rather than really adding. No machine can really count—that’s why humans have to count thirteen sheep before typing “thirteen” into the calculator. Calculators can recite back stored facts, but they can never know what the statements mean—if you type in “two hundred plus two hundred” the calculator says “Error: Outrange”, when it’s intuitively obvious, if you know what the words mean, that the answer is “four hundred”.
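
If you want the lookup-table picture spelled out, here is a toy sketch in Python; the table entries, function name, and error message are invented for illustration:

```python
# A toy "Artificial Arithmetician": a hand-entered lookup table of addition
# facts, with no ability to generate any answer it wasn't explicitly given.
# All names and entries here are illustrative inventions.

ADDITION_FACTS = {
    ("seven", "six"): "thirteen",
    ("twenty-one", "sixteen"): "thirty-seven",
    # ...thousands more facts, typed in by expert Artificial Arithmeticians,
    # covering starting values between zero and one hundred...
}

def plus_of(a: str, b: str) -> str:
    """Recite a stored fact; the device has no notion of what the labels mean."""
    return ADDITION_FACTS.get((a, b), "Error: Outrange")

print(plus_of("seven", "six"))                # thirteen
print(plus_of("two hundred", "two hundred"))  # Error: Outrange
```

Nothing in the device touches the structure of the numbers themselves; it can only play back what a human who could already add put into it.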

Philosophers, of course, are not so naive as to be taken in by these intuitions. Numbers are really a purely formal system—the label “thirty-seven” is meaningful, not because of any inherent property of the words themselves, but because the label refers to thirty-seven sheep in the external world. A number is given this referential property by its semantic network of relations to other numbers. That’s why, in computer programs, the LISP token for “thirty-seven” doesn’t need any internal structure—it’s only meaningful because of reference and relation, not some computational property of “thirty-seven” itself.

No one has ever developed an Artificial General Arithmetician, though of course there are plenty of domain-specific, narrow Artificial Arithmeticians that work on numbers between “twenty” and “thirty”, and so on. And if you look at how slow progress has been on numbers in the range of “two hundred”, then it becomes clear that we’re not going to get Artificial General Arithmetic any time soon. The best experts in the field estimate it will be at least a hundred years before calculators can add as well as a human twelve-year-old.

But not everyone agrees with this estimate, or with merely conventional beliefs about Artificial Arithmetic. It’s common to hear statements such as the following:

  • “It’s a framing problem—what ‘twenty-one plus’ equals depends on whether it’s ‘plus three’ or ‘plus four’. If we can just get enough arithmetical facts stored to cover the common-sense truths that everyone knows, we’ll start to see real addition in the network.”

  • “But you’ll never be able to program in that many arithmetical facts by hiring experts to enter them manually. What we need is an Artificial Arithmetician that can learn the vast network of relations between numbers that humans acquire during their childhood by observing sets of apples.”

  • “No, what we really need is an Artificial Arithmetician that can understand natural language, so that instead of having to be explicitly told that twenty-one plus sixteen equals thirty-seven, it can get the knowledge by exploring the Web.”

  • “Frankly, it seems to me that you’re just trying to convince yourselves that you can solve the problem. None of you really know what arithmetic is, so you’re floundering around with these generic sorts of arguments. ‘We need an AA that can learn X’, ‘We need an AA that can extract X from the Internet’. I mean, it sounds good, it sounds like you’re making progress, and it’s even good for public relations, because everyone thinks they understand the proposed solution—but it doesn’t really get you any closer to general addition, as opposed to domain-specific addition. Probably we will never know the fundamental nature of arithmetic. The problem is just too hard for humans to solve.”

  • “That’s why we need to develop a general arithmetician the same way Nature did—evolution.”

  • “Top-down approaches have clearly failed to produce arithmetic. We need a bottom-up approach, some way to make arithmetic emerge. We have to acknowledge the basic unpredictability of complex systems.”

  • “You’re all wrong. Past efforts to create machine arithmetic were futile from the start, because they just didn’t have enough computing power. If you look at how many trillions of synapses there are in the human brain, it’s clear that calculators don’t have lookup tables anywhere near that large. We need calculators as powerful as a human brain. According to Moore’s Law, this will occur in the year 2031 on April 27 between 4:00 and 4:30 in the morning.”

  • “I believe that machine arithmetic will be developed when researchers scan each neuron of a complete human brain into a computer, so that we can simulate the biological circuitry that performs addition in humans.”

  • “I don’t think we have to wait to scan a whole brain. Neural networks are just like the human brain, and you can train them to do things without knowing how they do them. We’ll create programs that will do arithmetic without us, their creators, ever understanding how they do arithmetic.”

  • “But Gödel’s Theorem shows that no formal system can ever capture the basic properties of arithmetic. Classical physics is formalizable, so to add two and two, the brain must take advantage of quantum physics.”

  • “Hey, if human arithmetic were simple enough that we could reproduce it in a computer, we wouldn’t be able to count high enough to build computers.”

  • “Haven’t you heard of John Searle’s Chinese Calculator Experiment? Even if you did have a huge set of rules that would let you add ‘twenty-one’ and ‘sixteen’, just imagine translating all the words into Chinese, and you can see that there’s no genuine addition going on. There are no real numbers anywhere in the system, just labels that humans use for numbers...”

There is more than one moral to this parable, and I have told it with different morals in different contexts. It illustrates the idea of levels of organization, for example—a CPU can add two large numbers because the numbers aren’t black-box opaque objects; they’re ordered structures of 32 bits.
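
To make that concrete, here is a hedged sketch of addition done the way hardware does it, by rippling a carry through the bits of a 32-bit word rather than by looking up opaque labels (a toy model, not a description of any particular CPU):

```python
def add_32bit(x: int, y: int) -> int:
    """Add two numbers by rippling a carry through their 32 bits.

    Works for any inputs because it exploits the internal structure of
    the numbers, rather than reciting a stored fact about those labels.
    """
    result, carry = 0, 0
    for i in range(32):
        a = (x >> i) & 1          # i-th bit of x
        b = (y >> i) & 1          # i-th bit of y
        s = a ^ b ^ carry         # sum bit
        carry = (a & b) | (a & carry) | (b & carry)  # carry out
        result |= s << i
    return result                 # wraps modulo 2**32, like a 32-bit register

assert add_32bit(21, 16) == 37
assert add_32bit(200, 200) == 400   # no "Error: Outrange" here
```

The procedure handles numbers it has never seen precisely because it works on their internal structure, which is the whole difference between this and the lookup table.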

But for purposes of overcoming bias, let us draw two morals:

  • First, the danger of believing assertions you can’t regenerate from your own knowledge.

  • Second, the danger of trying to dance around basic confusions.

Lest anyone accuse me of generalizing from fictional evidence, both lessons may be drawn from the real history of Artificial Intelligence as well.

The first danger is the object-level problem that the AA devices ran into: they functioned as tape recorders playing back “knowledge” generated from outside the system, using a process they couldn’t capture internally. A human could tell the AA device that “twenty-one plus sixteen equals thirty-seven”, and the AA devices could record this sentence and play it back, or even pattern-match “twenty-one plus sixteen” to output “thirty-seven!”, but the AA devices couldn’t generate such knowledge for themselves.

Which is strongly reminiscent of believing a physicist who tells you “Light is waves”, recording the fascinating words and playing them back when someone asks “What is light made of?”, without being able to generate the knowledge for yourself. More on this theme tomorrow.

The second moral is the meta-level danger that consumed the Artificial Arithmetic researchers and opinionated bystanders—the danger of dancing around confusing gaps in your knowledge. The tendency to do just about anything except grit your teeth and buckle down and fill in the damn gap.

Whether you say, “It is emergent!”, or whether you say, “It is unknowable!”, in neither case are you acknowledging that there is a basic insight required which is possessable, but unpossessed by you.

You can’t know in advance when you’ll have a new basic insight, and there’s no way to get one except by banging your head against the problem, learning everything you can about it, studying it from as many angles as possible, perhaps for years. It’s not a pursuit that academia is set up to permit, when you need to publish at least one paper per month. It’s certainly not something that venture capitalists will fund. So the temptation is to either go ahead and build the system now, or give up and do something else instead.

Look at the comments above: none are aimed at setting out on a quest for the missing insight which would make numbers no longer mysterious, make “twenty-seven” more than a black box. None of the commenters realized that their difficulties arose from ignorance or confusion in their own minds, rather than an inherent property of arithmetic. They were not trying to achieve a state where the confusing thing ceased to be confusing.

If you read Judea Pearl’s “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference” then you will see that the basic insight behind graphical models is indispensable to problems that require it. (It’s not something that fits on a T-Shirt, I’m afraid, so you’ll have to go and read the book yourself. I haven’t seen any online popularizations of Bayesian networks that adequately convey the reasons behind the principles, or the importance of the math being exactly the way it is, but Pearl’s book is wonderful.) There were once dozens of “non-monotonic logics” awkwardly trying to capture intuitions such as “If my burglar alarm goes off, there was probably a burglar, but if I then learn that there was a small earthquake near my home, there was probably not a burglar.” With the graphical-model insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop. Until you have that insight, you’ll go on patching the logic here, patching it there, adding more and more hacks to force it into correspondence with everything that seems “obviously true”.
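
For a worked miniature of that burglar-alarm example (all probabilities invented for illustration), here is the three-node network evaluated by brute-force enumeration; the non-monotonic “but if I then learn there was an earthquake...” reversal just falls out of the arithmetic:

```python
# Tiny burglar/earthquake/alarm network, evaluated by enumerating the joint
# distribution. All numbers are made up for illustration.

P_BURGLAR = 0.01
P_QUAKE = 0.001

def p_alarm(burglar: bool, quake: bool) -> float:
    """Assumed conditional probability that the alarm goes off."""
    if burglar and quake:
        return 0.99
    if burglar:
        return 0.95
    if quake:
        return 0.30
    return 0.001

def p_burglar_given(alarm=True, quake=None) -> float:
    """P(burglar | alarm, and optionally an observed earthquake)."""
    num = den = 0.0
    for b in (True, False):
        for q in (True, False):
            if quake is not None and q != quake:
                continue  # inconsistent with the observation
            joint = (P_BURGLAR if b else 1 - P_BURGLAR) \
                  * (P_QUAKE if q else 1 - P_QUAKE) \
                  * (p_alarm(b, q) if alarm else 1 - p_alarm(b, q))
            den += joint
            if b:
                num += joint
    return num / den

print(p_burglar_given(alarm=True))              # ~0.88: probably a burglar
print(p_burglar_given(alarm=True, quake=True))  # ~0.03: probably not a burglar
```

With the alarm alone, the posterior probability of a burglar is high; once the earthquake is observed, it collapses, because the earthquake explains away the alarm. No special-case patch is needed to get that behavior.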

You won’t know the Artificial Arithmetic problem is unsolvable without its key. If you don’t know the rules, you don’t know the rule that says you need to know the rules to do anything. And so there will be all sorts of clever ideas that seem like they might work, like building an Artificial Arithmetician that can read natural language and download millions of arithmetical assertions from the Internet.

And yet somehow the clever ideas never work. Somehow it always turns out that you “couldn’t see any reason it wouldn’t work” because you were ignorant of the obstacles, not because no obstacles existed. Like shooting blindfolded at a distant target—you can fire blind shot after blind shot, crying, “You can’t prove to me that I won’t hit the center!” But until you take off the blindfold, you’re not even in the aiming game. When “no one can prove to you” that your precious idea isn’t right, it means you don’t have enough information to strike a small target in a vast answer space. Until you know your idea will work, it won’t.

From the history of previous key insights in Artificial Intelligence, and the grand messes which were proposed prior to those insights, I derive an important real-life lesson: When the basic problem is your ignorance, clever strategies for bypassing your ignorance lead to shooting yourself in the foot.