A Premature Word on AI

Followup to: A.I. Old-Timers, Do Scientists Already Know This Stuff?

In response to Robin Hanson’s post on the disillusionment of old-time AI researchers such as Roger Schank, I thought I’d post a few premature words on AI, even though I’m not really ready to do so.

Anyway:

I never expected AI to be easy. I went into the AI field because I thought it was world-crackingly important, and I was willing to work on it if it took the rest of my whole life, even though it looked incredibly difficult.

I’ve noticed that folks who actively work on Artificial General Intelligence seem to have started out thinking the problem was much easier than it first appeared to me.

In retrospect, if I had not thought that the AGI problem was worth a hundred and fifty thousand human lives per day—that’s what I thought in the beginning—then I would not have challenged it; I would have run away and hid like a scared rabbit. Everything I now know about how to not panic in the face of difficult problems, I learned from tackling AGI, and later, the superproblem of Friendly AI, because running away wasn’t an option.

Try telling one of these AGI folks about Friendly AI, and they reel back, surprised, and immediately say, “But that would be too difficult!” In short, they have the same run-away reflex as anyone else, but AGI has not activated it. (FAI does.)

Roger Schank is not necessarily in this class, please note. Most of the people currently wandering around in the AGI Dungeon are those too blind to see the warning signs, the skulls on spikes, the flaming pits. But e.g. John McCarthy is a warrior of a different sort; he ventured into the AI Dungeon before it was known to be difficult. I find that in terms of raw formidability, the warriors who first stumbled across the Dungeon, impress me rather more than most of the modern explorers—the first explorers were not self-selected for folly. But alas, their weapons tend to be extremely obsolete.

There are many ways to run away from difficult problems. Some of them are exceedingly subtle.

What makes a problem seem impossible? That no avenue of success is available to mind. What makes a problem seem scary? That you don’t know what to do next.

Let’s say that the problem of creating a general intelligence seems scary, because you have no idea how to do it. You could run away by working on chess-playing programs instead. Or you could run away by saying, “All past AI projects failed due to lack of computing power.” Then you don’t have to face the unpleasant prospect of staring at a blank piece of paper until drops of blood form on your forehead—the best description I’ve ever heard of the process of searching for core insight. You have avoided placing yourself into a condition where your daily work may consist of not knowing what to do next.

But “Computing power!” is a mysterious answer to a mysterious question. Even after you come to believe that all past AI projects failed “due to lack of computing power”, intelligence is no less mysterious. “What do you mean?” you say indignantly, “I have a perfectly good explanation for intelligence: it emerges from lots of computing power! Or knowledge! Or complexity!” And this is a subtle issue to which I must probably devote more posts. But contrast this with the rush of insight into details and specifics that follows from learning about, say, Pearlian causality, and you may realize that “Computing power causes intelligence” does not constrain detailed anticipation of phenomena even in retrospect.

People are not systematically taught what to do when they’re scared; everyone’s got to work it out on their own. And so the vast majority stumble into simple traps like mysterious answers or affective death spirals. I too stumbled, but I managed to recover and get out alive; and realized what it was that I’d learned; and then I went back into the Dungeon, because I had something to protect.

I’ve recently discussed how scientists are not taught to handle chaos, so I’m emphasizing that aspect in this particular post, as opposed to a dozen other aspects… If you want to appreciate the inferential distances here, think of how odd all this would sound without the Einstein sequence. Then think of how odd the Einstein sequence would have sounded without the many-worlds sequence… There’s plenty more where that came from.

What does progress in AGI/FAI look like, if not bigger and faster computers?

It looks like taking down the real barrier, the scary barrier, the one where you have to sweat blood: understanding things that seem mysterious, and not by declaring that they’re “emergent” or “complex”, either.

If you don’t understand the family of Cox’s Theorems and the Dutch Book argument, you can go round and round with “certainty factors” and “fuzzy logics” that seem sorta appealing, but that can never quite be made to work right. Once you understand the structure of probability—not just probability as an explicit tool, but as a forced implicit structure in cognitive engines—even if the structure is only approximate—then you begin to actually understand what you’re doing; you are not just trying things that seem like good ideas. You have achieved core insight. You are not even limited to floating-point numbers between 0 and 1 to represent probability; you have seen through to structure, and can use log odds or smoke signals if you wish.
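
As a minimal sketch of what “seeing through to structure” buys you: in odds form, Bayes’s rule multiplies the prior odds by a likelihood ratio, so in log-odds space the same update is just addition. The helper functions and numbers below are purely illustrative, not anything from the original discussion.

```python
import math

def prob_to_log_odds(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))

def log_odds_to_prob(l):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-l))

# Illustrative numbers: a 20% prior, and evidence that is
# 3 times as likely if the hypothesis is true.
prior = 0.2
likelihood_ratio = 3.0

# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
posterior_odds = (prior / (1 - prior)) * likelihood_ratio
posterior_via_odds = posterior_odds / (1 + posterior_odds)

# The identical update done as simple addition in log-odds space.
posterior_via_log_odds = log_odds_to_prob(
    prob_to_log_odds(prior) + math.log(likelihood_ratio)
)

print(posterior_via_odds, posterior_via_log_odds)  # both ~0.4286
```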

If you don’t understand graphical models of conditional independence, you can go round and round inventing new “default logics” and “defeasible logics” that get more and more complicated as you try to incorporate an infinite number of special cases. If you know the graphical structure, and why the graphical model works, and the regularity of the environment that it exploits, and why it is efficient as well as correct, then you really understand the problem; you are not limited to explicit Bayesian networks, you just know that you have to exploit a certain kind of mathematical regularity in the environment.
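
As a minimal sketch of the kind of regularity a graphical model exploits: conditional independence lets you write a joint distribution as a product of small local factors, one per variable given its parents, rather than one giant table over every combination of variables. The toy rain/sprinkler/wet-grass network below is illustrative only; all the numbers are invented.

```python
# A toy Bayesian network: Rain -> Sprinkler, and both -> WetGrass.
# Conditional independence lets the joint over three binary variables
# be written as P(R) * P(S | R) * P(W | R, S) instead of a full
# 8-entry joint table. All numbers here are invented for illustration.

P_rain = {True: 0.2, False: 0.8}

P_sprinkler_given_rain = {
    True: {True: 0.01, False: 0.99},   # sprinkler rarely runs if it rains
    False: {True: 0.40, False: 0.60},
}

P_wet_given_rain_sprinkler = {
    (True, True): {True: 0.99, False: 0.01},
    (True, False): {True: 0.80, False: 0.20},
    (False, True): {True: 0.90, False: 0.10},
    (False, False): {True: 0.00, False: 1.00},
}

def joint(rain, sprinkler, wet):
    """P(R=rain, S=sprinkler, W=wet) computed from the factored form."""
    return (P_rain[rain]
            * P_sprinkler_given_rain[rain][sprinkler]
            * P_wet_given_rain_sprinkler[(rain, sprinkler)][wet])

# P(Rain | WetGrass) by brute-force enumeration over the factored joint.
p_wet = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
p_rain_and_wet = sum(joint(True, s, True) for s in (True, False))
print(p_rain_and_wet / p_wet)  # ~0.36 with these made-up numbers
```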

Unfortunately, these two insights—Bayesian probability and Pearlian causality—are far from sufficient to solve general AI problems. If you try to do anything with these two theories that requires an additional key insight you do not yet possess, you will fail just like any other AGI project, and build something that grows more and more complicated and patchworky but never quite seems to work the way you hoped.

These two insights are examples of what “progress in AI” looks like.

Most people who say they intend to tackle AGI do not understand Bayes or Pearl. Most of the people in the AI Dungeon are there because they think they found the Sword of Truth in an old well, or, even worse, because they don’t realize the problem is difficult. They are not polymaths; they are not making a convulsive desperate effort to solve the unsolvable. They are optimists who have their Great Idea that is the best idea ever even though they can’t say exactly how it will produce intelligence, and they want to do the scientific thing and test their hypothesis. If they hadn’t started out thinking they already had the Great Idea, they would have run away from the Dungeon; but this does not give them much of a motive to search for other master keys, even the ones already found.

Looking for an “additional insight you don’t already have” is not something the academic field of AI is set up to do. As a strategy, it does not result in reliable success (defined as reliable publication). As a strategy, it requires additional study and large expenditures of time. It ultimately amounts to “try to be Judea Pearl or Laplace”, and that is not something professors have been reliably taught to teach undergraduates, even though it is often what a field in a state of scientific chaos needs.

John McCarthy said quite well what Artificial Intelligence needs: 1.7 Einsteins, 2 Maxwells, 5 Faradays and .3 Manhattan Projects. From this I am forced to subtract the “Manhattan project”, because security considerations of FAI prohibit using that many people; but I doubt it’ll take more than another 1.5 Maxwells and 0.2 Faradays to make up for it.

But, as said, the field of AI is not set up to support this—it is set up to support explorations with reliable payoffs.

You would think that there would be genuinely formidable people going into the Dungeon of Generality, nonetheless, because they wanted to test their skills against true scientific chaos. Even if they hadn’t yet realized that their little sister is down there. Well, that sounds very attractive in principle, but I guess it sounds a lot less attractive when you have to pay the rent. Or they’re all off doing string theory, because AI is well-known to be impossible, not the sort of chaos that looks promising—why, it’s genuinely scary! You might not succeed, if you went in there!

But I digress. This began as a response to Robin Hanson’s post “A.I. Old-Timers”, and Roger Schank’s very different idea of what future AI progress will look like.

Okay, let’s take a look at Roger Schank’s argument:

I have not soured on AI. I still believe that we can create very intelligent machines. But I no longer believe that those machines will be like us… What AI can and should build are intelligent special purpose entities. (We can call them Specialized Intelligences or SI’s.) Smart computers will indeed be created. But they will arrive in the form of SI’s, ones that make lousy companions but know every shipping accident that ever happened and why (the shipping industry’s SI) or as an expert on sales (a business world SI.)

I ask the fundamental question of rationality: Why do you believe what you believe?

Schank would seem to be talking as if he knows something about the course of future AI research—research that hasn’t happened yet. What is it that he thinks he knows? How does he think he knows it?

As John McCarthy said: “Your statements amount to saying that if AI is possible, it should be easy. Why is that?”

There is a master strength behind all human arts: Human intelligence can, without additional adaptation, create the special-purpose systems of a skyscraper, a gun, a space shuttle, a nuclear weapon, a DNA synthesizer, a high-speed computer...

If none of what the human brain does is magic, the combined trick of it can be recreated in purer form.

If this can be done, someone will do it. The fact that shipping-inventory programs can also be built does not mean that it is sensible to talk about a future in which people build only shipping-inventory programs, if it is also possible to build something of human+ power. In a world where both events occur, the course of history is dominated by the latter.

So what is it that Roger Schank learned, as Bayesian evidence, that confirms some specific hypothesis over its alternatives and reveals to him the future course of AI research? And what, exactly, is the hypothesis: that AI will not succeed in creating anything of general capability?

It would seem rather difficult to predict the future course of research you have not yet done. Wouldn’t Schank have to know the best solution in order to know the minimum time the best solution would take?

Of course I don’t think Schank is actually doing a Bayesian update here. I think Roger Schank gives the game away when he says:

When reporters interviewed me in the 70’s and 80’s about the possibilities for Artificial Intelligence I would always say that we would have machines that are as smart as we are within my lifetime. It seemed a safe answer since no one could ever tell me I was wrong.

There is careful futurism, where you try to consider all the biases you know, and separate your analysis into logical parts, and put confidence intervals around things, and use wider confidence intervals where you have less constraining knowledge, and all that other stuff rationalists do. Then there is sloppy futurism, where you just make something up that sounds neat. This sounds like sloppy futurism to me.

So, basically, Schank made a fantastic amazing futuristic prediction about machines “as smart as we are” “within my lifetime”—two phrases that themselves reveal some shaky assumptions.

Then Schank got all sad and disappointed because he wasn’t making progress as fast as he hoped.

So Schank made a different futuristic prediction, about special-purpose AIs that will answer your questions about shipping disasters. It wasn’t quite as shiny and futuristic, but it matched his new saddened mood, and it gave him something to say to reporters when they asked him where AI would be in 2050.

This is how the vast majority of futurism is done. So until I have reason to believe there is something more to Schank’s analysis than this, I don’t feel very guilty about disagreeing with him when I make “predictions” like:

If you don’t know much about a problem, you should widen your confidence intervals in both directions. AI seems very hard because you don’t know how to do it, but translating that feeling into a confident prediction of a very long time interval would express your ignorance as if it were positive knowledge. The less you know, the broader your confidence interval should be, in both directions.

Or:

You don’t know what theoretical insights will be required for AI, or you would already have them. Theoretical breakthroughs can happen without advance warning (the warning is perceived in retrospect, of course, but not in advance); and they can be arbitrarily large. We can say that it is difficult to build a star from hydrogen atoms in the obvious way only because we understand how stars work, and so we know that the work required is a huge amount of drudgery; we have no comparable understanding of intelligence to license such a confident verdict about AI.

Or:

Looking at the anthropological trajectory of hominids seems to strongly contradict the assertion that exponentially increasing amounts of processing power or programming time are required to produce intelligence in the vicinity of the human level, even when using an evolutionary algorithm that runs on blind mutation, random recombination, and selection with zero foresight.

But if I don’t want this post to go on forever, I had better stop it here. See this paper, however.