Building Something Smarter

Previously in series: Efficient Cross-Domain Optimization

Once you demystify “intelligence” far enough to think of it as searching possible chains of causality, across learned domains, in order to find actions leading to a future ranked high in a preference ordering...

...then it no longer sounds quite as strange to think of building something “smarter” than yourself.

There’s a popular conception of AI as a tape-recorder-of-thought, which only plays back knowledge given to it by the programmers—I deconstructed this in Artificial Addition, giving the example of the machine that stores the expert knowledge Plus-Of(Seven, Six) = Thirteen instead of having a CPU that does binary arithmetic.
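To make the contrast concrete, here is a minimal sketch in Python (the function names are illustrative, not from the original Artificial Addition post): a playback machine that stores answers, next to a generator that computes them.

```python
# "Recorded" arithmetic versus a generator of arithmetic.
# The names here are illustrative, not from the original post.

RECORDED = {("Seven", "Six"): "Thirteen"}   # frozen expert knowledge

def plus_of(a: str, b: str) -> str:
    # Plays back a stored answer; fails on any sum the programmers
    # never thought to write down.
    return RECORDED[(a, b)]

def add(a: int, b: int) -> int:
    # A generator: binary addition via XOR-and-carry, so it answers
    # sums nobody ever explicitly stored.
    while b:
        a, b = a ^ b, (a & b) << 1
    return a

print(plus_of("Seven", "Six"))   # "Thirteen" -- and nothing else
print(add(7, 6))                 # 13
print(add(12345, 67890))         # 80235, never recorded anywhere
```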

There are multiple sources supporting this misconception:

The stereotype “intelligence as book smarts”, where you memorize disconnected “facts” in class and repeat them back.

The idea that “machines do only what they are told to do”, which confuses the idea of a system whose abstract laws you designed, with your exerting moment-by-moment detailed control over the system’s output.

And various reductionist confusions—a computer is “mere transistors” or “only remixes what’s already there” (just as Shakespeare merely regurgitated what his teachers taught him: the alphabet of English letters—all his plays are merely that).

Since the workings of human intelligence are still to some extent unknown, and will seem very mysterious indeed to one who has not studied much cognitive science, it will seem impossible for such a person to imagine that a machine could contain the generators of knowledge.

The knowledge-generators and behavior-generators are black boxes, or even invisible background frameworks. So tasking the imagination to visualize “Artificial Intelligence” only shows specific answers, specific beliefs, specific behaviors, impressed into a “machine” like being stamped into clay. The frozen outputs of human intelligence, divorced from their generator and not capable of change or improvement.

You can’t build Deep Blue by programming a good chess move for every possible position. First and foremost, you don’t know exactly which chess positions the AI will encounter. You would have to record a specific move for zillions of positions, more than you could consider in a lifetime with your slow neurons.

But worse, even if you could record and play back “good moves”, the resulting program would not play chess any better than you do. That is the peril of recording and playing back surface phenomena, rather than capturing the underlying generator.

If I want to create an AI that plays better chess than I do, I have to program a search for winning moves. I can’t program in specific moves because then the chess player really won’t be any better than I am. And indeed, this holds true on any level where an answer has to meet a sufficiently high standard. If you want any answer better than you could come up with yourself, you necessarily sacrifice your ability to predict the exact answer in advance—though not necessarily your ability to predict that the answer will be “good” according to a known criterion of goodness. “We never run a computer program unless we know an important fact about the output, and we don’t know the output,” said Marcello Herreshoff.
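As a sketch of what “program a search for winning moves” means, here is a toy negamax search over a deliberately tiny game (a Nim variant chosen for brevity; Deep Blue’s actual engine used alpha-beta search with a handcrafted evaluation function on special-purpose hardware):

```python
# Toy illustration of a move *generator*: value positions by
# searching the tree of possible futures, rather than looking moves
# up in a stored table. Game: a pile of stones, take 1 or 2 per
# turn, whoever takes the last stone wins.

def negamax(pile: int) -> int:
    # Value of the position for the player to move: +1 win, -1 loss.
    if pile == 0:
        return -1  # the opponent took the last stone; we have lost
    return max(-negamax(pile - take) for take in (1, 2) if take <= pile)

def best_move(pile: int) -> int:
    # Pick the move that steers toward the highest-valued future.
    return max((t for t in (1, 2) if t <= pile),
               key=lambda t: -negamax(pile - t))

print(best_move(5))  # -> 2: leave a multiple of 3, a lost position
```

The program’s strength comes from the search, not from recorded answers: nothing in the code says which move to make from any particular position.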

Deep Blue played chess barely better than the world’s top humans, but a heck of a lot better than its own programmers. Deep Blue’s programmers had to know the rules of chess—since Deep Blue wasn’t enough of a general AI to learn the rules by observation—but the programmers didn’t play chess anywhere near as well as Kasparov, let alone Deep Blue.

Deep Blue’s programmers didn’t just capture their own chess-move generator. If they’d captured their own chess-move generator, they could have avoided the problem of programming an infinite number of chess positions. But they couldn’t have beaten Kasparov; they couldn’t have built a program that played better chess than any human in the world.

The programmers built a better move generator—one that more powerfully steered the game toward the target of winning game positions. Deep Blue’s programmers surely had some slight ability to find chess moves that aimed at this same target, but their steering ability was much weaker than Deep Blue’s.

It is futile to protest that this is “paradoxical”, since it actually happened.

Equally “paradoxical”, but true, is that Garry Kasparov was not born with a complete library of chess moves programmed into his DNA. Kasparov invented his own moves; he was not explicitly preprogrammed by evolution to make particular moves—though natural selection did build a brain that could learn. And Deep Blue’s programmers invented Deep Blue’s code without evolution explicitly encoding Deep Blue’s code into their genes.

Steam shovels lift more weight than humans can heft, skyscrapers are taller than their human builders, humans play better chess than natural selection, and computer programs play better chess than humans. The creation can exceed the creator. It’s just a fact.

If you can understand steering-the-future, hitting-a-narrow-target as the work performed by intelligence—then, even without knowing exactly how the work gets done, it should become more imaginable that you could build something smarter than yourself.

By building something and then testing it? So that we can see that a design reaches the target faster or more reliably than our own moves, even if we don’t understand how? But that’s not how Deep Blue was actually built. You may recall the principle that just formulating a good hypothesis to test usually requires far more evidence than the final test that ‘verifies’ it—that Einstein, in order to invent General Relativity, must have already had in hand enough evidence to isolate that one hypothesis as worth testing. Analogously, we can see that nearly all of the optimization power of human engineering must have already been exerted in coming up with good designs to test. The final selection on the basis of good results is only the icing on the cake. If you test four designs that seem like good ideas, and one of them works best, then at most 2 bits of optimization pressure can come from testing—the rest of it must be the abstract thought of the engineer.
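The “2 bits” is just the information supplied by selection: picking the best of N tested candidates contributes at most log2 N bits of optimization pressure. A minimal illustration (the one-in-a-billion figure below is an invented example, not from the text):

```python
import math

def selection_bits(n_candidates: int) -> float:
    # Upper bound on the optimization power, in bits, contributed by
    # testing n_candidates designs and keeping the best one.
    return math.log2(n_candidates)

print(selection_bits(4))   # 2.0 -- testing four designs buys 2 bits
# If a workable design were a one-in-a-billion point in design space
# (about 30 bits to specify), the remaining ~28 bits would have to
# come from the engineer's thinking before anything reached a test.
print(math.log2(1e9))      # ~29.9 bits needed in total
```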

There are those who will see it as almost a religious principle that no one can possibly know that a design will work, no matter how good the argument, until it is actually tested. Just like the belief that no one can possibly accept a scientific theory until it is tested. But this is ultimately more of an injunction against human stupidity and overlooked flaws and optimism and self-deception and the like—so far as theoretical possibility goes, it is clearly possible to get a pretty damn good idea of which designs will work in advance of testing them.

And to say that humans are necessarily at least as good at chess as Deep Blue, since they built Deep Blue? Well, it’s an important fact that we built Deep Blue, but the claim is still a nitwit sophistry. You might as well say that proteins are as smart as humans, that natural selection reacts as fast as humans, or that the laws of physics play good chess.

If you carve up the universe along its joints, you will find that there are certain things, like butterflies and humans, that bear the very identifiable design signature and limitations of evolution; and certain other things, like nuclear power plants and computers, that bear the signature and the empirical design level of human intelligence. To describe the universe well, you will have to distinguish these signatures from each other, and have separate names for “human intelligence”, “evolution”, “proteins”, and “protons”, because even if these things are related they are not at all the same.