The Magnitude of His Own Folly

In the years before I met that would-be creator of Artificial General Intelligence (with a funded project) who happened to be a creationist, I would still try to argue with individual AGI wannabes.

In those days, I sort-of-succeeded in convincing one such fellow that, yes, you had to take Friendly AI into account, and no, you couldn’t just find the right fitness metric for an evolutionary algorithm. (Previously he had been very impressed with evolutionary algorithms.)

And the one said: Oh, woe! Oh, alas! What a fool I’ve been! Through my carelessness, I almost destroyed the world! What a villain I once was!

Now, there’s a trap I knew better than to fall into—

—at the point where, in late 2002, I looked back to Eliezer1997’s AI proposals and realized what they really would have done, insofar as they were coherent enough to talk about what they “really would have done”.

When I finally saw the magnitude of my own folly, everything fell into place at once. The dam against realization cracked; and the unspoken doubts that had been accumulating behind it crashed through all together. There wasn’t a prolonged period, or even a single moment that I remember, of wondering how I could have been so stupid. I already knew how.

And I also knew, all at once, in the same moment of realization, that to say, I almost destroyed the world!, would have been too prideful.

It would have been too confirming of ego, too confirming of my own importance in the scheme of things, at a time when—I understood in the same moment of realization—my ego ought to be taking a major punch to the stomach. I had been so much less than I needed to be; I had to take that punch in the stomach, not avert it.

And by the same token, I didn’t fall into the conjugate trap of saying: Oh, well, it’s not as if I had code and was about to run it; I didn’t really come close to destroying the world. For that, too, would have minimized the force of the punch. It wasn’t really loaded? I had proposed and intended to build the gun, and load the gun, and put the gun to my head and pull the trigger; and that was a bit too much self-destructiveness.

I didn’t make a grand emotional drama out of it. That would have wasted the force of the punch, averted it into mere tears.

I knew, in the same moment, what I had been carefully not-doing for the last six years. I hadn’t been updating.

And I knew I had to finally update. To actually change what I planned to do, to change what I was doing now, to do something different instead.

I knew I had to stop.

Halt, melt, and catch fire.

Say, “I’m not ready.” Say, “I don’t know how to do this yet.”

These are terribly difficult words to say, in the field of AGI. Both the lay audience and your fellow AGI researchers are interested in code, projects with programmers in play. Failing that, they may give you some credit for saying, “I’m ready to write code, just give me the funding.”

Say, “I’m not ready to write code,” and your status drops like a depleted uranium balloon.

What distinguishes you, then, from six billion other people who don’t know how to create Artificial General Intelligence? If you don’t have neat code (that does something other than be humanly intelligent, obviously; but at least it’s code), or at minimum your own startup that’s going to write code as soon as it gets funding—then who are you and what are you doing at our conference?

Maybe later I’ll post on where this attitude comes from—the excluded middle between “I know how to build AGI!” and “I’m working on narrow AI because I don’t know how to build AGI”, the nonexistence of a concept for “I am trying to get from an incomplete map of FAI to a complete map of FAI”.

But this attitude does exist, and so the loss of status associated with saying “I’m not ready to write code” is very great. (If the one doubts this, let them name any other who simultaneously says “I intend to build an Artificial General Intelligence”, “Right now I can’t build an AGI because I don’t know X”, and “I am currently trying to figure out X”.)

(And never mind AGIfolk who’ve already raised venture capital, promising returns in five years.)

So there’s a huge reluctance to say “Stop”. You can’t just say, “Oh, I’ll swap back to figure-out-X mode” because that mode doesn’t exist.

Was there more to that reluctance than just loss of status, in my case? Eliezer2001 might also have flinched away from slowing his perceived forward momentum into the Singularity, which was so right and so necessary...

But mostly, I think I flinched away from not being able to say, “I’m ready to start coding.” Not just for fear of others’ reactions, but because I’d been inculcated with the same attitude myself.

Above all, Eliezer2001 didn’t say “Stop”—even after noticing the problem of Friendly AI—because I did not realize, on a gut level, that Nature was allowed to kill me.

“Teenagers think they’re immortal”, the proverb goes. Obviously this isn’t true in the literal sense that if you ask them, “Are you indestructible?” they will reply “Yes, go ahead and try shooting me.” But perhaps wearing seat belts isn’t deeply emotionally compelling for them, because the thought of their own death isn’t quite real—they don’t really believe it’s allowed to happen. It can happen in principle but it can’t actually happen.

Personally, I always wore my seat belt. As an individual, I understood that I could die.

But, having been raised in technophilia to treasure that one most precious thing, far more important than my own life, I once thought that the Future was indestructible.

Even when I acknowledged that nanotech could wipe out humanity, I still believed the Singularity was invulnerable. That if humanity survived, the Singularity would happen, and it would be too smart to be corrupted or lost.

Even after that, when I acknowledged Friendly AI as a consideration, I didn’t emotionally believe in the possibility of failure, any more than that teenager who doesn’t wear their seat belt really believes that an automobile accident is really allowed to kill or cripple them.

It wasn’t until my insight into optimization let me look back and see Eliezer1997 in plain light, that I realized that Nature was allowed to kill me.

“The thought you cannot think controls you more than thoughts you speak aloud.” But we flinch away from only those fears that are real to us.

AGI researchers take very seriously the prospect of someone else solving the problem first. They can imagine seeing the headlines in the paper saying that their own work has been upstaged. They know that Nature is allowed to do that to them. The ones who have started companies know that they are allowed to run out of venture capital. That possibility is real to them, very real; it has a power of emotional compulsion over them.

I don’t think that “Oops” followed by the thud of six billion bodies falling, at their own hands, is real to them on quite the same level.

It is unsafe to say what other people are thinking. But it seems rather likely that when the one reacts to the prospect of Friendly AI by saying, “If you delay development to work on safety, other projects that don’t care at all about Friendly AI will beat you to the punch,” the prospect that they themselves might make a mistake, followed by six billion thuds, is not really real to them; but the possibility of others beating them to the punch is deeply scary.

I, too, used to say things like that, before I understood that Nature was allowed to kill me.

In that moment of realization, my childhood technophilia finally broke.

I finally understood that even if you diligently followed the rules of science and were a nice person, Nature could still kill you. I finally understood that even if you were the best project out of all available candidates, Nature could still kill you.

I understood that I was not being graded on a curve. My gaze shook free of rivals, and I saw the sheer blank wall.

I looked back and I saw the careful arguments I had constructed, for why the wisest choice was to continue forward at full speed, just as I had planned to do before. And I understood then that even if you constructed an argument showing that something was the best course of action, Nature was still allowed to say “So what?” and kill you.

I looked back and saw that I had claimed to take into account the risk of a fundamental mistake, that I had argued reasons to tolerate the risk of proceeding in the absence of full knowledge.

And I saw that the risk I wanted to tolerate would have killed me. And I saw that this possibility had never been really real to me. And I saw that even if you had wise and excellent arguments for taking a risk, the risk was still allowed to go ahead and kill you. Actually kill you.

For it is only the action that matters, and not the reasons for doing anything. If you build the gun and load the gun and put the gun to your head and pull the trigger, even with the cleverest of arguments for carrying out every step—then, bang.

I saw that only my own ignorance of the rules had enabled me to argue for going ahead without complete knowledge of the rules; for if you do not know the rules, you cannot model the penalty of ignorance.

I saw that others, still ignorant of the rules, were saying “I will go ahead and do X”; and that to the extent that X was a coherent proposal at all, I knew that would result in a bang; but they said, “I do not know it cannot work”. I would try to explain to them the smallness of the target in the search space, and they would say “How can you be so sure I won’t win the lottery?”, wielding their own ignorance as a bludgeon.

And so I realized that the only thing I could have done to save myself, in my previous state of ignorance, was to say: “I will not proceed until I know positively that the ground is safe.” And there are many clever arguments for why you should step on a piece of ground that you don’t know to contain a landmine; but they all sound much less clever, after you look to the place that you proposed and intended to step, and see the bang.

I understood that you could do everything that you were supposed to do, and Nature was still allowed to kill you. That was when my last trust broke. And that was when my training as a rationalist began.