AI Risk & Opportunity: A Timeline of Early Ideas and Arguments

Part of the se­ries AI Risk and Op­por­tu­nity: A Strate­gic Anal­y­sis.

(You can leave anony­mous feed­back on posts in this se­ries here. I alone will read the com­ments, and may use them to im­prove past and forth­com­ing posts in this se­ries.)

Build­ing on the pre­vi­ous post on AI risk his­tory, this post pro­vides an in­com­plete timeline (up to 1993) of sig­nifi­cant novel ideas and ar­gu­ments re­lated to AI as a po­ten­tial catas­trophic risk. I do not in­clude ideas and ar­gu­ments con­cern­ing only, for ex­am­ple, the pos­si­bil­ity of AI (Tur­ing 1950) or at­tempts to pre­dict its ar­rival (Bostrom 1998).

As is usu­ally the case, we find that when we look closely at a cluster of ideas, it turns out these ideas did not ap­pear all at once in the minds of a Few Great Men. In­stead, they grew and mu­tated and gave birth to new ideas grad­u­ally as they passed from mind to mind over the course of many decades.

1863: Machine intelligence as an existential risk to humanity; relinquishment of machine technology recommended. Samuel Butler, in "Darwin among the Machines," worries that as we build increasingly sophisticated and autonomous machines, they will achieve greater capability than humans and replace humans as the dominant agents on the planet:

...we are our­selves cre­at­ing our own suc­ces­sors; we are daily adding to the beauty and del­i­cacy of their phys­i­cal or­gani­sa­tion; we are daily giv­ing them greater power and sup­ply­ing by all sorts of in­ge­nious con­trivances that self-reg­u­lat­ing, self-act­ing power which will be to them what in­tel­lect has been to the hu­man race. In the course of ages we shall find our­selves the in­fe­rior race… the time will come when the ma­chines will hold the real supremacy over the world and its in­hab­itants...

Our opinion is that war to the death should be in­stantly pro­claimed against them. Every ma­chine of ev­ery sort should be de­stroyed by the well-wisher of his species. Let there be no ex­cep­tions made, no quar­ter shown...

(See also But­ler 1872; Camp­bell 1932.)

1921: Robots as an ex­is­ten­tial risk. The Czech play R.U.R. by Karel Capek tells the story of robots which grow in power and in­tel­li­gence and de­stroy the en­tire hu­man race (ex­cept for a sin­gle sur­vivor).

1947: Fragility & complexity of human values (in the context of machine goal systems); perverse instantiation. Jack Williamson’s novelette With Folded Hands (1947) tells the story of a race of machines that follow the Prime Directive: “to serve and obey and guard men from harm.” To obey this rule, the machines interfere with every aspect of human life, and humans who resist are lobotomized. Due to the fragility and complexity of human values (Yudkowsky 2008; Muehlhauser and Helm 2012), the machines’ rules of behavior have unintended consequences, manifesting a “perverse instantiation” in the language of Bostrom (forthcoming).

(Also see Asi­mov 1950, 1957, 1983; Versenyi 1974; Min­sky 1984; Yud­kowsky 2001, 2011.)

1948-1949: Pre­cur­sor idea to in­tel­li­gence ex­plo­sion. Von Neu­mann (1948) wrote:

...“com­pli­ca­tion” on its lower lev­els is prob­a­bly de­gen­er­a­tive, that is, that ev­ery au­toma­ton that can pro­duce other au­tomata will only be able to pro­duce less com­pli­cated ones. There is, how­ever, a cer­tain min­i­mum level where this de­gen­er­a­tive char­ac­ter­is­tic ceases to be uni­ver­sal. At this point au­tomata which can re­pro­duce them­selves, or even con­struct higher en­tities, be­come pos­si­ble.

Von Neumann (1949) came very close to articulating the idea of intelligence explosion:

There is thus this com­pletely de­ci­sive prop­erty of com­plex­ity, that there ex­ists a crit­i­cal size be­low which the pro­cess of syn­the­sis is de­gen­er­a­tive, but above which the phe­nomenon of syn­the­sis, if prop­erly ar­ranged, can be­come ex­plo­sive, in other words, where syn­the­ses of au­tomata can pro­ceed in such a man­ner that each au­toma­ton will pro­duce other au­tomata which are more com­plex and of higher po­ten­tial­ities than it­self.

1951: Potentially rapid transition from machine intelligence to machine takeover. Turing (1951) described ways that intelligent computers might learn and improve their capabilities, concluding that:

...it seems probable that once the machine thinking method has started, it would not take long to outstrip our feeble powers… At some stage therefore we should have to expect the machines to take control...

1959: In­tel­li­gence ex­plo­sion; the need for hu­man-friendly goals for ma­chine su­per­in­tel­li­gence. Good (1959) de­scribes what he later (1965) called an “in­tel­li­gence ex­plo­sion,” a par­tic­u­lar mechanism for rapid tran­si­tion from ar­tifi­cial gen­eral in­tel­li­gence to dan­ger­ous ma­chine takeover:

Once a ma­chine is de­signed that is good enough… it can be put to work de­sign­ing an even bet­ter ma­chine. At this point an “ex­plo­sion” will clearly oc­cur; all the prob­lems of sci­ence and tech­nol­ogy will be handed over to ma­chines and it will no longer be nec­es­sary for peo­ple to work. Whether this will lead to a Utopia or to the ex­ter­mi­na­tion of the hu­man race will de­pend on how the prob­lem is han­dled by the ma­chines. The im­por­tant thing will be to give them the aim of serv­ing hu­man be­ings.

(Also see Good 1962, 1965, 1970; Vinge 1992, 1993; Yud­kowsky 2008.)

1966: A mil­i­tary arms race for ma­chine su­per­in­tel­li­gence could ac­cel­er­ate ma­chine takeover; con­ver­gence to­ward a sin­gle­ton is likely. Den­nis Feltham Jones’ 1966 novel Colos­sus de­picted what may be a par­tic­u­larly likely sce­nario: two world su­per­pow­ers (the USA and USSR) are in an arms race to de­velop su­per­in­tel­li­gent com­put­ers, one of which self-im­proves enough to take con­trol of the planet.

In the same year, Cade (1966) made a similar argument:

poli­ti­cal lead­ers on Earth will slowly come to re­al­ize… that in­tel­li­gent ma­chines hav­ing su­per­hu­man think­ing abil­ity can be built. The con­struc­tion of such ma­chines, even tak­ing into ac­count all the lat­est de­vel­op­ments in com­puter tech­nol­ogy, would call for a ma­jor na­tional effort. It is only to be ex­pected that any na­tion which did put forth the fi­nan­cial and phys­i­cal effort needed to build and pro­gramme such a ma­chine, would also at­tempt to uti­lize it to its max­i­mum ca­pac­ity, which im­plies that it would be used to make ma­jor de­ci­sions of na­tional policy. Here is where the awful dilemma arises. Any re­stric­tion to the range of data sup­plied to the ma­chine would limit its abil­ity to make effec­tive poli­ti­cal and eco­nomic de­ci­sions, yet if no such re­stric­tions are placed upon the ma­chine’s com­mand of in­for­ma­tion, then the en­tire con­trol of the na­tion would vir­tu­ally be sur­ren­dered to the judg­ment of the robot.

On the other hand, any ma­jor na­tion which was led by a su­pe­rior, un­emo­tional in­tel­li­gence of any kind, would quickly rise to a po­si­tion of world dom­i­na­tion. This by it­self is suffi­cient to guaran­tee that, sooner or later, the effort to build such an in­tel­li­gence will be made — if not in the Western world, then el­se­where, where peo­ple are more ac­cus­tomed to iron dic­ta­tor­ships.

...It seems that, in the forsee­able fu­ture, the ma­jor na­tions of the world will have to face the al­ter­na­tive of sur­ren­der­ing na­tional con­trol to me­chan­i­cal ministers, or be­ing dom­i­nated by other na­tions which have already done this. Such a pro­cess will even­tu­ally lead to the dom­i­na­tion of the whole Earth by a dic­ta­tor­ship of an un­par­alleled type — a sin­gle supreme cen­tral au­thor­ity.

(This last para­graph also ar­gues for con­ver­gence to­ward what Bostrom later called a “sin­gle­ton.”)

(Also see Elli­son 1967.)

1970: Pro­posal for an as­so­ci­a­tion that an­a­lyzes the im­pli­ca­tions of ma­chine su­per­in­tel­li­gence; naive con­trol solu­tions like “switch off the power” may not work be­cause the su­per­in­tel­li­gence will out­smart us, thus we must fo­cus on its mo­ti­va­tions; pos­si­bil­ity of “pointless” op­ti­miza­tion by ma­chine su­per­in­tel­li­gence. Good (1970) ar­gues:

Even if the chance that the ul­train­tel­li­gent ma­chine will be available [soon] is small, the reper­cus­sions would be so enor­mous, good or bad, that it is not too early to en­ter­tain the pos­si­bil­ity. In any case by 1980 I hope that the im­pli­ca­tions and the safe­guards will have been thor­oughly dis­cussed, and this is my main rea­son for airing the mat­ter: an as­so­ci­a­tion for con­sid­er­ing it should be started.

(Also see Bostrom 1997.)

On the idea that naive con­trol solu­tions like “switch off the power” may not work be­cause the su­per­in­tel­li­gence will find a way to out­smart us, and thus we must fo­cus our efforts on the su­per­in­tel­li­gence’s mo­ti­va­tions, Good writes:

Some people have suggested that in order to prevent the [ultraintelligent machine] from taking over we should be ready to switch off its power supply. But it is not as simple as that because the machine could recommend the appointment of its own operators, it could recommend that they be paid well and it could select older men who would not be worried about losing their jobs. Then it could replace its operators by robots in order to make sure that it is not switched off. Next it could have the neo-Luddites ridiculed by calling them Ludditeniks, and if necessary it would later have them imprisoned or executed. This shows how careful we must be to keep our eye on the “motivation” of the machines, if possible, just as we should with politicians.

(Also see Yud­kowsky 2008.)

Good also out­lines one pos­si­bil­ity for “pointless” goal-op­ti­miza­tion by ma­chine su­per­in­tel­li­gence:

If the ma­chines took over and men be­came re­dun­dant and ul­ti­mately ex­tinct, the so­ciety of ma­chines would con­tinue in a com­plex and in­ter­est­ing man­ner, but it would all ap­par­ently be pointless be­cause there would be no one there to be in­ter­ested. If ma­chines can­not be con­scious there would be only a zom­bie world. This would per­haps not be as bad as in many hu­man so­cieties where most peo­ple have lived in mis­ery and degra­da­tion while a few have lived in pomp and lux­ury. It seems to me that the util­ity of such so­cieties has been nega­tive (while in the con­di­tion de­scribed) whereas the util­ity of a zom­bie so­ciety would be zero and hence prefer­able.

(Also see Bostrom 2004; Yud­kowsky 2008.)

1974: We can’t much pre­dict what will hap­pen af­ter the cre­ation of ma­chine su­per­in­tel­li­gence. Julius Lukasiewicz (1974) writes:

The survival of man may depend on the early construction of an ultraintelligent machine – or the ultraintelligent machine may take over and render the human race redundant or develop another form of life. The prospect that a merely intelligent man could ever attempt to predict the impact of an ultraintelligent device is of course unlikely but the temptation to speculate seems irresistible.

(Also see Vinge 1993.)

1977: Self-im­prov­ing AI could stealthily take over the in­ter­net; con­ver­gent in­stru­men­tal goals in AI; the treach­er­ous turn. Though the con­cept of a self-prop­a­gat­ing com­puter worm was in­tro­duced by John Brun­ner’s The Shock­wave Rider (1975), Thomas J. Ryan’s novel The Ado­les­cence of P-1 (1977) tells the story of an in­tel­li­gent worm that at first is merely able to learn to hack novel com­puter sys­tems and use them to prop­a­gate it­self, but later (1) has novel in­sights on how to im­prove its own in­tel­li­gence, (2) de­vel­ops con­ver­gent in­stru­men­tal sub­goals (see Bostrom 2012) for self-preser­va­tion and re­source ac­qui­si­tion, and (3) learns the abil­ity to fake its own death so that it can grow its pow­ers in se­cret and later en­gage in a “treach­er­ous turn” (see Bostrom forth­com­ing) against hu­mans.

1982: To design ethical machine superintelligence, we may need to design superintelligence first and then ask it to solve philosophical problems (including ethics). Good (1982) writes:

Un­for­tu­nately, af­ter 2500 years, the philo­soph­i­cal prob­lems are nowhere near solu­tion. Do we need to solve these philo­soph­i­cal prob­lems be­fore we can de­sign an ad­e­quate eth­i­cal ma­chine, or is there an­other ap­proach? One ap­proach that can­not be ruled out is first to pro­duce an ul­tra-in­tel­li­gent ma­chine and then ask it to solve philo­soph­i­cal prob­lems.

1988: Even though AI poses an existential threat, we may need to rush toward it so we can use it to mitigate other existential threats. Moravec (1988, pp. 100-101) writes:

...intelligent machines… threaten our existence… Machines merely as clever as human beings will have enormous advantages in competitive situations… So why rush headlong into an era of intelligent machines? The answer, I believe, is that we have very little choice, if our culture is to remain viable… The universe is one random event after another. Sooner or later an unstoppable virus deadly to humans will evolve, or a major asteroid will collide with the earth, or the sun will expand, or we will be invaded from the stars, or a black hole will swallow the galaxy. The bigger, more diverse, and competent a culture is, the better it can detect and deal with external dangers. The larger events happen less frequently. By growing rapidly enough, a culture has a finite chance of surviving forever.

1993: Phys­i­cal con­fine­ment is un­likely to con­strain su­per­in­tel­li­gences, for su­per­in­tel­li­gences will out­smart us. Vinge (1993) writes:

I ar­gue that con­fine­ment [of su­per­in­tel­li­gent ma­chines] is in­trin­si­cally im­prac­ti­cal. For the case of phys­i­cal con­fine­ment: Imag­ine your­self con­fined to your house with only limited data ac­cess to the out­side, to your mas­ters. If those mas­ters thought at a rate — say — one mil­lion times slower than you, there is lit­tle doubt that over a pe­riod of years (your time) you could come up with “helpful ad­vice” that would in­ci­den­tally set you free...

After 1993. The ex­tropi­ans mailing list was launched in 1991, and was home to hun­dreds of dis­cus­sions in which many im­por­tant new ideas were pro­posed — ideas later de­vel­oped in the pub­lic writ­ings of Bostrom, Yud­kowsky, Go­ertzel, and oth­ers. Un­for­tu­nately, the dis­cus­sions from be­fore 1998 were pri­vate, by agree­ment among sub­scribers. The early years of the archive can­not be made pub­lic with­out get­ting per­mis­sion from ev­ery­one in­volved — a nearly im­pos­si­ble task. I have, how­ever, col­lected all posts I could find from 1998 on­ward and up­loaded them here (link fixed 04-03-2012).

I will end this post here. Per­haps in a fu­ture post I will ex­tend the timeline past 1993, when in­ter­est in the sub­ject be­came greater and thus the num­ber of new ideas gen­er­ated per decade rapidly in­creased.