Stanford Encyclopedia of Philosophy on AI ethics and superintelligence

Link post

The Stanford Encyclopedia of Philosophy—pretty much the standard reference for surveys of philosophical topics—has a brand-new (“First published Thu Apr 30, 2020”) article, “Ethics of Artificial Intelligence and Robotics”. Section 2.10 is called “Singularity”. I think it has a reasonably fair and competent summary of superintelligence discussion:

----

2.10 Singularity

2.10.1 Singularity and Superintelligence

In some quarters, the aim of current AI is thought to be an “artificial general intelligence” (AGI), contrasted to a technical or “narrow” AI. AGI is usually distinguished from traditional notions of AI as a general purpose system, and from Searle’s notion of “strong AI”:

computers given the right programs can be literally said to understand and have other cognitive states. (Searle 1980: 417)

The idea of singularity is that if the trajectory of artificial intelligence reaches up to systems that have a human level of intelligence, then these systems would themselves have the ability to develop AI systems that surpass the human level of intelligence, i.e., they are “superintelligent” (see below). Such superintelligent AI systems would quickly self-improve or develop even more intelligent systems. This sharp turn of events after reaching superintelligent AI is the “singularity” from which the development of AI is out of human control and hard to predict (Kurzweil 2005: 487).

The fear that “the robots we created will take over the world” had captured human imagination even before there were computers (e.g., Butler 1863) and is the central theme in Čapek’s famous play that introduced the word “robot” (Čapek 1920). This fear was first formulated as a possible trajectory of existing AI into an “intelligence explosion” by Irving Good:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion”, and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. (Good 1965: 33)

The optimistic argument from acceleration to singularity is spelled out by Kurzweil (1999, 2005, 2012), who essentially points out that computing power has been increasing exponentially, i.e., doubling ca. every 2 years since 1970 in accordance with “Moore’s Law” on the number of transistors, and will continue to do so for some time in the future. He predicted (Kurzweil 1999) that by 2010 supercomputers would reach human computation capacity, by 2030 “mind uploading” would be possible, and by 2045 the “singularity” would occur. Kurzweil talks about an increase in computing power that can be purchased at a given cost—but of course in recent years the funds available to AI companies have also increased enormously: Amodei and Hernandez (2018 [OIR]) thus estimate that in the years 2012–2018 the actual computing power available to train a particular AI system doubled every 3.4 months, resulting in a 300,000x increase—not the 7x increase that doubling every two years would have created.
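To see how stark that contrast is, here is a minimal sketch (mine, not part of the SEP article) comparing the growth implied by a two-year doubling period with the 3.4-month doubling reported by Amodei and Hernandez; the length of the 2012–2018 window, taken here as roughly 5.5 years, is an assumption for illustration.

```python
# Minimal sketch: total growth implied by two different doubling periods.
# The ~5.5-year window is an illustrative assumption; the exact figures
# quoted above (7x vs. 300,000x) depend on the precise dates used.

def growth_factor(years: float, doubling_period_years: float) -> float:
    """Total multiplication after `years` if the quantity doubles every
    `doubling_period_years` years."""
    return 2 ** (years / doubling_period_years)

window_years = 5.5                                  # assumed span of 2012-2018
moore = growth_factor(window_years, 2.0)            # doubling every 2 years
observed = growth_factor(window_years, 3.4 / 12)    # doubling every 3.4 months

print(f"Two-year doubling over {window_years} years:   ~{moore:.0f}x")
print(f"3.4-month doubling over {window_years} years: ~{observed:,.0f}x")
# Roughly 7x versus several hundred thousand x: about five orders of
# magnitude apart, which is the point of the comparison in the text.
```

Shifting the assumed window by a few months moves the second figure around considerably, but the gap of roughly five orders of magnitude between the two doubling regimes is robust.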

A common version of this argument (Chalmers 2010) talks about an increase in “intelligence” of the AI system (rather than raw computing power), but the crucial point of “singularity” remains the one where further development of AI is taken over by AI systems and accelerates beyond human level. Bostrom (2014) explains in some detail what would happen at that point and what the risks for humanity are. The discussion is summarised in Eden et al. (2012); Armstrong (2014); Shanahan (2015). There are possible paths to superintelligence other than computing power increase, e.g., the complete emulation of the human brain on a computer (Kurzweil 2012; Sandberg 2013), biological paths, or networks and organisations (Bostrom 2014: 22–51).

Despite obvious weaknesses in the identification of “intelligence” with processing power, Kurzweil seems right that humans tend to underestimate the power of exponential growth. Mini-test: If you walked in steps such that each step is double the previous, starting with a step of one metre, how far would you get with 30 steps? (answer: well past Earth’s only permanent natural satellite, the Moon.) Indeed, most progress in AI is readily attributable to the availability of processors that are faster by orders of magnitude, larger storage, and higher investment (Müller 2018). The actual acceleration and its speeds are discussed in Müller and Bostrom (2016) and Bostrom, Dafoe, and Flynn (forthcoming); Sandberg (2019) argues that progress will continue for some time.
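The mini-test can be checked with a short calculation (my illustration, not the article’s): the total distance is a geometric series, 1 + 2 + 4 + … + 2^29 metres = 2^30 − 1 metres, i.e. roughly 1.07 million kilometres, almost three times the average Earth–Moon distance of about 384,400 km.

```python
# Worked version of the doubling-steps mini-test (an illustration, not part
# of the SEP text). Step k is 2**k metres long, for k = 0 .. 29.

total_metres = sum(2 ** k for k in range(30))   # geometric series: 2**30 - 1
total_km = total_metres / 1000

EARTH_MOON_KM = 384_400                         # average Earth-Moon distance

print(f"Total distance after 30 doubling steps: {total_km:,.0f} km")
print(f"That is about {total_km / EARTH_MOON_KM:.1f} lunar distances.")
# ~1,073,742 km, i.e. roughly 2.8 times the Earth-Moon distance:
# well past the Moon, as the answer above says.
```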

The participants in this debate are united by being technophiles in the sense that they expect technology to develop rapidly and bring broadly welcome changes—but beyond that, they divide into those who focus on benefits (e.g., Kurzweil) and those who focus on risks (e.g., Bostrom). Both camps sympathise with “transhuman” views of survival for humankind in a different physical form, e.g., uploaded on a computer (Moravec 1990, 1998; Bostrom 2003a, 2003c). They also consider the prospects of “human enhancement” in various respects, including intelligence—often called “IA” (intelligence augmentation). It may be that future AI will be used for human enhancement, or will contribute further to the dissolution of the neatly defined human single person. Robin Hanson provides detailed speculation about what will happen economically in case human “brain emulation” enables truly intelligent robots or “ems” (Hanson 2016).

The argument from superintelligence to risk requires the assumption that superintelligence does not imply benevolence—contrary to Kantian traditions in ethics that have argued higher levels of rationality or intelligence would go along with a better understanding of what is moral and better ability to act morally (Gewirth 1978; Chalmers 2010: 36f). Arguments for risk from superintelligence say that rationality and morality are entirely independent dimensions—this is sometimes explicitly argued for as an “orthogonality thesis” (Bostrom 2012; Armstrong 2013; Bostrom 2014: 105–109).

Criticism of the singularity narrative has been raised from various angles. Kurzweil and Bostrom seem to assume that intelligence is a one-dimensional property and that the set of intelligent agents is totally ordered in the mathematical sense—but neither discusses intelligence at any length in their books. Generally, it is fair to say that despite some efforts, the assumptions made in the powerful narrative of superintelligence and singularity have not been investigated in detail. One question is whether such a singularity will ever occur—it may be conceptually impossible, practically impossible, or may just not happen because of contingent events, including people actively preventing it. Philosophically, the interesting question is whether singularity is just a “myth” (Floridi 2016; Ganascia 2017), and not on the trajectory of actual AI research. This is something that practitioners often assume (e.g., Brooks 2017 [OIR]). They may do so because they fear the public relations backlash, because they overestimate the practical problems, or because they have good reasons to think that superintelligence is an unlikely outcome of current AI research (Müller forthcoming-a). This discussion raises the question whether the concern about “singularity” is just a narrative about fictional AI based on human fears. But even if one does find the negative reasons compelling and the singularity not likely to occur, there is still a significant possibility that one may turn out to be wrong. Philosophy is not on the “secure path of a science” (Kant 1791: B15), and maybe AI and robotics aren’t either (Müller 2020). So, it appears that discussing the very high-impact risk of singularity has justification even if one thinks the probability of such singularity ever occurring is very low.

2.10.2 Existential Risk from Superintelligence

Thinking about superintelligence in the long term raises the question whether superintelligence may lead to the extinction of the human species, which is called an “existential risk” (or XRisk): The superintelligent systems may well have preferences that conflict with the existence of humans on Earth, and may thus decide to end that existence—and given their superior intelligence, they will have the power to do so (or they may happen to end it because they do not really care).

Thinking in the long term is the crucial feature of this literature. Whether the singularity (or another catastrophic event) occurs in 30 or 300 or 3000 years does not really matter (Baum et al. 2019). Perhaps there is even an astronomical pattern such that an intelligent species is bound to discover AI at some point, and thus bring about its own demise. Such a “great filter” would contribute to the explanation of the “Fermi paradox” of why there is no sign of life in the known universe despite the high probability of it emerging. It would be bad news if we found out that the “great filter” is ahead of us, rather than an obstacle that Earth has already passed. These issues are sometimes taken more narrowly to be about human extinction (Bostrom 2013), or more broadly as concerning any large risk for the species (Rees 2018)—of which AI is only one (Häggström 2016; Ord 2020). Bostrom also uses the category of “global catastrophic risk” for risks that are sufficiently high up the two dimensions of “scope” and “severity” (Bostrom and Ćirković 2011; Bostrom 2013).

These discussions of risk are usually not connected to the general problem of ethics under risk (e.g., Hansson 2013, 2018). The long-term view has its own methodological challenges but has produced a wide discussion: Tegmark (2017) focuses on AI and human life “3.0” after singularity, while Russell, Dewey, and Tegmark (2015) and Bostrom, Dafoe, and Flynn (forthcoming) survey longer-term policy issues in ethical AI. Several collections of papers have investigated the risks of artificial general intelligence (AGI) and the factors that might make this development more or less risk-laden (Müller 2016b; Callaghan et al. 2017; Yampolskiy 2018), including the development of non-agent AI (Drexler 2019).

2.10.3 Controlling Superintelligence?

In a narrow sense, the “control problem” is how we humans can remain in control of an AI system once it is superintelligent (Bostrom 2014: 127ff). In a wider sense, it is the problem of how we can make sure an AI system will turn out to be positive according to human perception (Russell 2019); this is sometimes called “value alignment”. How easy or hard it is to control a superintelligence depends significantly on the speed of “take-off” to a superintelligent system. This has led to particular attention to systems with self-improvement, such as AlphaZero (Silver et al. 2018).

One aspect of this problem is that we might decide a certain feature is desirable, but then find out that it has unforeseen consequences that are so negative that we would not desire that feature after all. This is the ancient problem of King Midas, who wished that all he touched would turn into gold. This problem has been discussed with the help of various examples, such as the “paperclip maximiser” (Bostrom 2003b), or the program to optimise chess performance (Omohundro 2014).

Discussions about superintelligence include speculation about omniscient beings, the radical changes on a “latter day”, and the promise of immortality through transcendence of our current bodily form—so sometimes they have clear religious undertones (Capurro 1993; Geraci 2008, 2010; O’Connell 2017: 160ff). These issues also pose a well-known problem of epistemology: Can we know the ways of the omniscient (Danaher 2015)? The usual opponents have already shown up: A characteristic response of an atheist is

People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world. (Domingos 2015)

The new nihilists explain that a “techno-hypnosis” through information technologies has now become our main method of distraction from the loss of meaning (Gertz 2018). Both opponents would thus say we need an ethics for the “small” problems that occur with actual AI and robotics (sections 2.1 through 2.9 above), and that there is less need for the “big ethics” of existential risk from AI (section 2.10).