Superintelligence Reading Group 2: Forecasting AI

This is part of a weekly reading group on Nick Bostrom’s book, Superintelligence. For more information about the group, and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI’s reading guide.

Wel­come. This week we dis­cuss the sec­ond sec­tion in the read­ing guide, Fore­cast­ing AI. This is about pre­dic­tions of AI, and what we should make of them.

This post summarizes the section and offers a few relevant notes and ideas for further investigation. My own thoughts and questions for discussion are in the comments.

There is no need to pro­ceed in or­der through this post. Feel free to jump straight to the dis­cus­sion. Where ap­pli­ca­ble, page num­bers in­di­cate the rough part of the chap­ter that is most re­lated (not nec­es­sar­ily that the chap­ter is be­ing cited for the spe­cific claim).

Read­ing: Opinions about the fu­ture of ma­chine in­tel­li­gence, from Chap­ter 1 (p18-21) and Muehlhauser, When Will AI be Created?


Opinions about the fu­ture of ma­chine in­tel­li­gence, from Chap­ter 1 (p18-21)

  1. AI re­searchers hold a va­ri­ety of views on when hu­man-level AI will ar­rive, and what it will be like.

  2. A re­cent set of sur­veys of AI re­searchers pro­duced the fol­low­ing me­dian dates:

    • for hu­man-level AI with 10% prob­a­bil­ity: 2022

    • for hu­man-level AI with 50% prob­a­bil­ity: 2040

    • for hu­man-level AI with 90% prob­a­bil­ity: 2075

  3. Surveyed AI researchers in aggregate gave 10% probability to ‘superintelligence’ within two years of human-level AI, and 75% to ‘superintelligence’ within 30 years.

  4. When asked about the long-term impacts of human-level AI, surveyed AI researchers gave the responses in the figure below (these are ‘renormalized median’ responses, ‘TOP 100’ is one of the surveyed groups, ‘Combined’ is all of them).

  5. There are var­i­ous rea­sons to ex­pect such opinion polls and pub­lic state­ments to be fairly in­ac­cu­rate.

  6. Nonethe­less, such opinions sug­gest that the prospect of hu­man-level AI is wor­thy of at­ten­tion.

When Will AI Be Created?

  1. Predicting when human-level AI will arrive is hard.

  2. The es­ti­mates of in­formed peo­ple can vary be­tween a small num­ber of decades and a thou­sand years.

  3. Differ­ent time scales have differ­ent policy im­pli­ca­tions.

  4. Sev­eral sur­veys of AI ex­perts ex­ist, but Muehlhauser sus­pects sam­pling bias (e.g. op­ti­mistic views be­ing sam­pled more of­ten) makes such sur­veys of lit­tle use.

  5. Pre­dict­ing hu­man-level AI de­vel­op­ment is the kind of task that ex­perts are char­ac­ter­is­ti­cally bad at, ac­cord­ing to ex­ten­sive re­search on what makes peo­ple bet­ter at pre­dict­ing things.

  6. Peo­ple try to pre­dict hu­man-level AI by ex­trap­o­lat­ing hard­ware trends. This prob­a­bly won’t work, as AI re­quires soft­ware as well as hard­ware, and soft­ware ap­pears to be a sub­stan­tial bot­tle­neck.

  7. We might try to extrapolate software progress, but software often progresses less smoothly, and good metrics for it are hard to design.

  8. A num­ber of plau­si­ble events might sub­stan­tially ac­cel­er­ate or slow progress to­ward hu­man-level AI, such as an end to Moore’s Law, de­ple­tion of low-hang­ing fruit, so­cietal col­lapse, or a change in in­cen­tives for de­vel­op­ment.

  9. The ap­pro­pri­ate re­sponse to this situ­a­tion is un­cer­tainty: you should nei­ther be con­fi­dent that hu­man-level AI will take less than 30 years, nor that it will take more than a hun­dred years.

  10. We can still hope to do bet­ter: there are known ways to im­prove pre­dic­tive ac­cu­racy, such as mak­ing quan­ti­ta­tive pre­dic­tions, look­ing for con­crete ‘sign­posts’, look­ing at ag­gre­gated pre­dic­tions, and de­com­pos­ing com­plex phe­nom­ena into sim­pler ones.

Notes

  1. More (similar) surveys on when human-level AI will be developed
    Bostrom discusses some recent polls in detail, and mentions that others are fairly consistent. Below are the surveys I could find. Several of them give dates when median respondents believe there is a 10%, 50% or 90% chance of AI, which I have recorded as ‘10% year’ etc. If their findings were in another form, those are in the last column. Note that some of these surveys are fairly informal, and many participants are not AI experts, especially, I’d guess, in the Bainbridge, AI@50 and Klein ones. ‘Kruel’ is the set of interviews from which Nils Nilsson is quoted on p19. The interviews cover a wider range of topics, and are indexed here.

    Survey                              10% year  50% year  90% year  Other predictions
    Michie 1972 (paper download)        -         -         -         Fairly even spread between 20, 50 and >50 years
    Bainbridge 2005                     -         -         -         Median prediction 2085
    AI@50 poll                          -         -         -         82% predict more than 50 years (>2056) or never
    Baum et al                          2020      2040      2075      -
    Klein 2011                          -         -         -         median 2030-2050
    FHI 2011                            2028      2050      2150      -
    Kruel 2011- (interviews, summary)   2025      2035      2070      -
    FHI: AGI 2014                       2022      2040      2065      -
    FHI: TOP100 2014                    2022      2040      2075      -
    FHI: EETN 2014                      2020      2050      2093      -
    FHI: PT-AI 2014                     2023      2048      2080      -
    Hanson ongoing                      -         -         -         Most say have come 10% or less of the way to human level
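If you want a quick feel for how consistent the surveys in the table above are, one rough option is to summarize each column across the surveys that report it. The sketch below does that with the numbers copied from the table. It is only an unweighted summary of my own devising: it treats every survey equally regardless of size or quality, and the function name `column_summary` is just for illustration.

```python
from statistics import median

# (10% year, 50% year, 90% year) copied from the table above.
# Surveys that only reported "other predictions" are left out.
surveys = {
    "Baum et al": (2020, 2040, 2075),
    "FHI 2011": (2028, 2050, 2150),
    "Kruel 2011-": (2025, 2035, 2070),
    "FHI: AGI 2014": (2022, 2040, 2065),
    "FHI: TOP100 2014": (2022, 2040, 2075),
    "FHI: EETN 2014": (2020, 2050, 2093),
    "FHI: PT-AI 2014": (2023, 2048, 2080),
}


def column_summary(index):
    """Return (earliest, median, latest) year for one column of the table."""
    years = [row[index] for row in surveys.values()]
    return min(years), median(years), max(years)


for label, index in (("10%", 0), ("50%", 1), ("90%", 2)):
    earliest, middle, latest = column_summary(index)
    print(f"{label} year: range {earliest}-{latest}, median across surveys {middle}")
```

The spread is narrow for the 10% and 50% years and much wider for the 90% year, which is driven mostly by the FHI 2011 figure.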

  2. Pre­dic­tions in pub­lic state­ments
    Polls are one source of predictions on AI. Another source is public statements, that is, things people choose to say publicly. MIRI arranged for the collection of these public statements, which you can now download and play with (the original and info about it, my edited version and explanation for changes). The figure below shows the cumulative fraction of public statements claiming that human-level AI will be more likely than not by a particular year, or at least claiming something that can be broadly interpreted as that. It only includes recorded statements made since 2000. There are various warnings and details in interpreting this, but I don’t think they make a big difference, so they are probably not worth considering unless you are especially interested. Note that the authors of these statements are mostly AI researchers (including disproportionately many working on human-level AI), along with a few futurists and a few other people.

    (LH axis = frac­tion of peo­ple pre­dict­ing hu­man-level AI by that date)

    Cu­mu­la­tive dis­tri­bu­tion of pre­dicted date of AI

    As you can see, the me­dian date (when the graph hits the 0.5 mark) for hu­man-level AI here is much like that in the sur­vey data: 2040 or so.

    I would gen­er­ally ex­pect pre­dic­tions in pub­lic state­ments to be rel­a­tively early, be­cause peo­ple just don’t tend to bother writ­ing books about how ex­cit­ing things are not go­ing to hap­pen for a while, un­less their pre­dic­tion is fas­ci­nat­ingly late. I checked this more thor­oughly, by com­par­ing the out­comes of sur­veys to the state­ments made by peo­ple in similar groups to those sur­veyed (e.g. if the sur­vey was of AI re­searchers, I looked at state­ments made by AI re­searchers). In my (very cur­sory) as­sess­ment (de­tailed at the end of this page) there is a bit of a differ­ence: pre­dic­tions from sur­veys are 0-23 years later than those from pub­lic state­ments.
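For anyone who wants to reproduce this kind of graph, here is a minimal sketch of how a cumulative fraction and the median date can be read off a list of predicted years. The list `example` is invented purely for illustration and is NOT the MIRI dataset; the function names are mine too.

```python
def cumulative_fraction(predicted_years, year):
    """Fraction of statements predicting human-level AI by `year`."""
    return sum(1 for y in predicted_years if y <= year) / len(predicted_years)


def median_year(predicted_years):
    """Earliest predicted year at which the cumulative fraction reaches 0.5."""
    for year in sorted(predicted_years):
        if cumulative_fraction(predicted_years, year) >= 0.5:
            return year


# Hypothetical predictions, made up for this example only.
example = [2025, 2030, 2030, 2040, 2045, 2060, 2100]

for year in sorted(set(example)):
    print(year, round(cumulative_fraction(example, year), 2))
print("median date:", median_year(example))
```

The same two functions can be applied separately to survey answers and to public statements from a similar group, which is roughly the comparison described above.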

  3. What kinds of things are peo­ple good at pre­dict­ing?
    Arm­strong and So­tala (p11) sum­ma­rize a few re­search efforts in re­cent decades as fol­lows.

    Note that the prob­lem of pre­dict­ing AI mostly falls on the right. Un­for­tu­nately this doesn’t tell us any­thing about how much harder AI timelines are to pre­dict than other things, or the ab­solute level of pre­dic­tive ac­cu­racy as­so­ci­ated with any com­bi­na­tion of fea­tures. How­ever if you have a rough idea of how well hu­mans pre­dict things, you might cor­rect it down­ward when pre­dict­ing how well hu­mans pre­dict fu­ture AI de­vel­op­ment and its so­cial con­se­quences.

  4. Bi­as­es
    As well as just being generally inaccurate, predictions of AI are often suspected to be subject to a number of biases. Bostrom claimed earlier that ‘twenty years is the sweet spot for prognosticators of radical change’ (p4). A related concern is that people always predict revolutionary changes just within their lifetimes (the so-called Maes-Garreau law). Worse problems come from selection effects: the people making all of these predictions are selected for thinking AI is the best thing to spend their lives on, so might be especially optimistic. Further, more exciting claims of impending robot revolution might be published and remembered more often. More bias might come from wishful thinking: having spent a lot of their lives on it, researchers might hope especially hard for it to go well. On the other hand, as Nils Nilsson points out, AI researchers are wary of past predictions and so try hard to retain respectability, for instance by focussing on ‘weak AI’. This could systematically push their predictions later.

    We have some ev­i­dence about these bi­ases. Arm­strong and So­tala (us­ing the MIRI dataset) find peo­ple are es­pe­cially will­ing to pre­dict AI around 20 years in the fu­ture, but couldn’t find ev­i­dence of the Maes-Gar­reau law. Another way of look­ing for the Maes-Gar­reau law is via cor­re­la­tion be­tween age and pre­dicted time to AI, which is weak (-.017) in the ed­ited MIRI dataset. A gen­eral ten­dency to make pre­dic­tions based on in­cen­tives rather than available in­for­ma­tion is weakly sup­ported by pre­dic­tions not chang­ing much over time, which is pretty much what we see in the MIRI dataset. In the figure be­low, ‘early’ pre­dic­tions are made be­fore 2000, and ‘late’ ones since then.

    Cu­mu­la­tive dis­tri­bu­tion of pre­dicted Years to AI, in early and late pre­dic­tions.

    We can learn something about selection effects, where AI researchers are selected for being especially optimistic about AI, by comparing groups who might be more or less selected in this way. For instance, we can compare most AI researchers, who tend to work on narrow intelligent capabilities, with researchers of ‘artificial general intelligence’ (AGI), who specifically focus on creating human-level agents. The figure below shows this comparison with the edited MIRI dataset, using a rough assessment of who works on AGI vs. other AI and only predictions made from 2000 onward (‘late’). Interestingly, the AGI predictions indeed look like the most optimistic half of the AI predictions.

    Cumulative distribution of predicted date of AI, for AGI and other AI researchers

    We can also com­pare other groups in the dataset - ‘fu­tur­ists’ and other peo­ple (ac­cord­ing to our own heuris­tic as­sess­ment). While the pic­ture is in­ter­est­ing, note that both of these groups were very small (as you can see by the large jumps in the graph).

    Cumulative distribution of predicted date of AI, for various groups

    Remember that these differences may not be due to bias, but rather to better understanding. It could well be that AGI research is very promising, and the closer you are to it, the more you realize that. Nonetheless, we can say some things from this data. The total selection bias toward optimism in communities selected for optimism is probably not more than the differences we see here (a few decades in the median), but could plausibly be that large.

    Th­ese have been some rough calcu­la­tions to get an idea of the ex­tent of a few hy­poth­e­sized bi­ases. I don’t think they are very ac­cu­rate, but I want to point out that you can ac­tu­ally gather em­piri­cal data on these things, and claim that given the cur­rent level of re­search on these ques­tions, you can learn in­ter­est­ing things fairly cheaply, with­out do­ing very elab­o­rate or rigor­ous in­ves­ti­ga­tions.
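As an example of how cheap these checks can be, the age correlation mentioned above is just a Pearson correlation between each predictor’s age and their predicted number of years until AI. If the Maes-Garreau law held strongly, older predictors would give shorter times and the correlation would be clearly negative. The sketch below uses invented numbers, NOT the MIRI dataset, and the function is a plain textbook implementation rather than whatever tool was used for the figure quoted above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data, chosen to show no linear relationship.
ages = [30, 40, 50, 60, 70]
years_to_ai = [25, 20, 30, 20, 25]

print(round(pearson(ages, years_to_ai), 3))
```

A value near zero, like the one reported for the edited MIRI dataset, means age tells you almost nothing about how far away someone predicts AI to be.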

  5. What defi­ni­tion of ‘su­per­in­tel­li­gence’ do AI ex­perts ex­pect within two years of hu­man-level AI with prob­a­bil­ity 10% and within thirty years with prob­a­bil­ity 75%?
    “As­sume for the pur­pose of this ques­tion that such HLMI will at some point ex­ist. How likely do you then think it is that within (2 years /​ 30 years) there­after there will be ma­chine in­tel­li­gence that greatly sur­passes the perfor­mance of ev­ery hu­man in most pro­fes­sions?” See the pa­per for other de­tails about Bostrom and Müller’s sur­veys (the ones in the book).

In-depth investigations

If you are par­tic­u­larly in­ter­ested in these top­ics, and want to do fur­ther re­search, these are a few plau­si­ble di­rec­tions, some taken from Luke Muehlhauser’s list:

  1. Instead of asking how long until AI, Robin Hanson’s mini-survey asks people how far we have come (in a particular sub-area) in the last 20 years, as a fraction of the remaining distance. Responses to this question are generally fairly low: 5% is common. His respondents also tend to say that progress isn’t accelerating especially. These estimates imply that in any given sub-area of AI, human-level ability should be reached in about 200 years, which is strongly at odds with what researchers say in the other surveys. An interesting project would be to expand Robin’s survey, and try to understand the discrepancy, and which estimates we should be using. We made a guide to carrying out this project.

  2. There are many pos­si­ble em­piri­cal pro­jects which would bet­ter in­form es­ti­mates of timelines e.g. mea­sur­ing the land­scape and trends of com­pu­ta­tion (MIRI started this here, and made a pro­ject guide), an­a­lyz­ing perfor­mance of differ­ent ver­sions of soft­ware on bench­mark prob­lems to find how much hard­ware and soft­ware con­tributed to progress, de­vel­op­ing met­rics to mean­ingfully mea­sure AI progress, in­ves­ti­gat­ing the ex­tent of AI in­spira­tion from biol­ogy in the past, mea­sur­ing re­search in­puts over time (e.g. a start), and find­ing the char­ac­ter­is­tic pat­terns of progress in al­gorithms (my at­tempts here).

  3. Make a de­tailed as­sess­ment of likely timelines in com­mu­ni­ca­tion with some in­formed AI re­searchers.

  4. Gather and in­ter­pret past efforts to pre­dict tech­nol­ogy decades ahead of time. Here are a few efforts to judge past tech­nolog­i­cal pre­dic­tions: Clarke 1969, Wise 1976, Albright 2002, Mul­lins 2012, Kurzweil on his own pre­dic­tions, and other peo­ple on Kurzweil’s pre­dic­tions.

  5. Above I showed you sev­eral rough calcu­la­tions I did. A rigor­ous ver­sion of any of these would be use­ful.

  6. Did most early AI sci­en­tists re­ally think AI was right around the cor­ner, or was it just a few peo­ple? The ear­liest sur­vey available (Michie 1973) sug­gests it may have been just a few peo­ple. For those that thought AI was right around the cor­ner, how much did they think about the safety and eth­i­cal challenges? If they thought and talked about it sub­stan­tially, why was there so lit­tle pub­lished on the sub­ject? If they re­ally didn’t think much about it, what does that im­ply about how se­ri­ously AI sci­en­tists will treat the safety and eth­i­cal challenges of AI in the fu­ture? Some rele­vant sources here.

  7. Con­duct a Delphi study of likely AGI im­pacts. Par­ti­ci­pants could be AI sci­en­tists, re­searchers who work on high-as­surance soft­ware sys­tems, and AGI the­o­rists.

  8. Sign­post the fu­ture. Su­per­in­tel­li­gence ex­plores many differ­ent ways the fu­ture might play out with re­gard to su­per­in­tel­li­gence, but can­not help be­ing some­what ag­nos­tic about which par­tic­u­lar path the fu­ture will take. Come up with clear di­ag­nos­tic sig­nals that policy mak­ers can use to gauge whether things are de­vel­op­ing to­ward or away from one set of sce­nar­ios or an­other. If X does or does not hap­pen by 2030, what does that sug­gest about the path we’re on? If Y ends up tak­ing value A or B, what does that im­ply?

  9. Another sur­vey of AI sci­en­tists’ es­ti­mates on AGI timelines, take­off speed, and likely so­cial out­comes, with more re­spon­dents and a higher re­sponse rate than the best cur­rent sur­vey, which is prob­a­bly Müller & Bostrom (2014).

  10. Down­load the MIRI dataset and see if you can find any­thing in­ter­est­ing in it.
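The extrapolation behind item 1 above (Hanson’s mini-survey) is simple enough to write down. The sketch below reflects my own reading of the question: the answer is the percentage of the remaining distance covered in the last 20 years, and progress continues at a constant rate, in line with respondents saying it isn’t accelerating much. The function name and parameters are hypothetical.

```python
def years_remaining(percent_covered, period_years=20):
    """Years needed to cover the remaining distance at a constant rate.

    percent_covered: percentage of the remaining distance covered
    during the last `period_years` years (assumed interpretation).
    """
    return period_years * 100 / percent_covered


for percent in (5, 10):
    print(f"{percent}% in 20 years -> about {years_remaining(percent):.0f} years to go")
```

On this reading, an answer of 10% corresponds to roughly 200 years, and an answer of 5% to roughly 400 years. Either way, the result is far later than the median dates in the other surveys, which is the discrepancy worth investigating.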

How to proceed

This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about two paths to the development of superintelligence: AI coded by humans, and whole brain emulation. To prepare, read Artificial Intelligence and Whole Brain Emulation from Chapter 2. The discussion will go live at 6pm Pacific time next Monday 29 September. Sign up to be notified here.