Intelligence Explosion analysis draft: Why designing digital intelligence gets easier over time

Again, I invite your feedback on this snippet from an intelligence explosion analysis that Anna Salamon and I have been working on. This section is less complete than the others; missing text is indicated with brackets: [].

_____

Many predictions of human-level digital intelligence have been wrong.1 On the other hand, machines surpass human ability at new tasks with some regularity (Kurzweil 2005). For example, machines recently achieved superiority at visually identifying traffic signs at low resolution (Sermanet and LeCun 2011), diagnosing cardiovascular problems from some types of MRI scan images (Li et al. 2009), and playing Jeopardy! (Markoff 2011). Below, we consider several factors that, considered together, appear to increase the odds that we will develop digital intelligence as the century progresses.

More hardware. For at least four decades, computing power2 has increased exponentially, in accordance with Moore’s law.3 Experts disagree on how much longer Moore’s law will hold (e.g. Mack 2011; Lundstrom 2003), but if it holds for two more decades then we may have enough computing power to emulate human brains by 2029.4 Even if Moore’s law fails to hold, our hardware should become much more powerful in the coming decades.5 More hardware doesn’t by itself give us digital intelligence, but it contributes to the development of digital intelligence in several ways:

Powerful hardware may improve performance simply by allowing existing “brute force” solutions to run faster (Moravec 1976). Where such solutions do not yet exist, researchers might be incentivized to quickly develop them given abundant hardware to exploit. Cheap computing may enable much more extensive experimentation in algorithm design, tweaking parameters or using methods such as genetic algorithms. Indirectly, computing may enable the production and processing of enormous datasets to improve AI performance (Halevy et al. 2009), or result in an expansion of the information technology industry and the number of researchers in the field.6
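To give a rough sense of what two more decades of exponential growth would mean, the sketch below compounds a fixed doubling period over a span of years. The two-year doubling period is an assumption chosen for illustration (a commonly quoted form of Moore’s law), not a figure from this analysis, and the function name is hypothetical.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Multiplicative growth after `years` of steady doubling.

    The default two-year doubling period is an illustrative assumption.
    """
    return 2.0 ** (years / doubling_period_years)


# Two more decades of growth under the assumed doubling period.
print(growth_factor(20))  # 1024.0
```

Under that assumption, twenty years yields roughly a thousandfold increase; a slower doubling period shrinks the figure quickly, which is one reason the disagreement among experts matters.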

Massive datasets. The greatest leaps forward in speech recognition and translation software have come not from faster hardware or smarter hand-coded algorithms, but from access to massive datasets of human-transcribed and human-translated words (Halevy, Norvig, and Pereira 2009). [add sentence about how datasets are expected to increase massively, or have been increasing massively and trends are expected to continue] [Possibly a sentence about Watson or usefulness of data for AI]

Better algorithms. Mathematical insights can reduce the computation time of a program by many orders of magnitude without additional hardware. For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.015 TIPS. Thus, the computational efficiency of the chess algorithms increased by a factor of 100 in only six years, a rate of about 3.33 orders of magnitude per decade (Richards and Shaw 2004). [add sentence about how this sort of improvement is not uncommon, with citations]
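The per-decade rate quoted above can be checked with a short sketch. It takes the figures as stated in the text (a 100-fold gain between 1997 and 2003) and uses a hypothetical helper name; it restates the arithmetic and is not an independent measurement.

```python
import math


def orders_of_magnitude_per_decade(factor: float, years: float) -> float:
    """Convert a total gain over `years` into orders of magnitude per decade."""
    return math.log10(factor) / years * 10.0


# Figures as quoted in the text: a 100-fold gain between 1997 and 2003.
rate = orders_of_magnitude_per_decade(factor=100.0, years=2003 - 1997)
print(round(rate, 2))  # 3.33
```

Note that this extrapolates a single six-year comparison to a full decade, so it should be read as a rate, not a forecast.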

Progress in neuroscience. [neuroscientists have figured out brain algorithms X, Y, and Z that are related to intelligence.] New insights into how the brain achieves human-level intelligence can inform our attempts to build human-level intelligence with silicon (van der Velde 2010; Koene 2011).

Accelerated science. A growing First World will mean that more researchers at well-funded universities will be available to do research relevant to digital intelligence. The world’s scientific output (in publications) grew by a third from 2002 to 2007 alone, much of this driven by the rapid growth of scientific output in developing nations like China and India (Smith 2011). New tools can accelerate particular fields, just as fMRI accelerated neuroscience in the 1990s. Better collaboration tools like blogs and Google Scholar are already yielding results (Nielsen 2011). Finally, the effectiveness of scientists themselves can potentially be increased with cognitive enhancement drugs (Sandberg and Bostrom 2009) and brain-computer interfaces that allow direct neural access to large databases (Groß 2009).

Automated science. Early attempts at automated science — e.g., using data mining algorithms to make discoveries from existing data (Szalay and Gray 2006), or having a machine with no physics knowledge correctly infer natural laws from motion-tracking data (Schmidt and Lipson 2009) — were limited by the slowest part of the process: the human in the loop. Recently, the first “closed-loop” robot scientist successfully devised its own hypotheses (about yeast genomics), conducted experiments to test those hypotheses, assessed the results, and made novel scientific discoveries, all without human intervention (King et al. 2009). Current closed-loop robot scientists can only work on a narrow set of scientific problems, but future advances may allow for scalable, automated scientific discovery (Sparkes et al. 2010).

Embryo selection for better scientists. At age 8, Terence Tao scored 760 on the math SAT, one of only [2?3?] children ever to do this at such an age; he later went on to [have a lot of impact on math]. Studies of similar kids convince researchers that there is a large “aptitude” component to mathematical achievement, even at the high end.7 How rapidly would mathematics or AI progress if we could create hundreds of thousands of Terence Taos? This is a serious question because the creation of large numbers of exceptional scientists is an engineering project that we know in principle how to do. The plummeting costs of genetic sequencing [expected to go below AMOUNT per genome by SOONYEAR e.g. 2015] will soon make it feasible to compare the characteristics of an entire population of adults with those adults’ full genomes, and, thereby, to unravel the heritable components of intelligence, diligence, and other contributors to scientific achievement. To make large numbers of babies with scientific abilities near the top of the current human range8 would then require only the ability to combine known alleles onto a single genome; procedures that can do this have already been developed for mice. China, at least, appears interested in this prospect.9

It isn’t clear which of these factors will ease progress toward digital intelligence, but it seems likely that — across a broad range of scenarios — some of these inputs will do so.

____

1 For example, Simon (1965, 96) predicted that “machines will be capable, within twenty years, of doing any work a man can do.”

2 The technical measure predicted by Moore’s law is the density of components on an integrated circuit, but this is closely tied to affordable computing power.

3 For important qualifications, see Nagy et al. (2010); Mack (2011).

4 This calculation depends on the “level of emulation” expected to be necessary for successful whole brain emulation (WBE). Sandberg and Bostrom (2008) report that attendees at a workshop on WBE tended to expect that emulation at the level of the brain’s spiking neural network, perhaps including membrane states and concentrations of metabolites and neurotransmitters, would be required for successful WBE. They estimate that if Moore’s law continues, we will have the computational capacity to emulate a human brain at the level of its spiking neural network by 2019, or at the level of metabolites and neurotransmitters by 2029.

5 Quantum computing may also emerge during this period. Early worries that quantum computing may not be feasible have been overcome, but it is hard to predict whether quantum computing will contribute significantly to the development of digital intelligence because progress in quantum computing depends heavily on unpredictable insights in quantum algorithms (Rieffel and Polak 2011).

6 Shulman and Sandberg (2010).

7 [Benbow etc. on study of exceptional talent; genetics of g; genetics of conscientiousness and openness, pref. w/ any data linking conscientiousness or openness to scientific achievement. Try to frame in a way that highlights hard work type variables, so as to alienate people less.]

8 [folks with very top scientific achievement likely had lucky circumstances as well as initial gifts (so that, say, new kids with Einstein’s genome would be expected to average perhaps .8 times as exceptional). However, one could probably identify genomes better than Einstein’s, both because these technologies would let genomes be combined that had unheard of, vastly statistically unlikely amounts of luck, and because e.g. there are likely genomes out there that are substantially better than Einstein (but on folks who had worse environmental luck).]

9 [find source]

_____

All references, including the ones used above:

  • Bainbridge 2006 managing nano-bio-info-cogno innovations

  • Baum Goertzel Goertzel 2011 how long until human-level ai

  • Bostrom 2003 ethical issues in advanced artificial intelligence

  • Legg 2008 machine super intelligence

  • Caplan 2008 the totalitarian threat

  • Sandberg & Bostrom 2011 machine intelligence survey

  • Chalmers 2010 singularity philosophical analysis

  • Turing 1950 machine intelligence

  • Good 1965 speculations concerning...

  • Von Neumann 1966 theory of self-reproducing automata

  • Solomonoff 1985 the time scale of artificial intelligence

  • Vinge 1993 coming technological singularity

  • Yudkowsky 2001 creating friendly ai

  • Yudkowsky 2008a negative and positive factor in global risk

  • Yudkowsky 2008b cognitive biases potentially affecting

  • Russell Norvig 2010 artificial intelligence a modern approach 3e

  • Nordman 2007 If and then: a critique of speculative nanoethics

  • Moore and Healy the trouble with overconfidence

  • Tversky Kahneman 2002 extensional versus intuitive reasoning, the conjunction fallacy

  • Nickerson 1998 Confirmation Bias; A Ubiquitous Phenomenon in Many Guises

  • Dreyfus 1972 what computers can’t do

  • Rhodes 1995 making of the atomic bomb

  • Arrhenius 1896 On the Influence of Carbonic Acid in the Air Upon the Temperature

  • Crawford 1997 Arrhenius’ 1896 model of the greenhouse effect in context

  • Rasmussen 1975 WASH-1400 report

  • McGrayne 2011 theory that would not die

  • Lundstrom 2003 Enhanced: Moore’s law forever?

  • Tversky and Kahneman 1974 Judgment under uncertainty: Heuristics and biases

  • Horgan 1997 end of science

  • Sutton and Barto 1998 reinforcement learning

  • Hutter 2004 universal ai

  • Schmidhuber 2007 godel machines

  • Dewey 2011 learning what to value

  • Simon 1965 The Shape of Automation for Men and Management

  • Marcus 2008 kluge

  • Sandberg Bostrom 2008 whole brain emulation

  • Kurzweil 2005 singularity is near

  • Sermanet LeCun 2011 traffic sign recognition with multi-scale convolutional networks

  • Li et al. 2009 optimizing a medical image analysis system using

  • Markoff 2011 watson trivial it’s not

  • Smith 2011 Knowledge networks and nations

  • Sandberg Bostrom 2009 cognitive enhancement regulatory issues

  • Groß 2009 Blessing or Curse? Neurocognitive Enhancement by “Brain Engineering”

  • Williams 2011 prediction markets theory and applications

  • Nielsen 2011 reinventing discovery

  • Tetlock 2005 expert judgment

  • Green & Armstrong 2007 The Ombudsman: Value of Expertise for Forecasting

  • Weinberg et al. 2010 philosophers expert intuiters

  • Szalay and Gray 2006 science in an exponential world

  • Schmidt Lipson 2009 distilling free-form natural laws from experimental data

  • King et al. 2009 the automation of science

  • Sparkes et al. 2010 Towards Robot Scientists for autonomous scientific discovery

  • Stanovich 2010 rationality and the reflective mind

  • Lilienfeld, Ammirati, and Landfield 2009 giving debiasing away

  • Lipman 1983 Thinking Skills Fostered by Philosophy for Children

  • Fong et al 1986 The effects of statistical training on thinking about everyday problems

  • Schoemaker 1979 The role of statistical knowledge in gambling decisions

  • Larrick 2004 debiasing

  • Gordon 2007 reasoning about the future of nanotechnology

  • Landeta 2006 Current validity of the delphi method in social sciences

  • Maddison 2001 the world economy a millennial perspective

  • Niparko 2009 cochlear implants principles and practices

  • Bostrom 2002 existential risks

  • Joyce 2007 moral anti-realism stanford encyclopedia of philosophy

  • Portmore 2011 commonsense consequentialism

  • Martin 1971 brief proposal on immortality

  • Bostrom Cirkovic 2008 global catastrophic risks

  • National Academy of Sciences 2010 persistent forecasting of disruptive technologies

  • Donohoe and Needham 2009 Moving best practice forward, Delphi characteristics

  • Gordon 1994 the delphi method

  • Green, Armstrong, and Graefe 2007 Methods to Elicit Forecasts from Groups

  • Woudenberg 1991 an evaluation of delphi

  • Armstrong 2006 Findings from evidence-based forecasting

  • Armstrong 1985 Long-Range Forecasting: From Crystal Ball to Computer, 2nd edition

  • Anderson and Anderson-Parente 2011 A case study of long-term Delphi accuracy

  • Bixby 2002 Solving real-world linear programs: A decade and more of progress

  • Fox 2011 the limits of intelligence

  • Friedman 1953 The Methodology of Positive Economics

  • Schneider 2010 homo economicus, or more like Homer Simpsons

  • Cartwright 2011 behavioral economics

  • Bacon and Van Dam 2010 recent progress in quantum algorithms

  • Rieffel Polak 2011 quantum computing a gentle introduction

  • Mack 2011 fifty years of moore’s law

  • Nagy et al. 2010 testing laws of technological progress

  • Shulman Sandberg 2010 implications of a software-limited singularity

  • Moravec 1976 The Role of raw power in intelligence

  • Alberth 2008 forecasting technology costs via the experience curve

  • Omohundro 2007 the nature of self-improving AI

  • Kurzban 2011 why everyone (else) is a hypocrite: evolution and the modular mind

  • Richards Shaw 2004 chips architectures and algorithms

  • Yudkowsky 2010 timeless decision theory

  • De Blanc Ontological Crises in Artificial Agents’ Value Systems

  • Halevy, Norvig, and Pereira 2009 the unreasonable effectiveness of data

  • Ramachandran 2011 the tell-tale brain

  • van der Velde 2010 Where Artificial Intelligence and Neuroscience Meet

  • Koene 2011 AGI and neuroscience: Open sourcing the brain (in AGI-11 proceedings)

  • Lichtenstein, Fischhoff, and Phillips 1982 calibration of probabilities the state of the art to 1980

  • Griffin and Tversky 1992 The weighing of evidence and the determinants of confidence

  • Yates, Lee, Sieck, Choi, Price 2002 Probability judgment across cultures

  • Murphy and Winkler 1984 probability forecasting in meteorology

  • Grove and Meehl 1996 Comparative Efficiency of Informal...

  • Grove et al. 2000 Clinical versus mechanical prediction: A meta-analysis

  • Kandel et al. 2000 principles of neural science, 4th edition

  • Shulman 2010 Omohundro’s “Basic AI Drives” and Catastrophic Risks

  • Friedman 1993 Problems of Coordination in Economic Activity

  • Cooke 1991 experts on uncertainty

  • Yampolskiy forthcoming Leakproofing the Singularity

  • Lampson 1973 a note on the confinement problem

  • Schaeffer 1997 one jump ahead

  • Dolan and Sharot 2011 Neuroscience of preference and choice