Contra Yudkowsky on Doom from Foom #2
This is a follow-up to, and partial rewrite of, an earlier part #1 post critiquing EY’s specific argument for doom from AI go foom, and a partial clarifying response to DaemonicSigil’s reply on efficiency.
AI go Foom?
By Foom I refer to the specific idea/model (as popularized by EY, MIRI, etc) that near future AGI will undergo a rapid intelligence explosion (hard takeoff) to become orders of magnitude more intelligent (ex from single human capability to human civilization capability) in a matter of only days or hours, and then dismantle humanity (figuratively as in disempower, or literally as in “use your atoms for something else”). Variants of this idea still seem to be important drivers of AI risk arguments today: Rob Bensinger recently said “STEM-capable artificial general intelligence (AGI) is likely to vastly outperform human intelligence immediately (or very quickly).”
I believe the probability of these scenarios is small, and the current arguments lack technical engineering prowess concerning the computational physics of intelligence and the practical engineering constraints derived from it. Nonetheless these hypothetical scenarios where the AI system fooms suddenly (perhaps during training) appear to be the most obviously dangerous, as they seemingly lead to a “one critical try” situation where humanity can’t learn and adapt from alignment failures.
During the Manhattan Project some physicists became concerned about the potential of a nuke detonation igniting the atmosphere. Even a small non-epsilon possibility of destroying the entire world should be taken very seriously. So they did some detailed technical analysis which ultimately output a probability below their epsilon, allowing them to continue on their merry task of creating weapons of mass destruction.
Today of course there is another option, which you could consider the much more detailed version of analysis: we can (and do) test nuclear weapons in simulations, and a simulation could be used to assess atmosphere ignition risks. This simulation analogy carries over directly as my mainline hope for safely aligning new AGI. But as it currently doesn’t seem the world is coordinating to standardize on safe simulation testing, we are left with rough foom arguments and their analysis.
In the ‘ideal’ scenario, the doom foomers (EY/MIRI) would present a detailed technical proposal that could be risk evaluated. They of course have not provided that, and indeed it would seem to be an implausible ask. Even if they were claiming to have the technical knowledge on how to produce a fooming AGI, providing that analysis itself could cause someone to create said AGI and thereby destroy the world![1] In the historical precedent of the Manhattan Project, the detailed safety analysis only finally arrived during the first massive project that succeeded at creating the technology to destroy the world.
So we are left with indirect, often philosophical arguments, which I find unsatisfying. To the extent that EY/MIRI has produced some technical work related to AGI[2], I find it honestly to be more philosophical than technical, and in the latter capacity more amateurish than expert.
I have spent a good chunk of my life studying the AGI problem as an engineer (neuroscience, deep learning, hardware, GPU programming, etc), and reached the conclusion that fast FOOM is possible but unlikely. Proving that of course is very difficult, so I instead gather much of the evidence that led me to that conclusion. However I can’t reveal all of the evidence, as the process is rather indistinguishable from searching for the design of AGI itself.[3]
The valid technical arguments for/against the Foom mostly boil down to various efficiency considerations.
Quick background: pareto optimality/efficiency
Engineering is complex and full of fundamental practical tradeoffs:
larger automobiles are safer via higher mass, but have lower fuel economy
larger wings produce more lift but also more drag at higher speeds
highly parallel circuits can do more total work per clock and are more energy efficient but the corresponding parallel algorithms are more complex to design/code, require somewhat more work to accomplish a task, delay/latency becomes more problematic for larger circuits, etc
adiabatic and varying degrees of reversible circuit designs are possible but they are slower, larger, more complex, less noise tolerant, and still face largely unresolved design challenges with practical clock synchronization, etc
quantum computers are possible but are enormously challenging to scale to useful size, are incredibly sensitive to noise/heat, and don’t necessarily provide useful speedup for most problems of interest
Pareto optimality is when a design is on a pareto surface such that no improvement in any key variable/dimension of interest is possible without sacrifice in some other important variable/dimension.
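As a minimal sketch of the definition above, the following filters a set of candidate designs down to those that no other design dominates. The `Design` type, its two axes, and the sample values are hypothetical, chosen only for illustration; real engineering tradeoffs have many more dimensions.

```python
from typing import NamedTuple


class Design(NamedTuple):
    """A hypothetical design scored on two axes (higher is better on both)."""
    name: str
    speed: float
    efficiency: float


def dominates(a: Design, b: Design) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    at_least_as_good = a.speed >= b.speed and a.efficiency >= b.efficiency
    strictly_better = a.speed > b.speed or a.efficiency > b.efficiency
    return at_least_as_good and strictly_better


def pareto_frontier(designs: list[Design]) -> list[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Illustrative values only (assumed, not from any real hardware).
designs = [
    Design("A", speed=1.0, efficiency=9.0),
    Design("B", speed=5.0, efficiency=5.0),
    Design("C", speed=9.0, efficiency=1.0),
    Design("D", speed=4.0, efficiency=4.0),  # B beats D on both axes
]

frontier_names = [d.name for d in pareto_frontier(designs)]
print(frontier_names)
```

Design D can be improved on both axes at once (by switching to B), so it is off the frontier; A, B, and C can each only be improved on one axis by giving something up on the other.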
In a nascent engineering field solutions start far from the pareto frontier and evolve towards it, so in the early days strict true efficiency improvements are possible: improvements in variables of interest without any or much sacrifice in other variables of interest. But as a field matures these low hanging fruit are tapped out, and solutions evolve towards the pareto frontier. Moore’s Law is the canonical example of starting many OOM from the pareto frontier and steadily relentlessly climbing towards it, year after year.
The history of video game evolution provides an interesting case history in hardware/software coevolution around pareto frontiers. Every few years the relentless march of Moore’s law produced a new hardware platform with several times greater flops/bandwidth/RAM/etc which could be used to just run last generation algorithms faster, but was often best used to run new algorithms. In short, each new hardware generation partially reset the software pareto frontier. The key lesson here is that the software lag was/is short, and it does not take humans decades to explore the space of what is possible with each new hardware generation.
The potential of the old Atari hardware was fully maxed out long ago: no engineer, no matter how clever, is at all likely to find a way to run Unreal Engine 5 on even a GeForce 2, let alone an Atari 2600.
EY/MIRI also rely on the claim that human brains are “riddled with cognitive biases” that AGI will not have. I am skeptical of the strong cognitive biases claims and have argued that they stem from a flawed and now discredited theory of the brain. Regardless it is rather obvious that these so called cognitive biases did not prevent programmers like John Carmack from rather quickly reaching the software pareto frontier for each new hardware generation. Moreover, to the extent cognitive biases are real, the AGI we actually have simply reproduces them, because we train AI on human thoughts, distilling human minds: humans in produces humans out.[4] I predicted this in advance and the evidence continues to pile up for my position.
Efficiency drives intelligence
Intelligence for our purposes (the kind of intelligence AI doomers worry about) is dangerous because it provides capacity to optimize the world. We could specifically quantify intelligence power as the mutual information between an agent’s potential current actions and future observable states of the world; let’s denote that \(\mathcal{I}\).
High intelligence power requires high computational power, because high mutual information between potential current actions and future observable states only comes from modeling/predicting the future consequences of current actions. This in turn requires approx bayesian inference over observations to learn a powerful efficient model of the world, ie a computationally expensive learning/training process. This is always necessarily some efficient scalable (and thus highly parallel) approximation to Solomonoff induction (and in practice, the useful approximations always end up looking like neural networks).
Foom thus requires an AGI to rapidly acquire many OOM increase in some combination of compute resources (flops, watts), software efficiency in intelligence per unit compute (\(\mathcal{I}\)/flop), or hardware efficiency (flops/J or flops/$), as the total intelligence will be limited by something like:
\[
\mathcal{I} = \min\left( \frac{\mathcal{I}}{\text{flop}} \cdot \frac{\text{flop}}{\text{J}} \cdot \text{J},\ \ \frac{\mathcal{I}}{\text{flop}} \cdot \frac{\text{flop}}{\$} \cdot \$ \right)
\]
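The bound above is just the smaller of an energy-limited term and a budget-limited term, both scaled by the same software efficiency. A minimal sketch follows; the function name and every numeric value are placeholders I have assumed for illustration, not estimates from this post.

```python
def intelligence_bound(
    intelligence_per_flop: float,  # software efficiency
    flop_per_joule: float,         # hardware energy efficiency
    joules: float,                 # energy available
    flop_per_dollar: float,        # hardware cost efficiency
    dollars: float,                # money available
) -> float:
    """Total intelligence is capped by whichever resource runs out first."""
    energy_limited = intelligence_per_flop * flop_per_joule * joules
    budget_limited = intelligence_per_flop * flop_per_dollar * dollars
    return min(energy_limited, budget_limited)


# Placeholder inputs (assumed values, arbitrary units).
baseline = intelligence_bound(1.0, 1e12, 1e9, 1e17, 1e3)

# A 10x software efficiency gain scales both terms, so the bound rises 10x.
better_software = intelligence_bound(10.0, 1e12, 1e9, 1e17, 1e3)

# A 10x energy efficiency gain does nothing here, because money is the binding limit.
better_energy = intelligence_bound(1.0, 1e13, 1e9, 1e17, 1e3)

print(baseline, better_software, better_energy)
```

The sketch makes one structural point visible: software efficiency multiplies both branches, while a hardware gain only helps if it relaxes whichever constraint is currently binding.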
Most of the variance around feasibility of very rapid OOM improvement seems to be in software efficiency, but let’s discuss hardware first.
EY on brain efficiency and the scope for improvement
EY believes the brain is inefficient by about 6 OOM:
Which brings me to the second line of very obvious-seeming reasoning that converges upon the same conclusion—that it is in principle possible to build an AGI much more computationally efficient than a human brain—namely that biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.
ATP synthase may be close to 100% thermodynamically efficient, but ATP synthase is literally over 1.5 billion years old and a core bottleneck on all biological metabolism. Brains have to pump thousands of ions in and out of each stretch of axon and dendrite, in order to restore their ability to fire another fast neural spike. The result is that the brain’s computation is something like half a million times less efficient than the thermodynamic limit for its temperature—so around two millionths as efficient as ATP synthase. And neurons are a hell of a lot older than the biological software for general intelligence!
The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences.
I see two main ways to interpret this statement: EY could be saying 1.) that the brain is ~6 OOM from a pareto optimality frontier, or 2.) that the brain is ~6 OOM from the conservative thermodynamic limit for hypothetical fully reversible computers.
The last paragraph in particular suggests EY believes something more like 1.) - that it is possible to build something as intelligent as the brain that uses 6 OOM less energy, without any ridiculous tradeoffs in size, speed, etc. I believe that is most likely what he meant, and thus he is mistaken.
If the brain performs perhaps 1e14 analog synaptic spike ops/s in 10W, improving that by 6 OOM works out to roughly 1eV per synaptic spike op[5], which is below the practical Landauer bound for reliable irreversible computation given what it does. A hypothetical fully reversible computer could likely achieve that nominal energy efficiency, but all extant research indicates it would necessarily make various enormous tradeoffs somewhere: size (ex optical computers have fully reversible interconnect but are enormous), error resilience/correction, exotic/rare expensive materials, etc. The requirement for fully reversible logic also induces much harder to quantify but probably very limiting constraints on the types of computations you can even do (quantum computers are reversible computers that additionally exploit coherence, and quantum computation does not provide large useful speedup for all useful algorithms).
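The energy-per-op arithmetic above can be checked in a few lines. This is a sketch using only the figures quoted in the text (10 W, 1e14 ops/s, a 6 OOM improvement); the variable names are mine, and the practical bound the text compares against is not derived here.

```python
# Joules per electronvolt (exact, by the SI definition of the elementary charge).
EV_IN_JOULES = 1.602176634e-19

brain_power_watts = 10.0        # from the text
synaptic_ops_per_second = 1e14  # from the text
improvement_factor = 1e6        # the claimed 6 OOM

# Energy the brain currently spends per synaptic op.
joules_per_op = brain_power_watts / synaptic_ops_per_second

# Energy per op after a hypothetical 6 OOM improvement.
improved_joules_per_op = joules_per_op / improvement_factor
improved_ev_per_op = improved_joules_per_op / EV_IN_JOULES

print(f"current:  {joules_per_op:.1e} J/op")
print(f"improved: {improved_joules_per_op:.1e} J/op = {improved_ev_per_op:.2f} eV/op")
```

The result is about 1e-19 J, or roughly 0.6 eV, per synaptic op, which rounds to the 1eV figure used above.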
So either EY believes 1.) the brain is just very far from the pareto efficiency frontier—just not very well organized given its design constraints—in which case he is uninformed, or 2.) that the brain is near some pareto efficiency frontier but very far from the thermodynamic limits for theoretical reversible computers. If interpretation 2 is correct then he essentially agrees with me which undermines the doom argument regardless.
The fact that the brain is OOM from the conservative theoretical limits for thermodynamic efficiency does not imply it is overall inefficient as computational hardware for intelligence, at least in how I or many would use the term, any more than your car being far from the hard limit for aerodynamic efficiency or the speed of light implies it is overall inefficient as a transportation vehicle.
Just preceding the 6 OOM claim, EY provides a different naive technical argument as to why he is confident that it is possible to create a mind more powerful than the human brain using much less compute:
Since modern chips are massively serially faster than the neurons in a brain, and the direction of conversion is asymmetrical, we should expect that there are tasks which are immensely expensive to perform in a massively parallel neural setup, which are much cheaper to do with serial processing steps, and the reverse is not symmetrically true.
A sufficiently adept builder can build general intelligence more cheaply in total operations per second, if they’re allowed to line up a billion operations one after another per second, versus lining up only 100 operations one after another. I don’t bother to qualify this with “very probably” or “almost certainly”; it is the sort of proposition that a clear thinker should simply accept as obvious and move on.
A modern GPU or large CPU contains almost 100 billion transistors (and the Cerebras wafer chip contains trillions). A pure serial processor is, by definition, limited to executing only a single instruction per clock cycle, and thus unnecessarily wastes the vast potential of a sea of circuitry. The pure parallel processor instead can execute billions of operations per clock cycle.
Serial programming is a convenient myth, a facade used to ease programming. Physics in practice only permits efficient parallel computation. A huge 100 billion transistor circuit can, in one cycle, simulate serial computation to run one op of a javascript program, or it could perform tens or perhaps hundreds of thousands of low precision flops in tensorcores, or billions of operations in a neuromorphic configuration.
There is a reason Nvidia eclipsed Intel in stock price, and as I predicted long ago Moore’s law obviously becomes increasingly parallel over time.
The DL systems which are actually leading to AGI, as I predicted (and EY did not), are in fact all GPU simulations of brain-inspired low depth highly parallel circuits. Transformers are not even recurrent, and in that sense are shallower than the brain.
Rapid hardware leverage probably requires nanotech
The lead time for new GPUs is measured in months or years, not days or weeks, and high end CMOS tech is approaching a pareto frontier regardless.
Many OOM increase in compute also mostly rules out scaling up through GPU rental or hacking operations, because the initial AGI training itself will likely already require a GPT4 level or larger supercomputer[6], and you can’t hack/rent your way out to a many OOM larger supercomputer because it probably doesn’t exist, and systems of this scale are extensively monitored regardless. The difference in compute between my home GPU rig and the world’s largest supercomputers is not quite 4 OOM.
So EY puts much hope in nanotech, which is a completely forlorn hope, because nanotech is probably mostly a pipe dream, biological cells are already pareto optimal nanobots, and brains are already reasonably pareto-efficient in terms of the kind of intelligence you can build from practical nanobots (ie bio cells). Don’t substitute strawmen for any of these arguments: this doesn’t mean brains are near the conservative thermodynamic energy efficiency limits, which only apply to future exotic reversible computers; this doesn’t mean that the human brain is perfectly optimized for intelligence; etc.
Instead it simply means that the nanotech path is very unlikely to result in the required many OOM in a short amount of time. The types of nanotech that are most viable are very similar to biology so you just end up with something that looks like a million vat-brains in a supercomputer, but the kind of brains you can build out of that toolkit sacrifice speed for energy efficiency and so would take years/decades to learn/train—useless for foom.
Bounding hardware foom (without software improvement)
The brain is efficient, so absent many OOM from software (bounded in the next section), the requisite many OOM must come from hardware. As nanotech is infeasible, and foundry ramp up takes much longer than the weeks/months of foom, any many OOM rapid ramp up from hardware must come from rapid acquisition/control of current hardware (ie GPUs).
Foom results from recursive self improvement which requires that the AGI design a better initial architectural prior and or learning algorithms and then run a new training cycle. So we can bound a step of RSI by bounding compute requirements for retraining cycles.[7]
Nvidia dominates the AI hardware landscape and produces only a few 100k high end GPUs suitable for AI per year[8], and they depreciate in a few years, so the entire pool of high end GPUs is less than 1M. If the brain is reasonably efficient then training just human-level AGI probably requires 1e24 to 1e26 flops[9]. Even if an AGI gained control of all 1M GPUs somehow, this would only produce about 1e26 flops per day, or about 4e24 flops per hour, which puts a bound on the duration of the first cycle of recursion. To reach the next level of capability, it will then need to expend 10x compute (or whatever your assumed growth factor is—the gap between GPTNs seems to be 100x).
So if human-level requires 1e26 flops training, using all the world’s compute doesn’t quite achieve 1 level above human in a week[10]. But if human-level requires only 1e24 flops, then perhaps 3 levels above human can be achieved in a week.
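The pool arithmetic above can be reproduced with a short sketch. It uses the figures from the text (about 1M GPUs at about 1e15 flop/s each, 10x compute per capability level). The `levels_above_human` helper is my own simplification: it assumes reaching level N costs `human_level_flops * growth_factor**N` and ignores the cumulative cost of earlier cycles.

```python
import math

FLOPS_PER_GPU = 1e15  # dense flop/s per high end GPU (from the text)
GPU_POOL_SIZE = 1e6   # upper bound on the global high end GPU pool (from the text)

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY

pool_flops_per_second = FLOPS_PER_GPU * GPU_POOL_SIZE
flops_per_hour = pool_flops_per_second * SECONDS_PER_HOUR
flops_per_day = pool_flops_per_second * SECONDS_PER_DAY
flops_per_week = pool_flops_per_second * SECONDS_PER_WEEK


def levels_above_human(budget_flops: float, human_level_flops: float,
                       growth_factor: float = 10.0) -> float:
    """Fractional number of capability levels a compute budget can buy (simplified)."""
    return math.log(budget_flops / human_level_flops, growth_factor)


levels_if_high = levels_above_human(flops_per_week, 1e26)
levels_if_low = levels_above_human(flops_per_week, 1e24)

print(f"per hour: {flops_per_hour:.1e}, per day: {flops_per_day:.1e}, "
      f"per week: {flops_per_week:.1e}")
print(f"levels in a week if human-level costs 1e26: {levels_if_high:.2f}")
print(f"levels in a week if human-level costs 1e24: {levels_if_low:.2f}")
```

Under these assumptions a week of the entire pool buys a bit under one level at the 1e26 end and a bit under three levels at the 1e24 end. Changing `growth_factor` to 100 (the rough gap between GPT generations) halves both numbers.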
I put a low probability on the specific required combination of:
an initial near human-level AGI having the funds/hacks to acquire a very large fraction of GPU compute for even a day (I estimate rental liquidity at 10%, and the cost of renting all flagship GPUs is over $1M/hour, or over $20M/day)
a huge design advantage over what other teams are already doing
distributed training across all of earth not having major disadvantages compared to centralized training
human-level AGI training at the lower end (1e24 flops or less)
Without the third item in particular (highly efficient distributed training), the max useful compute would be 10x to 100x less, and thus the minimal recursion cycle time would be 10x to 100x longer.
Obviously as Moore’s law continues, GPU power multiplies each year, eventually noticeably shifting these estimates. However Moore’s law is already approaching the limits of miniaturization, and regardless every year of hardware increase without foom also raises the bar for foom by increasing the (AI assisted) capabilities of our human/machine civilization.
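To make the cycle-time scaling concrete, here is a rough sketch of how long one retraining cycle takes as the usable share of the global pool shrinks. The pool throughput is the roughly 1e21 flop/s implied by the text (about 1M GPUs at about 1e15 flop/s); the `cycle_hours` helper and the specific fractions are assumptions for illustration.

```python
POOL_FLOPS_PER_SECOND = 1e21  # ~1M GPUs at ~1e15 flop/s each (from the text)
SECONDS_PER_HOUR = 3600.0


def cycle_hours(training_flops: float, usable_fraction: float) -> float:
    """Hours for one training cycle when only a fraction of the pool is usable."""
    usable_flops_per_second = POOL_FLOPS_PER_SECOND * usable_fraction
    return training_flops / usable_flops_per_second / SECONDS_PER_HOUR


for training_flops in (1e24, 1e26):
    for usable_fraction in (1.0, 0.1, 0.01):
        hours = cycle_hours(training_flops, usable_fraction)
        print(f"{training_flops:.0e} flops at {usable_fraction:>4.0%} of pool: "
              f"{hours:10.1f} hours")
```

At the 1e26 end, a cycle that takes a bit over a day on the full pool stretches to months if inefficient distributed training leaves only 1% of the pool usable.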
So the core uncertainty boils down to how much compute does it require to surpass humans?
The human brain has perhaps 10TB of size/capacity, around 1e14 sparse synaptic ops/s throughput (equivalent to perhaps 1e16 dense flops/s), and a 1e9s (32 year) training cycle, so roughly 1e23 to 1e25 flops and 1e23 to 1e25 memops. A high end GPU has 100GB capacity, 1e15 dense flop/s but only ~1e12 memops/s. Thus I estimate the equivalent ANN training compute at 1e24 to 1e26 flops equivalent on GPUs (variance depending mostly on the importance of mem bandwidth and alu/mem ratio interacting with software/arch designs). The largest general ANNs trained so far like GPT4 have used perhaps (1e25 flops, 1e22 memops) and achieve proto-AGI.
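The lifetime-compute estimate above is a product of throughput and training time, and the GPU comparison follows from dividing by per-GPU throughput. A sketch using only the figures in the text; the variable names are mine, and treating each synaptic op as one memop is the simplification the text itself makes.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Brain figures from the text.
sparse_synaptic_ops_per_second = 1e14
dense_flop_equivalent_per_second = 1e16
training_seconds = 1e9

# GPU figures from the text.
gpu_flops_per_second = 1e15
gpu_memops_per_second = 1e12

training_years = training_seconds / SECONDS_PER_YEAR
lifetime_ops_low = sparse_synaptic_ops_per_second * training_seconds
lifetime_ops_high = dense_flop_equivalent_per_second * training_seconds

# The GPU's arithmetic throughput outruns its memory throughput by this ratio.
gpu_alu_to_mem_ratio = gpu_flops_per_second / gpu_memops_per_second

# GPU-seconds needed if the workload is flop-bound vs memop-bound.
gpu_seconds_if_flop_bound = lifetime_ops_high / gpu_flops_per_second
gpu_seconds_if_memop_bound = lifetime_ops_low / gpu_memops_per_second

print(f"training time: {training_years:.1f} years")
print(f"lifetime compute: {lifetime_ops_low:.0e} to {lifetime_ops_high:.0e} ops")
print(f"GPU alu/mem ratio: {gpu_alu_to_mem_ratio:.0f}")
print(f"GPU-seconds, flop-bound:  {gpu_seconds_if_flop_bound:.0e}")
print(f"GPU-seconds, memop-bound: {gpu_seconds_if_memop_bound:.0e}")
```

The 1000x gap between GPU flops and memops is why the estimate depends so much on how memory-bound the workload is: even the low-end 1e23 memop figure costs more GPU-seconds than the high-end 1e25 flop figure.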
Obviously architecture/algorithms determine whether a 1e25 flops computation results in an AGI or a weather simulation or noise, but the brain lifetime net training compute sets expectations for successful training runs.
Many OOM sudden increase in software efficiency unlikely
The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences.
Evolution has run vast experiments for hundreds of millions of years and extensively explored the design space of computational circuits for intelligence. It found similar general solutions again and again across multiple distant lineages. Human researchers have now copied/replicated much of that exploratory search at higher speed, and (re)discovered the same set of universal solutions: approximate bayesian inference using neural networks.
Thus the prior that there is some dramatically better approach sufficient to suddenly provide many OOM improvement is now low. Giant inscrutable matrices is probably about as good as it gets, as I predicted a while ago.
A many OOM sudden increase in software efficiency requires a rare isolated incredibly difficult to find region in design space containing radically different designs that are still fully general but also many OOM more efficient on current hardware—hardware increasingly optimized for the current paradigm.
Intelligence requires/consumes compute in predictable ways, and progress is largely smooth.
Every year that passes without foom is further evidence against its possibility, as we advance ever closer to the vast expanse of the pareto frontier. Every year of further smooth progress exploring the algorithmic landscape we gather more evidence that the big many-OOM better design is all that much harder to find, all while the requisite bar to vastly outcompete our increasingly capable human/AI cyborg civilization rises.
On Biology
biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.
Biology has been doing neural networks for half a billion years, so EY’s primary argument for the FOOM here is the claim that biology/evolution is just not that efficient.
Biology is quite efficient, for any reasonable meaning of efficient. Here are a few interesting examples (the first two cherrypicked in the adversarial sense that they have been brought up before here as evidence of inefficiency of evolution):
The best (impractical) research solar cells reach 47.6% efficiency[11], compared to 46.8% for this (also probably impractical) Chlorella biomass experiment.
The inverted retina, often claimed as evidence of evolutionary optimization failure, is in fact superior or at least equally effective to the everted retina[12], and is limited by the physics of light regardless.
Biological cells are highly efficient as physical nanobots, operating near thermodynamic limits for most key operations such as replication
Lest anyone has forgotten, the brain is generally efficient
About AlphaZero
In AGI Ruin, EY uses AlphaZero as a more specific example of the potential for large software efficiency advantage of AGI:
Alpha Zero blew past all accumulated human knowledge about Go after a day or so of self-play, with no reliance on human playbooks or sample games.
AlphaZero’s Go performance predictably eclipsed humans and then its predecessor AlphaGo Zero once it had trained (using 5000 TPUs) on far more Go games than any expert human plays in a lifetime. Like other large-scale DL systems, it shows zero advantage over the human brain in terms of data efficiency in virtual/real experience consumed, and achieves higher capability by training on vastly more data.
Go is extremely simple: the entire world of Go can be precisely predicted by trivial tiny low depth circuits/programs. This means that the Go predictive capability of a NN model as a function of NN size completely flatlines at an extremely small size. A massive NN like the brain’s cortex is mostly wasted for Go, with zero advantage vs the tiny NN AlphaZero uses for predicting the tiny simple world of Go.
Games like Go or chess are far too small for a vast NN like the brain, so the vast bulk of its great computational power is wasted. The ideal NN for these simple worlds is very small and very fast—like AlphaZero. So for these domains the ANN system has a large net efficiency advantage over the brain.
The real world is essentially infinitely vaster and more complex than Go, so the model scaling has no limit in sight: ever larger NNs result in ever more capable predictive models, bounded only by the data/experience/time/compute required to train them effectively. The brain’s massive size is ideally suited for modeling the enormous complexity of the real world. So when we apply the same general NN techniques to the real world (via LLMs or similar), we see that even when massively scaled up on enormous supercomputers to train with compute roughly similar to that used by the brain during a lifetime[13], on orders of magnitude more data, the resulting models are only able to capture some of human intelligence; they are not yet full AGI. Obviously AGI is close, but will require a bit more compute and/or efficiency.
There are and will continue to be many specialist subsystems NNs (alphacode, alphafold, stable diffusion, etc) trained on specific subdomains that greatly exceed human performance through using specialized smaller models trained on far more data, but general performance in the real world is the key domain for which huge NNs like the brain are uniquely suited.
This has nothing to do with the brain’s architectural prior, it’s just a relation on how compute is invested in size vs speed and the resulting scaling functions with respect to world complexity.
Seeking true Foom
In some sense the Foom already occurred—it was us. But it wasn’t the result of any new feature in the brain—our brains are just standard primate brains, scaled up a bit[14] and trained for longer.
Human intelligence is the result of a complex one time meta-systems transition: brains networking together and organizing into families, tribes, nations, and civilizations through language. Animal brains learn for a lifetime then die without transmission; humans are Turing universal generalists with cultural programming. Humans are in fact not much smarter than apes sans culture/knowledge. That transition only happens once: there are not ever more and more levels of universality or linguistic programmability. AGI does not FOOM again in the same way.
As I and others predicted[15], AGI will be (and already is) made from the same stuff as our minds, literally trained on externalized human thoughts, distilling human mindware via brain-inspired neural networks trained with massive compute on one to many lifetimes of internet data. The post training/education capability of such systems is a roughly predictable function of net training compute.
AGI systems are fundamentally different from humans in a few key respects:
By using enormous resources, they can operate much faster than us (and indeed transformer based LLMs already are trained with many thousand-fold time acceleration), however this requires staying in the ultra-parallel low circuit depth regime, constraining AGI to brain-like designs.
They are potentially immortal and can continue to grow and absorb knowledge indefinitely
These two main differences will lead to enormous transformation, but probably not the foom Yudkowsky has expected for 20 some years, which largely seems to be a continuation of his rather miscalibrated model of nanotech:
bribes/persuades some human who has no idea they’re dealing with an AGI to mix proteins in a beaker, which then form a first-stage nanofactory which can build the actual nanomachinery.[16]
The analysis in the hardware section leaves open the possibility for some forms of foom, especially if we see signs:
replicating brain performance requires the lower end of compute estimates
there are large breakthroughs in decentralized training
large increases in global GPU/accelerator liquidity
increase in the pace of Moore’s Law (rather than the expected decrease)
Part of my intent in writing this post is a call for better arguments/analysis. A highly detailed plan for replicating brain performance with less than 1e24 training flops is obviously not something to research/discuss in public, but surely there are better public arguments/analysis for foom that don’t noticeably make it more likely.
[1] From what I can tell, EY/MIRI’s position has consistently been that aligning an AGI is much more difficult than creating it, and they are consistent in claiming that nobody has figured out alignment yet, including them.
[2] The most relevant work that immediately came to mind is LOGI from 2007.
[3] I have found potential for perhaps a few OOM improvement here or there, but it looks like tapping much of that is necessary just to reach powerful AGI at all.
[4]
[5] A synaptic op is minimally a memory read, a low/medium precision MAC, and a memory write.
[6] The computation required to train AGI is more of a software efficiency question, discussed in the next section.
[7] Weights are downstream dependent on architectural prior and/or learning algorithms.
[8] Inferred from annual revenue of $27B, a price tag of about $27k per flagship GPU, and revenue product split.
[9] GPT3 used 3e23 flops, and GPT4 may have used 1e25 flops. I estimate the equivalent brain lifetime training flops to be around 1e24 to 1e26 flops, with the range uncertainty vaguely indicating the impact of software efficiency. Naturally flops is not the only relevant computational metric.
[10] 1e15 flops/gpu * 1e6 gpus * 6e5s = 6e26 flops
[11] Impractical in the sense that they aren’t optimal for manufacturing or other reasons. A more typical mass manufactured solar cell has ~20% conversion efficiency, similar to chlorophyll conversion at ~28% to ATP/NADPH, before conversion to glucose for long term chemical storage.
[12]
[13] The human brain uses vaguely 1e25 flops-equivalent for net training compute, comparable to estimates for GPT4 net training compute.
[14]
[15] Moravec and Hanson both predicted in different forms, well in advance, smooth progress to brain-like AGI driven by compute scaling.
[16] From AGI Ruin
Kudos on taking the time to tighten and restate your argument! I’d like to encourage more of this from alignment researchers; it seems likely to save lots of time talking past each other and getting mired in disagreements over non-cruxy points.
+1
Isn’t going from an average human to Einstein a huge increase in science-productivity, without any flop increase? Then why can’t there be software-driven foom, by going farther in whatever direction Einstein’s brain is from the average human?
Science/engineering is often a winner-take-all race. To him who has is given more, so for every Einstein there are many others less well known (Lorentz, Minkowski), and so on. Actual ability is filtered through something like a softmax to produce fame, so fame severely underestimates ability.
Evolution proceeds by random exploration of parameter space, the more intelligent humans only reproduce a little more than average in aggregation, and there is drag due to mutations. So the subset of the most intelligent humans represents the upper potential of the brain, but it clearly asymptotes.
Finally, intelligence results from the interaction of genetics and memetics, just like in ANNs.
Digital minds can be copied easily (well at least current ones—future analog neuromorphic minds may be more difficult to copy), so it seems likely that they will not have the equivalent of the mutation load issue as much. On the other hand the great expense of training digital minds and the great cost of GPU RAM means they have much less diversity—many instances of a few minds.
None of this by itself leaves much hope for foom.
In your view, who would contribute more to science -- 1000 Einsteins, or 10,000 average scientists?[1]
“IQ variation is due to continuous introduction of bad mutations” is an interesting hypothesis, and definitely helps save your theory. But there are many other candidates, like “slow fixation of positive mutations” and “fitness tradeoffs[2]”.
Do you have specific evidence for either:
- Deleterious mutations being the primary source of IQ variation
- Human intelligence “plateauing” around the level of top humans[3]
Or do you believe these things just because they are consistent with your learning efficiency model and are otherwise plausible?[4]
Maybe you have a very different view of leading scientists than most people I’ve read here? My picture here is not based on any high-quality epistemics (e.g. it includes “second-hand vibes”), but I’ll make up some claims anyway, for you to agree or disagree with:
There are some “top scientists” (like Einstein, Dirac, Von Neumann, etc). Within them, much of the variance in fame is incidental, but they are clearly a class apart from merely 96th percentile scientists. 1000 {96%-ile-scientists} would be beaten by 500 {96%-ile-scientists} + 100 Einstein-level scientists.
Even within “top scientists” in a field, the best one is more than 3x as intrinsically productive[5] as the 100th best one.
[1] I’m like 90% on the Einsteins for theoretical physics, and 60% on the Einsteins for chemistry.
[2] Within this, I could imagine anything from “this gene’s mechanism obviously demands more energy/nutrients” to “this gene happens to mess up some other random thing, not even in the brain, just because biochemistry is complicated”. I have no idea what the actual prevalence of any of this is.
[3] What does this even mean? Should the top 1/million already be within 10x of peak productivity? How close should the smartest human alive be to the peak? Are they nearly free of deleterious mutations?
[4] I agree that they are consistent with each other and with your view of learning efficiency, but am not convinced of any of them.
[5] “intrinsic” == assume they have the same resources (like lab equipment and junior scientists if they’re experimentalists)
I vaguely agree with your 90%/60% split for physics vs chemistry. In my field of programming we have the 10x myth/meme, which I think is reasonably correct but it really depends on the task.
For the 10x programmers it’s some combination of greater IQ/etc but also starting programming earlier with more focused attention for longer periods of time, which eventually compounds into the 10x difference.
But it really depends on the task distribution—there are some easy tasks where the limit is more typing speed and compilation, and at the extreme there are more theoretical tasks that require some specific combination of talent, knowledge, extended grind focus for great lengths of time, and luck.
Across all fields combined there seem to be perhaps 1000 to 10000 top contributors? But it seems to plateau, in the sense that I do not agree that John von Neumann (or whoever your 100x candidate is) was 10x Einstein or even Terence Tao or Kasparov (or that either would be 10x Carmack in programming, if that was their field). And given that there have been 100 billion humans who have ever lived, and most lived a while ago, there should have been at least a few historical examples 10x or 100x John von Neumann. I don’t see evidence for that at all.
I do think people here hero worship a bit and overestimate the flatness of the upper tail of the genetic component of intelligence in particular (ie IQ) and its importance.
But that being said your vibe numbers don’t seem so out of whack.
Interesting, I find what you are saying here broadly plausible, and it is updating me (at least toward greater uncertainty/confusion). I notice that I don’t expect the 10x effect, or the Von Neumann effect, to be anywhere close to purely genetic. Maybe some path-dependency in learning? But my intuition (of unknown quality) is that there should be some software tweaks which make the high end of this more reliably achievable.
Anyway, to check that I understand your position, would this be a fair dialogue?:
I find your view more plausible than before, but don’t know what credence to put on it. I’d have more of a take if I properly read your posts.
I’m not sure how to operationalize this “30x-ing” though. Some candidates:
- “1000 scientists + 30 Von Neumanns” vs. “1000 scientists + 1 ASI”
- “1 ASI” vs. “30 Von Neumanns”
- “100 ASIs” vs. “3000 Von Neumanns”
Your model of my model sounds about right, but I also include neoteny extension of perhaps 2x, which is part of the scale-up (spending longer on training the cortex, especially in higher brain regions).
For Von Neumann in particular, my understanding is he was some combination of ‘regular’ genius and a mentat (a person who can perform certain computer-like calculations quickly), which was very useful for many science tasks in an era lacking fast computers and software like Mathematica, but would provide less of an effective edge today. It also inflated people’s perception of his actual abilities.
IIRC according to gwern the theory that IQ variation is mostly due to mutational load has been debunked by modern genomic studies [though mutational load definitely has a sizable effect on IQ]. IQ variation seems to be mostly similar to height in being the result of the additive effect of many individual common allele variations.
I am usually thinking of foom mostly based on software efficiency, and I am usually thinking of the following rather standard scenario. I think this is not much of an infohazard as many people thought and wrote about this.
OpenAI or DeepMind create an artificial AI researcher with software engineering and AI research capabilities on par with software engineering and AI research capabilities of human members of their technical staff (that’s the only human equivalence that truly matters). And copies of this artificial AI researcher can be created with enough variation to cover the diversity of their whole teams.
This is, obviously, very lucrative (increases their velocity a lot), so there is tremendous pressure to go ahead and do it, if it is at all possible. (It’s even more lucrative for smaller teams dreaming of competing with the leaders.)
And, moreover, as a good part of the subsequent efforts of such combined human-AI teams will be directed to making next generations of better artificial AI researchers, and as current human-level is unlikely to be the hard ceiling in this sense, this will accelerate rapidly. Better, more competent software engineering, better AutoML in all its aspects, better ideas for new research papers...
Large training runs will be infrequent; mostly it will be a combination of fine-tuning and composing from components with subsequent fine-tuning of the combined system, so a typical turn-around will be rapid.
Stronger artificial AI researchers will be able to squeeze more out of smaller better structured models; the training will involve smaller quantity of “large gradient steps” (similar to how few-shot learning is currently done on the fly by modern LLMs, but with results stored for future use) and will be more rapid (there will be pressure to find those more efficient algorithmic ways, and those ways will be found by smarter systems).
Moreover, the lowest-hanging fruit is not even in an individual performance, but in the super-human ability of these individual systems to collaborate (humans are really limited by their bandwidth in this sense, they can’t know all the research papers and all the interesting new software).
It’s possible that the “foom” is “not too high” for reasons mentioned in this post (in any case, it is difficult to extrapolate very far), but it’s difficult to see what would prevent at least several OOMs improvement in research capability and velocity of an organization which could pull this off before something like this saturates.
Yes, these artificial systems will do a good deal of alignment and self-alignment too, just so that the organizations stay relatively intact and their artificial and human members keep collaborating.
(Because of all this my thinking is: we absolutely do need to work on safety of fooming, self-improving AI ecosystems; it’s not clear if those safety properties should be expressed in terms of alignment or in some other terms (we really should keep open minds in this sense), but the chances of foom seem to me to be quite real.)
The first 4 paragraphs sound almost like something I would write and I agree up to:
We currently have large training runs for a few reasons, but the most important is that GPT training is very easy to parallelize on GPUs, but GPT inference is not. This is a major limitation because it means GPUs can only accelerate GPT training on (mostly human) past knowledge, but aren’t nearly as efficient at accelerating the rate at which GPT models accumulate experience or self-knowledge.
So if that paradigm continues, large training runs continue to be very important as that is the only way these models can learn new long term knowledge and expand their crystallized intelligence (which at this point is their main impressive capability).
The brain is considered to use more continual learning—but really it just has faster cycles and shorter mini-training runs (via hippocampal replay during sleep). If we move to that kind of paradigm then the training is still very important, but is now just more continuous.
I think we can fine-tune on GPU nicely (fine-tuning is similar to short training runs and results in long-term crystallized knowledge).
But I do agree that the rate of progress here does depend on our progress in doing less uniform things faster (e.g. there are signs of progress in parallelization and acceleration of tree processing (think trees with labeled edges and numerical leaves, which are essentially flexible tensors), but this kind of progress is not yet mainstream or commonplace; instead one has to look at rather obscure papers to see those accelerations of non-standard workloads).
I think this will be achieved (in part, because I somehow do expect less of “winner takes all” dynamics in the field of AI which we have currently; Transformers lead right now, so (almost) all eyes are on Transformers, other efforts attract less attention and resources; with artificial AI researchers not excessively overburdened by human motivations of career and prestige, one would expect better coverage of all possible directions of progress, less crowding around “the winner of the day”).
“Work on the safety of an ecosystem made up of a large number of in-some-ways-superhuman-and-in-other-ways-not AIs” seems like a very different problem than “ensure that when you build a single coherent, effectively-omniscient agent, you give it a goal that does not ruin everything when it optimizes really hard for that goal”.
There are definitely parallels between the two scenarios, but I’m not sure a solution for the second scenario would even work to prevent an organization of AIs with cognitive blind spots from going off the rails.
My model of jacob_cannell’s model is that the medium-term future looks something like “ad-hoc organizations of mostly-cooperating organizations of powerful-but-not-that-powerful agents, with the first organization to reach a given level of capability being the one that focused its resources on finding and using better coordination mechanisms between larger numbers of individual processes rather than the one that focused on raw predictive power”, and that his model of Eliezer goes “no, actually focusing on raw predictive power is the way to go”.
And I think the two different scenarios do in fact suggest different strategies.
Yes, [Mishka’s description of relatively-slow-foom] matches my point of view as well. When I say that I believe recursive self-improvement can and probably will happen in the next few years, this is what I’m pointing at. I expect the first few generations to each take a few months and be a product of humans and AI systems working together, and that the generational improvements will be less than 2x improvements. I expect that there is perhaps 1-3 OOMs of improvement in software alone before getting blocked by needing slow, expensive hardware changes. So, the scenario I’m concerned about looks more like a 2 OOM (+/- 1) improvement over 6-12 months. This is a very different scenario than the 4+ OOM improvement in the first few days of the process beginning which is described in some foom-doom stories.
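As a rough arithmetic check on how these numbers combine, the sketch below counts how many compounding generations a given per-generation gain needs to reach 2 OOM. The 2x figure is the upper bound stated above; the 1.5x figure and the 12-month window are illustrative assumptions.

```python
import math

def generations_needed(target_ooms, gain_per_generation):
    """Smallest whole number of generations whose compounded gain reaches 10**target_ooms."""
    return math.ceil(target_ooms * math.log(10) / math.log(gain_per_generation))

for gain in (2.0, 1.5):
    n = generations_needed(2, gain)
    print(f"{gain}x per generation: {n} generations, "
          f"{12 / n:.1f} months each to fit in 12 months")
```

At 2x per generation this comes to 7 generations, and at 1.5x to 12, so fitting 2 OOM into a year implies generations of well under two months on average, or later generations running faster than the first few.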
I agree; a relatively slow “foom” is likely; moreover, the human team(s) doing that will know that this is exactly what they are doing, a “slowish” foom (for 2 OOM (+/-1) per 6-12 months; still way faster than our current rate of progress).
Whether this process can unexpectedly run away from them and explode really fast instead at some point would depend on whether completely unexpected radical algorithmic discoveries will be made in the process (that’s one thing the whole ecosystem of humans+AIs in an organization like that should watch for; they need to have genuine consensus among involved humans and involved AIs to collectively ponder such things before allowing them to accelerate beyond a “slowish” foom to a much faster one; but it’s not certain if the discoveries enabling the really fast one will be made, it’s just a possibility).
Yep, agreed. A stronger-than-expected jump is unlikely but possible and should be guarded against. As for the 2 OOM speed... I agree, it’s substantially faster than what we’ve been experiencing so far. Think of GPT4 getting 100x stronger/smarter over the course of a year. That’s plenty enough to be scary, I think.
I feel like the post proves too much: it gives arguments for why foom is unlikely, but I don’t see arguments which break the symmetry between “humans cannot foom relative to other animals” and “AI cannot foom relative to humans”.* For example, the statements
and
seem irrelevant or false in light of the human-chimp example. (Are animal brains Pareto-efficient? If not, I’m interested in what breaks the symmetry between humans and other animals. If yes, Pareto-efficiency doesn’t seem that useful for making predictions on capabilities/foom.)
*One way to resolve the situation is by denying that humans foomed (in a sense relevant for AI), but this is not the route taken in the post.
Separately, I disagree with many claims and the overall thrust in the discussion of AlphaZero.
This seems unlikely to me, depending on what “completely flatlines” and “extremely small size” mean.
Go and chess being small/simple doesn’t seem like the reason why ANNs are way better than brains there. Or, if it is, we should see the difference between ANNs and brains shrinking as the environment gets larger/more complex. This model doesn’t seem to lead to good predictions, though: Dota 2 is a lot more complicated than Go and chess, and yet we have superhuman performance there. Or how complicated exactly does a task need to be before ANNs and brains are equally good?
(Perhaps relatedly: There seems to be an implicit assumption that AGI will be an LLM. “The AGI we actually have simply reproduces [cognitive biases], because we train AI on human thoughts”. This is not obvious to me—what happened to RL?)
On a higher level, the whole train of reasoning reads like a just-so story to me: “We have obtained superhuman performance in Go, but this is only because of training on vastly more data and the environment being simple. As the task gets more complicated the brain becomes more competitive. And indeed, LLMs are close to but not quite human intelligences!”. I don’t see this as a particularly good fit to the datapoints, or how this hypothesis is likelier than “There is room above human capabilities in ~every task, and we have achieved superhuman abilities in some tasks but not others (yet)”.
My model predicts superhuman AGI in general—just that it uses and scales predictably with compute.
Dota 2 is only marginally more complicated than Go/chess; the world model is still very, very simple, as it can be simulated perfectly using just a low-end CPU core.
Driving cars would be a good start. In terms of game worlds there is probably nothing remotely close; it would need to be obviously 3D and very open-ended, with extremely complex physics and detailed realistic graphics, populated with humans and/or advanced AI (I’ve been out of games for a while and I’m not sure what that game currently would be, but it probably doesn’t exist yet).
In the section “Seeking true Foom”, the post argues that the reason why humans foomed is because of culture, which none of the animals before us had. IMO, this invalidates the arguments in the first half of your comment (though not necessarily your conclusions).
This post seems about 90% correct, and written better than your previous posts.
I expect nanotech will be more important, someday, than you admit. But I agree that it’s unlikely to be relevant to foom. GPUs are close enough to nanotech that speeding up GPU production is likely more practical than switching to nanotech.
I suspect Eliezer believes AI could speed up GPU production dramatically without nanotech. Can someone who believes that explain why they think recent GPU progress has been far from optimal?
I don’t see anything naive about the argument that you quoted here (which doesn’t say how much less compute). Long, fast chains of serial computation enable some algorithms that are hard to implement on brains. So it seems obvious that such systems will have some better-than-human abilities.
Eliezer doesn’t seem naive there until he jumps to implying a 6 OOM advantage on tasks that matter. He would be correct here if there are serial algorithms that improve a lot on the algorithms that matter most for human intelligence. It’s not too hard to imagine that evolution overlooked such serial algorithms.
Recent patterns in computing are decent evidence that key human pattern-recognition algorithms can’t be made much more efficient. That seems to justify maybe 80% confidence that Eliezer is wrong here. My best guess is that Eliezer focuses too much on algorithms where humans are weak.
I think you are missing obvious things in this analysis that you have literally told me about. You know of at least one source of a foom-like jump that you haven’t mentioned for intellectual property reasons. I’d appreciate you keeping it that way. I do think that you’re right about a specific narrow limitation of software foom because you’ve thought about how it can and can’t happen, but I also don’t think you’re considering a wide enough space of possibility. Why do you think reversible computers are more than a decade out? Surely the first agentic ASIs finish the job rather quickly.
I do think this means that specific dangerous architectures are much more the threat than one might expect because of, e.g., Mishka’s comment about Hinton-level AIs, and so we should be very scared of architectures that produce highly competent hyper-desperate adversarial-example-wanters, because it is the combination of competence and desperation that is likely to disobey its friends and parents. Even very strong predictive models do not generate hyperdesperate sociopaths, and in fact hyperdesperate beings would destroy language models the same as they’d destroy humans. Strategic skill and desperation are what kill us suddenly, and the path to power capture looks like ordinary economic takeoff until the hyperdesperate being has enough power to be a real threat.
Even a team of Hinton-level digital minds need not be a severe threat, unless they are not sufficiently spooked by the threat of hyperdesperate squigglewanters. A being that wants to get really really high on an adversarial example but who is skilled enough at planning to not be broken by this is Yudkowsky’s monster.
Reversible computers don’t seem much easier than quantum computers for some of the reasons explained here, and maintaining near zero temp doesn’t really seem practical at scale—at least not on earth.
I think the Roodman model is the current best predictive model of the near future. It predicts a hard singularity in 2047, but growth is hyperexponential and doesn’t really start getting weird until later this decade; it doesn’t even require neuromorphic computing until perhaps the 2030s, and exotic computing and everything after happens right near the end.
It seems here that you are really more worried about ‘foom in danger’ (danger per intelligence, D / I) than about regular foom (4+ OOM increase in I), if I am reading you correctly. Like, I don’t see a technical argument that, e.g., the claims in the OP about any of
are wrong, you are just saying that ‘D / I will foom at some point’ (aka a model becomes much more dangerous quickly, without needing to be vastly more powerful algorithmically or having much more compute).
This doesn’t change things much but I just want to understand better what you mean when you say ‘foom’.
I don’t think I should clarify further right now, though I could potentially be convinced otherwise. I’d need to think about precisely what I want to highlight. It’s not like it’ll be that long before it becomes glaringly obvious, but I don’t currently see a reason why clarifying this particular aspect makes us safer.
That’s fair; however, I would say that the manner of foom determines a lot about what to look out for and where to put safeguards.
If it’s total($), it’s obvious what to look out for.
flop/$ also seems like something that eg. NVIDIA is tracking closely, and per OP probably can’t foom too rapidly absent nanotech.
So the argument is something about the (D*I)/flop dynamics.
[redacted] I wrote more here but probably it’s best left unsaid for now. I think we are on a similar enough page.
Is “adversarial-example-wanters” referring to an existing topic, or something you can expand on here?
paperclippers!
The key thing I disagree with is:
Although I think I agree the ‘meta-systems transition’ is a super important shift, which can lead us to overestimate the level of difference between us and previous apes, it also doesn’t seem like it was just a one-time shift. We had fire, stone tools and probably language for literally millions of years before the Neolithic revolution. For the industrial revolution it seems that a few bits of cognitive technology (not even genes, just memes!) in renaissance Europe sent the world suddenly off on a whole new exponential.
The lesson, for me, is that the capability level of the meta-system/technology frontier is a very sensitive function of the kind of intelligences which are operating, and we therefore shouldn’t feel at all confident generalising out of distribution. Then, once we start to incorporate feedback loops from the technology frontier back into the underlying intelligences which are developing that technology, all modelling goes out the window.
From a technical modelling perspective, I understand that the Roodman model that you reference below (hard singularity at median 2047) has both hyperbolic growth and random shocks, and so even within that model, we shouldn’t be too surprised to see a sudden shift in gears and a much sooner singularity, even without accounting for RSI taking us somehow off-script.
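For intuition about why hyperbolic growth implies a finite-time singularity at all, here is a minimal sketch of the deterministic core only: the generic equation dy/dt = a * y**(1 + b), whose closed-form solution blows up at a finite time. The parameter values are arbitrary illustrations, not Roodman’s fitted values, and the random shocks mentioned above are not modelled.

```python
def blow_up_time(y0, a, b):
    """Time at which the solution of dy/dt = a * y**(1 + b), y(0) = y0, diverges."""
    return y0 ** (-b) / (a * b)

def hyperbolic_level(t, y0, a, b):
    """Closed-form solution y(t) = (y0**-b - a*b*t) ** (-1/b), valid before the blow-up."""
    remaining = y0 ** (-b) - a * b * t
    if remaining <= 0:
        raise ValueError("t is at or past the singularity")
    return remaining ** (-1.0 / b)

# Arbitrary illustrative parameters (assumptions, not fitted values).
y0, a, b = 1.0, 0.5, 1.0
t_star = blow_up_time(y0, a, b)

for fraction in (0.5, 0.9, 0.99):
    t = fraction * t_star
    print(f"t = {t:.2f}: y = {hyperbolic_level(t, y0, a, b):.1f}")
```

Because the level diverges as t approaches the blow-up time, a shock that multiplies y0 by some factor k pulls the singularity date in by a factor of k**b in this toy version, which is one way to see how random shocks could move the date substantially.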
To expand on the idea of meta-systems and their capability: Similarly to discussing brain efficiency, we could ask about the efficiency of our civilization (in the sense of being able to point its capability to a unified goal), among all possible ways of organising civilisations. If our civilisation is very inefficient, AI could figure out a better design and foom that way.
Primarily, I think the question of our civilization’s efficiency is unclear. My intuition is that our civilization is quite inefficient, with the following points serving as weak evidence:
Civilization hasn’t been around that long, and has therefore not been optimised much.
The point (1) gets even more pronounced as you go from “designs for cooperation among a small group” to “designs for cooperation among millions”, or even billions. (Because fewer of these were running in parallel, and for a shorter time.)
The fact that civilization runs on humans, who are selfish etc, might severely limit the space of designs that have been tried.
As a lower bound, it seems that something like Yudkowsky’s ideas about dath ilan might work. (Not to be mistaken with “we can get there from here”, “works for humans”, or “none of Yudkowsky’s ideas have holes in them”.)
None of this contradicts your arguments, but it adds uncertainty and should make us more cautious about AI. (Not that I interpret the post as advocating against caution.)
Yes in the sense that if you zoom in you’ll see language starting with simplistic low bit rate communication and steadily improving, followed by writing for external memory, printing press, telecommunication, computers, etc etc. Noosphere to technosphere.
But those improvements are not happening in human brains; they are cybernetic and externalized.
Yeah I agree it’s not in human brains, not really disagreeing with the bulk of the argument re brains but just about whether it does much to reduce foom %. Maybe it constrains the ultra fast scenarios a bit but not much more imo.
“Small” (ie << 6 OOM) jump in underlying brain function from current paradigm AI → Gigantic shift in tech frontier rate of change → Exotic tech becomes quickly reachable → YudFoom
Why do you think this? (I’m unconvinced by “universal learning machine” type things that I’ve seen, not because I disagree, but because they don’t seem to address transitions within the shape of what stuff is learned and how it interacts.)
There are NNs that train for a lifetime then die, and there are NNs that train for a lifetime but then network together to share all their knowledge before dying. There are not ever more levels to that.
There are Turing-universal computational systems (which are all equivalent in universal ability to simulate other systems), and there are non-universal computational systems. There are not more levels to that.
But crucially, humans do not share all their knowledge. Every time a great scientist or engineer or manager or artist dies, a ton of intuition and skills and illegible knowledge dies with them. What is passed on is only what can be easily compressed into the extremely lossy channels of language.
As the saying goes, “humans are as stupid as they can be while still undergoing intelligence-driven takeoff at all”; otherwise humans would have taken over the world sooner. That applies to knowledge sharing in particular—our language channels are just barely good enough to take off.
Even just the ability to copy a mind would push AIs far further along the same direction. Ability to merge minds would go far further still.
Edit: Of course humans do not share all their knowledge, and the cultural transition is obviously graded in the sense that the evolutionary stages of early language, writing, printing press, computers, internet etc gradually improve the externalized network connectivity and storage of our cybernetic civilization. But by the time of AGI that transition is already very well along, such that all we are really losing—as you point out and I agree—is a ton of intuitions/skills/knowledge etc that dies with the decay of human brains, but we externalize much of the most important of our knowledge. Nonetheless ending that tragedy is our great common cause.
I agree that substrate independence is one of the great advantages of digital minds, other than speed.
But there are some fundamental tradeoffs:
You can use GPUs (von Neumann), which separate memory and logic. They are much, much slower in the sense that they take many, many cycles to simulate one cycle of a large ANN. They waste much energy having to shuffle the weights around the chip from memory to logic.
Or you can use neuromorphic computers, which combine memory and logic. They are potentially enormously faster, as they can simulate one cycle of a large ANN per clock cycle, but are constrained to more brain-like designs and thus optimized for low circuit depth but larger circuits (cheap circuitry). For the greatest cheap circuit density, energy efficiency, and speed you need to use analog synapses, but in doing so you basically give up the ability to easily transfer the knowledge out of the system: it becomes more ‘mortal’, as Hinton recently argues.
This seems like a small tradeoff, and this does not seem like a big enough deal to restore these to anything like human mortality, with all its enormous global effects. It may be much harder to copy weights off an idiosyncratic mess of analogue circuits modified in-place by their training to maximize energy efficiency than it is to run `cp foo.pkl bar.pkl`, absolutely, but the increase in difficulty here seems more on par with ‘a small sub-field with a few hundred grad students/engineers for a few years’ than ‘the creation of AGI’, and so one can assume it’d be solved almost immediately should it ever actually become a problem.
For example, even if it’s ultra-miniaturized, you can tap connections to optionally read off activations between many pairs of layers, which will affect only a small part of it and not eliminate the miniaturization or energy savings. With the layer embeddings summarizing a group of layers, you can then do knowledge distillation to another such neuromorphic computer (or smaller). Knowledge distillation, or self-distillation rather, will cost little and works well. Or, since you can presumably set the analogue values even if you can’t read them, and have a model worth copying, you can pay the one-time cost to distill it out to a more von-Neumann computer, one where you can more easily read the weights out, and thence copy it onto all of the other neuromorphics henceforth. Or, you can reverse-engineer the weights themselves: probe the original and the copy with synthetic data flipping a bit at a time to run finite-differences on outputs like activations/embeddings, starting at the lowest available tap, to eventually reconstruct the equivalent weights group by group. (This may require lots of probes, but these systems by definition run extremely fast and since you’re only probing a small part of it at a time, run even faster than that.) Just off the cuff, and I’m sure you could think of several better approaches if you tried. So I don’t expect ‘mortal’ NNs to be all that different from our current ‘immortal’ NNs or things like FPGAs.
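A toy version of the finite-difference idea, as a sketch under strong simplifying assumptions: the unreadable hardware is stood in for by a single noiseless linear layer whose weights are hidden behind a function call, and the names `make_black_box` and `recover_weights` are hypothetical. Real analog hardware would add noise and nonlinearities, so this only illustrates the probing principle.

```python
def make_black_box(weights):
    """Return a function computing y = W x, hiding W from the caller."""
    def forward(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return forward

def recover_weights(forward, n_inputs, eps=1e-3):
    """Estimate W one input at a time via finite differences on the outputs."""
    base_input = [0.0] * n_inputs
    base_output = forward(base_input)
    columns = []
    for i in range(n_inputs):
        probe = list(base_input)
        probe[i] += eps  # perturb a single input
        out = forward(probe)
        # (f(x + eps*e_i) - f(x)) / eps approximates column i of W
        columns.append([(o - b) / eps for o, b in zip(out, base_output)])
    # columns[i][j] is d(output j)/d(input i); transpose to get the rows of W
    return [list(row) for row in zip(*columns)]

# Hypothetical "unreadable" weights, used here only to build the black box.
secret = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
estimate = recover_weights(make_black_box(secret), n_inputs=3)
print(estimate)
```

For a linear layer the number of probes equals the number of inputs, so the cost grows with layer width rather than with the number of weights; with nonlinear layers the same probes only give a local linearization around the chosen input.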
Largely agreed, which is partly why I said only more ‘mortal’ with ‘mortal’ in scare quotes. Or put another way, the full neuromorphic analog route still isn’t as problematic to copy weights out of vs an actual brain, and I expect actual uploading to be possible eventually so … it’s mostly a matter of copy speeds and expenses as you point out, and for the most hardcore analog neuromorphic designs like brains you still can exploit sophisticated distillation techniques as you discuss. But it does look like there are tradeoffs that increase copy out cost as you move to the most advanced neuromorphic designs.
This whole thing is just a thought experiment, correct? “what we would have to do to mimic the brain’s energy efficiency”. Because analog synapses where we left off a network of analog gates to connect any given synapse to an ADC (something that current prototype analog inference accelerators use, and analog FPGAs do exist) are kinda awful.
The reason is because of https://openai.com/research/emergent-tool-use . What they found in this paper was that you want to make your Bayesian updates to your agent’s policy in large batches. Meaning you need to be able to copy the policy many times across a fleet of hardware that runs in separate agents, and learn the expected value and errors of the given policy across a larger batch of episodes. The copying requires precise reading of the values, so they need to be binary, and there is no benefit from modifying the policy rapidly in real time.
The reason why we have brains that learn rapidly in real time, overfitting to a small number of strong examples, is because this was all that was possible with the hardware nature could evolve. It is suboptimal.
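The batching argument above can be illustrated with a minimal sketch. It assumes a hypothetical episode whose return is the policy's true value plus unit Gaussian noise (`run_episode` and `estimate_value` are illustrative names, not anything from the linked paper), and shows only that the uncertainty of the value estimate shrinks as more copies of the same policy are evaluated in parallel.

```python
import random
import statistics

def run_episode(true_value: float, rng: random.Random) -> float:
    """Hypothetical noisy episode: the policy's true value plus Gaussian noise."""
    return true_value + rng.gauss(0.0, 1.0)

def estimate_value(true_value: float, batch_size: int, rng: random.Random) -> tuple[float, float]:
    """Mean return and its standard error over a batch of independent episodes."""
    returns = [run_episode(true_value, rng) for _ in range(batch_size)]
    mean = statistics.fmean(returns)
    stderr = statistics.stdev(returns) / batch_size ** 0.5
    return mean, stderr

rng = random.Random(0)
small_batch = estimate_value(0.3, 16, rng)    # few copies of the policy
large_batch = estimate_value(0.3, 4096, rng)  # many copies across a fleet
print(small_batch, large_batch)
```

Under these assumptions the standard error falls roughly as one over the square root of the batch size, which is why an exactly copyable policy is convenient for this style of training.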
I think Jake is right that we shouldn’t imagine an unlimited set of levels of learning. I however do think that there are one or two more levels beyond self learning, and cultural transmission. The next level (which could maybe be described as two levels) is not something that evolution has managed in any mammalian species:
take an existing brain which has filled most of its learning capacity and is beginning to plateau in skill-gain-from-experience and add significantly more capacity.
Make significant architectural changes involving substantial change to long distance wiring. For example, if I were to rewire half of my visual cortex to instead be part of my mathematical reasoning module.
Both of these are sort of examples of plasticity/editability. I expect that if we had the ability to do either one of these to a human (e.g. via brain-computer interface) then you could turn a below average IQ human into an impressively skilled mathematician. And you could turn an impressively skilled mathematician into the greatest math genius in the history of the human race. If I am correct about this, then I think it is fair to consider this a fundamentally different level than cultural knowledge transmission.
(Copied from another comment) Nathan points out increasing size; and large scale / connective plasticity. Another one would be full reflectivity: introspection and self-reprogramming. Another one would be the ability to copy chunks of code and A/B test them as they function in the whole agent. I don’t get why Jacob is so confident that these sorts of things aren’t major and/or that there aren’t more of them than we’ve thought of.
But why do you think that? It seems like things like the methods of science, and like mathematical insights, both enhance intelligence qualitatively.
I think this is partially a matter of ontological taste. I mean, you are obviously correct that many innovations coming after the transition the author is interested in seem to produce qualitative shifts in the collective intelligence of humanity. On the other hand, if you take the view that all of these are fundamentally enabled by that first transition, then it seems reasonable to treat that as special in a way that the other innovations are not.
I suppose where the rubber meets the road, if one grants both the special status of the transition to universal cultural learning and that other kinds of innovation can lead to qualitative shifts in collective intelligence, is whether or not further innovations of the second kind can still play the role that foom is supposed to play in EY’s argument (I take Nathan Helm-Burger’s comment to be one argument that such innovations can play this role).
I don’t necessarily care too much about which ones are “special” or “qualitative”, though I did say qualitative. The practical question at hand is how much more intelligence can you pack into given compute, and how quickly can you get there. If a mathematical insight allows you to write code that’s shorter, and runs significantly faster and with less memory requirements, and gives outputs that are more effective, then we’ve answered most of the practical question. History seems chock full of such things.
But yeah I also agree that there’s other more “writ large” sorts of transitions.
Nathan points out large scale / connective plasticity. Another one would be full reflectivity: introspection and self-reprogramming. Another one would be the ability to copy chunks of code and A/B test them as they function in the whole agent. I don’t get why Jacob is so confident that these sorts of things aren’t major and/or that there aren’t more of them than we’ve thought of.
At the risk of going round in circles, you begin your post by saying you don’t care which ones are special or qualitative, and end it by wondering why the author is confident certain kinds of transition are not “major”. Is this term, like the others, just standing in for ‘significant enough to play a certain kind of role in an “AI leads to doom” argument’? Or does it mean something else?
I get the impression that you want to avoid too much wrangling over which labels should be applied to which kinds of thing, but then, you brought up the worry about the original post, so I don’t quite know what your point is.
It just means specific innovations that have especially big increases in intelligence. But I think that lots of innovations, such as mathematical ideas, have big increases in intelligence.
Okay, sure. If my impression of the original post is right, the author would not disagree with you, but would rather claim that there is an important distinction to be made among these innovations. Namely, one of them is the 0-1 transition to universality, and the others are not. So, do you disagree that such a distinction may be important at all, or merely that it is not a distinction that supports the argument made in the original post?
It would be a large, broad increase in intelligence. There may be other large broad increases in intelligence. I think there are also other large narrow increases, and small broad increases. Jacob seems to be claiming that there aren’t further large increases to be had. I think the transition to universality is pretty vague. Wouldn’t increasing memory capacity also be a sort of increase in universality?
I have to say I agree that there is vagueness in the transition to universality. That is hardly surprising seeing as it is a confusing and contentious subject that involves integrating perspectives on a number of other confusing and contentious subjects (language, biological evolution, cultural evolution, collective intelligence etc...). However, despite the vagueness, I personally still see this transition, from being unable to accrete cultural innovations to being able to do so, as a special one, different in kind from particular technologies that have been invented since.
Perhaps another way to put it is that the transition seems to bestow on us, as a collective, a meta-ability to obtain new abilities (or increased intelligence, as you put it), that we previously lacked. It is true that there are particular new abilities that are particularly valuable, but there may not be any further meta-abilities to obtain.
Just so we aren’t speaking past each other. Do you get what I am saying here? Even if you disagree that this is relevant, which may be reasonable, does the distinction I am driving at even make sense to you, or still not?
No, I don’t see a real distinction here. If you increase skull size, you increase the rate at which new abilities are invented and combined. If you come up with a mathematical idea, you advance a whole swath of ability-seeking searches. I listed some other things that increase meta-ability. What’s the distinction between various things that hit back to the meta-level?
There is an enormous difference between “increase skull size” when already well into diminishing returns for brain size given only 1e9s of training data, and an improvement that allows compressing knowledge, externalizing it, and sharing it permanently to train new minds.
After that cultural transition, each new mind can train on the compressed summary experiences of all previous minds of the tribe/nation/civilization. You go from having only 1e9s of training data that is thrown away when each individual dies, to having an effective training dataset that scales with total extant integrated population over time. It is a radical shift to a fundamentally new scaling equation, and that is why it is a metasystems transition, whereas increasing skull size is not.
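The two scaling regimes described above can be written out as a rough sketch. The 1e9 s lifetime figure is from the comment. The `retained_fraction` parameter (how much of each earlier mind's experience survives compression and transmission) and the population numbers are purely hypothetical placeholders.

```python
LIFETIME_SECONDS = 1e9  # per-individual training data figure used in the comment

def data_without_culture() -> float:
    """No transmission: each mind sees only its own lifetime, and nothing accumulates."""
    return LIFETIME_SECONDS

def data_with_culture(past_populations: list[float], retained_fraction: float) -> float:
    """Own lifetime plus a retained (compressed) fraction of every earlier mind's
    experience, summed over all previous generations."""
    inherited = retained_fraction * LIFETIME_SECONDS * sum(past_populations)
    return LIFETIME_SECONDS + inherited

# Hypothetical example: ten generations of 1000 individuals, 0.1% retained.
solo = data_without_culture()
cultural = data_with_culture([1000.0] * 10, retained_fraction=1e-3)
print(solo, cultural)
```

The first quantity is constant however many generations pass. The second grows with the integrated population, which is the change in the scaling equation the comment points to.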
Increasing skull size would also let you have much larger working memory, have multiple trains of thought but still with high interconnect, etc., which would let you work on problems that are too hard to fit in one normal human’s working memory.
I simply don’t buy the training data limit. You have infinite free training data from internal events, aka math.
More zoomed out, I still haven’t seen you argue why there aren’t more shifts that change the scaling equation. (I’ve listed some that I think would do so.)
The distinction is that without the initial 0-1 phase transition, none of the other stuff is possible. They are all instances of cumulative cultural accretion, whereas the transition constitutes entering the regime of cumulative cultural accretion (other biological organisms and extant AI systems are not in this regime). If I understand the author correctly, the creation of AGI will increase the pace of cumulative cultural accretion, but will not lead us (or them) to exit that regime (since, according to the point about universality, there is no further regime).
I think this answer also applies to the other comment you made, for what it’s worth. It would take me more time than I am willing to spend to make a cogent case for this here, so I will leave the discussion for now.
Ok. I think you’re confused though; other things we’ve discussed are pretty much as 0 to 1 as cultural accumulation.
Innovations that unlock a broad swath of further abilities could be called “qualitatively more intelligent”. But 1. things that seem “narrow”, such as many math ideas, are qualitative increases in intelligence in this sense; and 2. there are a lot of innovations that sure seem to obviously be qualitative increases.
I broadly disagree with Yudkowsky on his vision of FOOM and think he’s pretty sloppy wrt. AI takeoff overall.
But, I do think you’re quite likely to get a quite rapid singularity if people don’t intentionally slow things down. For instance, I broadly think the modeling in Tom Davidson’s takeoff speeds report is very reasonable, except that I think the default parameters he uses are insufficiently aggressive (I think compute requirements are likely to be somewhat lower than given in this report). Notably this model doesn’t get you FOOM in a week (perhaps more like 20% automation to 100% automation in a few years) and is mostly just doing non-mechanistic trend extrapolation combined with a growth model.
So, I think it would be much more interesting from my perspective to engage with this report than Yudkowsky.
See also this section where Tom talks about kinks in the underlying capabilities leading to rapid progress
I am not a domain expert, but I get the impression that the primary factors of the Pareto frontier for the software industry are “consumer expectations” and “money costs”, and the primary component of money costs is “programmer labor”, so software development mostly proceeds along the lines of “how to satisfy consumer expectations with the minimum possible labor costs”, which doesn’t put much optimisation pressure on computing efficiency. I frankly expect that if we spent a bazillion dollars on optimisation, we could at least halve the computing power required for “Witcher 3”. The demoscene proves that we can fit many things in 64 KB of space.
Is there really such a strong line between standard computing and reversible computing? As I understand it, you usually have a bunch of bits you don’t care about after doing a reversible computation. So you either have to store these bits somewhere indefinitely, or eventually erase them radiating heat. That makes it possible to reframe a reversible computer as one in which you perfectly cool/remove the heat generated via computation (and maybe dissipate the saved bits far away or whatever). Under this reframe, you can see how we could potentially have really good but imperfect cooling which approaches this ideal (and makes me think it’s not a coincidence that good electrical conductors tend to be good heat conductors). Now, there might still be a “soft line” which makes approaching this hard in practice, like the clock issue you mention, but maybe it is possible to incrementally advance current semiconductor tech to the reversible computing limit or at least get pretty close.
So it looks like CMOS adiabatic circuits are an existing technology which appears to lie in the space between conventional and reversible computation. According to Wikipedia, they take up 50% more area (unclear if that refers to ~transistor size or ~equivalent computation unit size). It seems plausible that you could still use this to get denser compute overall since you could stack them in 3D more densely without having excess heat be as much of a problem.
Adiabatic circuits are just partially reversible circuits. They are mostly research designs with a few test chips, but from what I see all the test chips are on old large nodes such that they don’t achieve energy gains in practice. From what I can tell they are absolutely not used at all in leading products like GPUs.
There is some debate in the industry, but essentially no big labs are pursuing adiabatic/reversible computing; it’s only a few individual researchers (notably Mike Frank, who is carrying much of the load).
The critics argue that adiabatic/reversible computing is not really practical in a conventional warm environment because of noise build up, and after investigating I believe these arguments are probably correct.
The brain, like current CMOS tech, is completely irreversible. Reversible computation is possible in theory but is exotic like quantum computation requiring near zero temp and may not be practical at scale in a noisy environment like the earth, for the reasons outlined by Cavin/Zhirnov here and discussed in a theoretical cellular model by Tiata here—basically fully reversible computers rapidly forget everything as noise accumulates. Irreversible computers like brains and GPUs erase all thermal noise at every step, and pay the hot iron price to do so.
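For a sense of scale, the minimum cost of erasing a bit is usually quantified by the Landauer bound, E = k_B · T · ln 2. The sketch below only evaluates that textbook formula. The 300 K room temperature is an assumed illustrative value, and nothing here models real devices, which dissipate far more than the bound per operation.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant

def landauer_bound_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated when one bit is irreversibly erased at temperature T."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

room_temp = landauer_bound_joules(300.0)  # assumed room temperature
cold = landauer_bound_joules(3.0)         # assumed cryogenic temperature
print(f"{room_temp:.3e} J per bit at 300 K, {cold:.3e} J per bit at 3 K")
```

The bound is linear in temperature, which is one reason proposals for near-reversible hardware tend to assume very cold operation.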
Just want to add that our inverted retina brings additional problems with it. For example, the optic nerve is exposed to wear and tear from increased intraocular pressure. This condition is called glaucoma and is one of the prime causes of irreversible vision loss in humans worldwide. Cephalopods cannot get glaucoma.
It is not clear to me that a world that relies on machines is safer than a world that does not. Intuitively, I’d expect the dangerous AI being connected through the internet to tons of other systems to be a risk factor.
I think it is worth highlighting that this is an assumption, not a necessary fact about how civilization works. To put it into a nonsense-math:
Assumption: Suppose that the current (technology assisted) capability of an average human is X, and that forming a singleton would require capability X+C. Then if the technology-assisted capability of average human increases to Y, forming a singleton would require capability ≥Y+C.
I certainly don’t mean to say that this is definitely not true. However, I think it is far from obvious. In practice, I expect some technologies to make takeover harder, and others to make it easier, with the overall trend being unclear. (I would bet on overall making it harder, but with very high uncertainty.) Some reasons for the non-obviousness:
The adoption of capabilities will be uneven. (EG, Google might increase their cybersecurity, but the same might not be true for Backwater State University. I might keep up-to-date with AI being able to do impersonation scams, but my grandma won’t.)
Some takeover strategies might only require taking control of some percentage of population (infrastructure, resources, …) rather than the most capable/well-defended population (infrastructure, resources, …). To give a non-takeover-strategy example: manipulating a presidential election works like this. [Don’t mistake “I can’t identify any strategy like this” with “there is no such strategy”.]
I expect that as we get more technology, the world will naturally grow more robust against exploits that people actually try, or expect others to try. However, most people are not psychopaths (or, even worse, fanatic psychopaths). So some of the vulnerabilities might remain unfixed.
Some technologies that make us more capable also make us more vulnerable. EG, everybody having their personal AutoGPT, with access to all their passwords, that automatically uses the latest LLM definitely increases our capabilities. But it also creates a single point of failure.
This seems clearly wrong:
Go is extremely simple: the entire world of Go can be precisely predicted by trivial tiny low depth circuits/programs. This means that the Go predictive capability of a NN model as a function of NN size completely flatlines at an extremely small size. A massive NN like the brain’s cortex is mostly wasted for Go, with zero advantage vs the tiny NN AlphaZero uses for predicting the tiny simple world of Go.
Top go-playing programs utilize neural networks, but they are not neural networks. Monte-Carlo Tree Search boosts their playing strength immensely. The underlying pure policy networks would be strong amateur level when playing against opponents who are unaware that they are playing a pure neural network, but they would lose quite literally every game against top humans. It seems very likely that a purely NN-based player without search would have to be based on a far more complex neural network than the ones we see in, say, Leela Zero or Katago. In addition, top programs like Katago use some handcrafted features (things about the current game state that can be efficiently computed by traditional hand-written code, but would be difficult to learn or compute for a neural network), so they deviate to a significant extent from the paradigm of pure reinforcement learning via self-play from just the rules that AlphaZero proved viable. This, too, significantly improves their playing strength.
Finally, Go has a very narrow (or, with half-integer komi and rulesets that prevent long cycles, non-existent) path to draw, and games last for about 250 moves. That means that even small differences in skill can be reliably converted to wins. I would guess that the skill ceiling for Go (and thereby, the advantage that a superintelligence would have in Go over humans or current go-playing machines) is higher than in most real-life problems. Go is as complicated as the opponent makes it. I would, for these reasons, in fact not be too surprised if the best physically realizable go-playing system at tournament time controls with hardware resources, say, equivalent to a modern-day data center would include a general intelligence (that would likely adjust parameters or code in a more specialized go-player on the fly, when the need arises).
None of which really contradicts what I said.
A general AGI (à la AIXI) requires a predictive world model and a planning system. The compute cost scales with the cost of the world model.
It takes only a tiny NN to perfectly predict the ‘world’ of Go. Also, neural networks can and do implement search, and MCT search is rather obviously not the optimal scalable planning algorithm across all worlds/situations. Finally, the real world is not only more complex in terms of size, but also unknown and stochastic, all of which greatly reduces the payoff of MCT style deep tree search planning.
If you understand the key of my point and still disagree, why is it that small NNs + MCT fail to scale to more complex environments? What is your alternate explanation for why they have not already produced superintelligence—let alone AGI?
I would disagree with the notion that the cost of mastering a world scales with the cost of the world model. For instance, the learning with errors problem has a completely straightforward mathematical description, and yet strong quantum-resistant public-key cryptosystems can be built on it; there is every possibility that even a superintelligence a million years from now will be unable to read a message encrypted today using AES-256 encapsulated using a known Kyber public key with conservatively chosen security parameters.
Similarly, it is not clear to me at all what is even meant by saying that a tiny neural network can perfectly predict the “world” of Go. I would expect that even predicting the mere mechanics of the game, for instance determining that a group has just been captured by the last move of the opponent, will be difficult for small neural networks when examples are adversarially chosen (think of a group that snakes around the whole board, overwhelming the small NN capability to count liberties). The complexity of determining consequences of actions in Go is much more dependent on the depth of the required search than on the size of the game state, and it is easy to find examples on the 19x19 standard board size that will overwhelm any feed-forward neural network of reasonable size (but not necessarily networks augmented with tree search).
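The capture rule referred to above is short to state as ordinary code, as in the following minimal sketch. It assumes a hypothetical board encoding (rows as strings over 'X', 'O' and '.' for empty) and makes no claim about what any particular neural network can or cannot compute. It only shows that the number of sequential steps in the flood fill grows with the size of the group, which is the sense in which a long snaking group requires depth.

```python
def group_and_liberties(board: list[str], row: int, col: int) -> tuple[set, set]:
    """Flood-fill the group containing (row, col); return its stones and liberties."""
    color = board[row][col]
    if color == '.':
        raise ValueError("no stone at that point")
    stones, liberties, stack = {(row, col)}, set(), [(row, col)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(board) and 0 <= nc < len(board[0]):
                if board[nr][nc] == '.':
                    liberties.add((nr, nc))       # adjacent empty point
                elif board[nr][nc] == color and (nr, nc) not in stones:
                    stones.add((nr, nc))          # same-colored neighbor joins the group
                    stack.append((nr, nc))
    return stones, liberties

def is_captured(board: list[str], row: int, col: int) -> bool:
    """A group is captured when it has no liberties left."""
    return not group_and_liberties(board, row, col)[1]

# Hypothetical 5x5 position: a snaking white ('O') group fully enclosed by black ('X').
board = [
    "XXXX.",
    "OOOX.",
    "XXOX.",
    "XOOX.",
    "XXXX.",
]
print(is_captured(board, 1, 0), is_captured(board, 0, 0))
```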
With regards to FOOM, I agree that doom from foom seems like an unlikely prospect (mainly due to diminishing returns on the utility of intelligence in many competitive settings) and I would agree that FOOM would require some experimental loop to be closed, which will push out time scales. I would also agree that the example of Go does not show what Yudkowsky thinks it does (it does help that this is a small world where it is feasible to do large reinforcement learning runs, and even then, Go programs have mostly confirmed human strategy, not totally upended it). But the possibility that if an unaided large NN achieved AGI or weak ASI, it would then be able to bootstrap itself to a much stronger level of ASI in a relatively short time (similar to the development cycle timeframe that led to the AGI/weak ASI itself; but involving extensive experimentation, so neither undetectable nor done in minutes or days) by combining improved algorithmic scaffolding with a smaller/faster policy network does not seem outlandish to me.
Lastly, I would argue that foom is in fact an observable phenomenon today. We see self-reinforcing, rapid, sudden onset improvement every time a neural network during training discovers a substantially new capability and then improves on it before settling into a new plateau. This is known as grokking and well-described in the literature on neural networks; there are even simple synthetic problems that produce a nice repeated pattern of grokking at successive levels of performance when a neural network is trained to solve them. I would expect that fooming can occur at various scales. However, I find the case that a large grokking step automatically happens when a system approaches human-level competence on general problems unconvincing (on the other hand, of course a large grokking step could happen in a system already at human-level competence by chance or happenstance and push into the weak ASI regime in a short time frame).
By world model I specifically meant a model of the world physics. For chess/go this is just a tiny amount of memory to store the board state, and a simple set of rules that are very fast to evaluate. I agree that evaluating the rules of go is a bit more complex than chess, especially in edge cases, but still enormously simpler than evaluating the physics of the real world.
I think we probably agree about grokking in NNs but I doubt that EY would describe that as foom.
I don’t know much about Leela Zero and Katago but I do know that Leela Chess Zero (lc0) without search (pure policy) is near superhuman levels. I’ll see if I can dig up more precise specifics.
The LC0 pure policy is most certainly not superhuman. To test this, I just had it (network 791556, i.e. standard network of the current LC0 release) play a game against a weak computer opponent (Shredder Chess Online). SCO plays maybe at the level of a strong expert/weak candidate master at rapid chess time controls (but it plays a lot faster, thereby making generation of a test game more convenient than trying to beat policy-only lc0 myself, which I think should be doable). Result was draw, after lc0 first completely outplayed SCO positionally, and then blundered tactically in a completely won position, with a strong-looking move that had a simple tactical refutation. It then probably still had a very slight advantage, but opted to take a draw by repetition.
I think policy-only lc0 plays worse relative to strong humans than Katago/LeelaZero in Go. I would attribute this to chess being easier to lose by tactical blunder than Go.
791556 is nowhere near the strongest network available. It’s packaged with lc0 as a nice small net. The BT2 net currently playing at tcec-chess.com is several hundreds of Elo stronger than T79 and likely close to superhuman level, depending on the time control. It’s not the very latest and greatest, but it is publicly available for download and should work with the 0.30.0-rc1 pre-release version of lc0 that supports the newer transformer architecture if you want to try it yourself. If you only want completely “official” nets, at least grab one of the latest networks from the main T80 run.
I’m not confident that BT2 is strictly superhuman using pure policy but I’m pretty sure it’s at least close. LazyBot is a Lichess bot that plays pure policy but uses a T80 net that is likely at least 100 Elo weaker than BT2.
Thanks for the information. I’ll try out BT2. Against LazyBot I was just then able to get a draw in a blitz game with 3 seconds increment, which I don’t think I could do within a few tries against an opponent of, say, low grandmaster strength (with low grandmaster strength still being quite far away from superhuman). Since pure policy does not improve with thinking time, I think my chances would be much better at longer time controls. Certainly its lichess rating at slow time controls suggests that T80 is not more than master strength when its human opponents have more than 15 minutes for the whole game.
Self-play elo vastly exaggerates playing strength differences between different networks, so I would not expect a BT2 vs T80 difference of 100 elo points to translate to close to 100 elo playing strength difference against humans.
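For reference, the standard Elo model converts a rating gap into an expected score with a logistic curve, as sketched below. This is only the textbook formula. It says what a 100-point gap would mean within the pool where the ratings were measured, and by itself says nothing about how a self-play gap carries over to games against humans, which is the point in dispute here.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the player rated
    `rating_diff` Elo points above the opponent, under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400))

for diff in (0, 100, 400):
    print(diff, round(expected_score(diff), 3))
```

Under this formula a 100-point gap corresponds to an expected score of roughly 0.64, so if self-play inflates the gap, the real expected score against a human-calibrated pool would be closer to 0.5 than that.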
Yes, clearly the less time the human has, the better Leela will do relatively. One thing to note though is that Lichess Elo isn’t completely comparable across different time controls. If you look at the player leaderboard, you can see that the top scores for bullet are ~600 greater than for classical, so scores need to be interpreted in context.
Self-Elo inflation is a fair point to bring up and I don’t have information on how well it translates.
I would still worry about software efficiency not holding for an AGI that starts from just a couple of OOM advantage because of higher frequency.
This seems to me as claiming that an AI that has some or all of the boundary conditions of the human brain can’t become any more efficient with respect to its power requirements, rather than saying that it’s theoretically impossible to construct a computer smarter than human brain that requires less power, which is what EY’s statement was about.
Which specific statement? There are a few.
Also, the brain being Pareto efficient would mean that one property of the brain can’t be improved without another property worsening. It wouldn’t mean there is no n-tuple of values of the n properties such that, if those properties had those values, the brain would become more intelligent with the same power requirements.
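One way to make the distinction above concrete is to note that Pareto efficiency is always relative to a feasible set of designs. The sketch below uses made-up designs scored as (intelligence, negative power draw), both higher-is-better. The numbers are hypothetical and only illustrate that a design on the frontier of a constrained set can be dominated once the set is enlarged.

```python
def dominates(a: tuple, b: tuple) -> bool:
    """a dominates b if it is at least as good on every property and strictly
    better on at least one (all properties are higher-is-better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(designs: list[tuple]) -> list[tuple]:
    """Designs that no other design in the same set dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs if o != d)]

# Hypothetical (intelligence, -watts) scores under some fixed boundary conditions.
constrained = [(1.0, -20.0), (0.8, -10.0)]
# Dropping a boundary condition admits a new hypothetical design.
enlarged = constrained + [(1.2, -20.0)]

print(pareto_front(constrained))
print(pareto_front(enlarged))
```

In the constrained set both designs are on the frontier. In the enlarged set the first is dominated by a design with more intelligence at the same power, so being Pareto efficient under one set of constraints does not rule out improvement under another.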