I have a compute-market startup called vast.ai, and I’m working towards aligned AI. Currently seeking networking, collaborators, and hires—especially top notch cuda/gpu programmers.
My personal blog: https://entersingularity.wordpress.com/
I support this and will match the $250 prize.
Here are the central background ideas/claims:
1.) Computers are built out of components which are themselves just simpler computers; this recursion bottoms out, at the limits of miniaturization, in minimal molecular-scale (few nm) computational elements (cellular automata/tiles). Further shrinkage is believed impossible in practice due to various constraints (overcoming these constraints, if even possible, would require very exotic far-future tech).
2.) At this scale the Landauer bound represents the ambient-temperature-dependent noise (which can also manifest as a noise voltage). Reliable computation at speed is only possible using non-trivial multiples of this base energy, for the simple reasons described by Landauer and elaborated on in the other refs in my article (see the quick numbers sketched after these claims).
3.) Components can be classified as computing tiles or interconnect tiles, but the latter is simply a computer which computes the identity function while moving the input to an output in some spatial direction. Interconnect tiles can be irreversible or reversible, but the latter has enormous tradeoffs in size (i.e. optical) and/or speed or other variables and is thus not used by brains or GPUs/CPUs.
4.) Fully reversible computers are possible in theory but have enormous negative tradeoffs in size/speed due to 1.) the need to avoid erasing bits throughout intermediate computations, 2.) the lack of immediate error correction (achieved automatically in dissipative interconnect by erasing at each cycle) leading to error build-up which must be corrected/erased (costing energy), 3.) high sensitivity to noise/disturbance due to (2).
And the brain vs computer claims:
5.) The brain is near the pareto frontier for practical 10W computers, and makes reasonably good tradeoffs between size, speed, heat and energy as a computational platform for intelligence
6.) Computers are approaching the same pareto frontier (although currently in a different region of design space) - shrinkage is nearing its end
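For concreteness, here is a minimal numeric sketch of the Landauer figure behind claim 2. The constants are standard physics; the ~100x reliability multiplier is an assumed round number standing in for the "non-trivial multiples" above, not a precise value.

```python
import math

k_B = 1.380649e-23                      # Boltzmann constant, J/K
T = 300.0                               # ambient temperature, K

E_landauer = k_B * T * math.log(2)      # minimum energy per bit erasure at temperature T
E_reliable = 100 * E_landauer           # assumed ~100x margin for fast, reliable switching

eV = 1.602176634e-19
print(f"Landauer bound at 300 K: {E_landauer:.2e} J ({E_landauer / eV:.3f} eV)")
print(f"with ~100x margin:       {E_reliable:.2e} J ({E_reliable / eV:.2f} eV)")
```

The bound itself is only ~0.02 eV per bit erasure at room temperature; the argument is that practical, fast, reliable irreversible switching ends up paying a sizeable multiple of that.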
Remembering and imagining share the same pathways and are difficult to distinguish at the neuro-circuit level. The idea of recovered memories was already discredited decades ago after the peak of the satanic ritual abuse hysteria/panic of the 80′s. At its peak some parents were jailed based on testimonies of children, children that had been coerced (both deliberately and indirectly) into recounting fantastical, increasingly outlandish tales of satanic baby-eating rituals. The FBI eventually investigated and found zero evidence, but the turning point was when some lawyers and psychiatrists started winning lawsuits against the psychologists and social workers at the center of the recovered memory movement.
Memories change every time they are rehearsed/reimagined; the magnitude of such change varies and can be significant, and the thin separation between imaginings (imagined memories, memories/stories of others, etc) and ‘factual’ memories doesn’t really erode so much as not really exist in the first place.
Nonetheless, some people’s detailed memories from childhood are probably largely accurate, but some detailed childhood memories are complete confabulations based on internalization of external evidence, some are later confabulations based on attempts to remember or recall and extensive dwelling on the past, and some are complete fiction. There is no way with current tech to distinguish between these, even for the rememberer.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
I am genuinely curious and confused as to what exactly you concretely imagine this supposed ‘superintelligence’ to be, such that it is not already the size of a factory, such that you mention “size of a factory” as if that is something actually worth mentioning—at all. Please show at least your first-pass Fermi estimates for the compute requirements. By that I mean—what are the compute requirements for the initial SI—and then the later presumably more powerful ‘factory’?
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don’t understand why energy density matters very much.
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
This is so wildly ridiculous that you really need to show your work. I have already shown some calculations in these threads, but I’ll quickly review here.
A quick Google search indicates 1GW is a typical power plant output, which in theory could power a roughly million-GPU datacenter. This is almost 100 times larger in power consumption than the current largest official supercomputer: Frontier—which has about 30k GPUs. The supercomputer used to train GPT-4 is somewhat of a secret, but estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
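To make the arithmetic explicit, a minimal sketch; the ~1 kW all-in power per GPU (accelerator plus host, networking, and cooling overhead) is an assumption, not a measured figure.

```python
plant_power_w = 1e9      # ~1 GW, typical large power plant output
watts_per_gpu = 1_000    # assumed all-in power per enterprise GPU (host, networking, cooling included)

gpus_powered = plant_power_w / watts_per_gpu
print(f"GPUs a 1 GW plant could power: ~{gpus_powered:.0e}")   # ~1e6, i.e. a million-GPU datacenter
```

How many times larger than a Frontier-scale ~30k GPU cluster this comes out to depends on whether you compare GPU counts or total power draw, but it is a large multiple either way.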
Nvidia currently produces less than 100k high end enterprise GPUs per year in total, so you can’t even produce this datacenter unless Nvidia grows by about 10x and TSMC grows by perhaps 2x.
The datacenter would likely cost over a hundred billion dollars, and the resulting models would be proportionally more expensive to run, such that it’s unclear whether this would be a win (at least using current tech). Sure I do think there is some room for software improvement.
But no, I do not think that this hypothetical, not-currently-achievable GPT-5—even if you were running 100k instances of it—would “likely be easily capable of disempowering humanity”.
Of course if we talk longer term, the brain is obviously evidence that one human brain’s worth of compute can be achieved in about 10 watts, so the 1GW power plant could support a population of 100 million uploads or neuromorphic AGIs. That’s very much part of my model (and Hanson’s, and Moravec’s) - eventually.
Remember this post is all about critiquing EY’s specific doom model which involves fast foom on current hardware through recursive self-improvement.
Having more room at the bottom is just one of a long list of ways to end up with AIs much smarter than humans. Maybe you have rebuttals to all the other ways AIs could end up much smarter than humans
If you have read much of my writings, you should know that I believe it’s obvious we will end up with AIs much smarter than humans—but mainly because they will run faster using much more power. In fact this prediction has already come to pass in a limited sense—GPT-4 was probably trained on over 100 human lifetimes’ worth of virtual time/data using only about 3 months of physical time, which represents a 10,000x time dilation (but thankfully only for training, not for inference).
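A minimal sketch of where a figure like 10,000x comes from; the ~25 years of experience per “lifetime” is an assumed round number, not a measured one.

```python
lifetimes = 100                # rough order-of-magnitude estimate for GPT-4 training data
years_per_lifetime = 25        # assumed round number for a lifetime of experience
training_time_years = 0.25     # ~3 months of physical training time

dilation = lifetimes * years_per_lifetime / training_time_years
print(f"effective time dilation during training: ~{dilation:,.0f}x")   # ~10,000x
```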
Your section on the physical limits of hardware computation .. is naive; the dominant energy cost is now interconnect (moving bits), not logic ops. This is a complex topic and you could use more research and references from the relevant literature; there are good reasons why the semiconductor roadmap has ended and the perception in industry is that Moore’s Law is finally approaching its end. For more info see this, with many references.
To the connectivists such as myself, your point 0 has seemed obvious for a while, so the EY/MIRI/LW anti-neural net groupthink was/is a strong sign of faulty beliefs. And saying “oh but EY/etc didn’t really think neural nets wouldn’t work, they just thought other paradigms would be safer” doesn’t really help much if no other paradigms ever had a chance. Underlying much of the rationalist groupthink on AI safety is a set of correlated incorrect anti-connectivist beliefs which undermines much of the standard conclusions.
Given that a high-stakes, all-out arms race for frontier foundation AGI models is heating up between the major powers, and Meta’s public models are trailing—it doesn’t seem clear at all that open-sourcing them is net safety negative. One could argue the benefits of having wide access for safety research, along with tilting the world towards multi-polar scenarios, outweigh the (more minimal) risks.
The merit of this post is to taboo nanotech. Practical bottom-up nanotech is simply synthetic biology, and practical top-down nanotech is simply modern chip lithography. So:
1.) can an AI use synthetic bio as a central ingredient of a plan to wipe out humanity?
Sure.
2.) can an AI use synthetic bio or chip litho as a central ingredient of a plan to operate perpetually in a world without humans?
Sure.
But doesn’t sound as exciting? Good.
ANNs and BNNs operate on the same core principles; the scaling laws apply to both, and IQ in either is mostly a function of net effective training compute and data quality. Genes determine a brain’s architectural prior just as a small amount of Python code determines an ANN’s architectural prior, but the capabilities come only from scaling with compute and data (quantity and quality).
So you absolutely can not take datasets of gene-IQ correlations and assume those correlations would somehow transfer to gene interventions on adults (post training in DL lingo). The genetic contribution to IQ is almost all developmental/training factors (architectural prior, learning algorithm hyper params, value/attention function tweaks, etc) which snowball during training. Unfortunately developmental windows close and learning rates slow down as the brain literally carves/prunes out its structure, so to the extent this could work at all, it is mostly limited to interventions on children and younger adults who still have significant learning rate reserves.
But it ultimately doesn’t matter, because the brain just learns too slowly. We are now soon past the point at which human learning matters much.
There are some recent papers—see discussion here - showing that there is a g factor for LLMs, and that it is more predictive than g in humans/animals.
Utilizing factor analysis on two extensive datasets—Open LLM Leaderboard with 1,232 models and General Language Understanding Evaluation (GLUE) Leaderboard with 88 models—we find compelling evidence for a unidimensional, highly stable g factor that accounts for 85% of the variance in model performance. The study also finds a moderate correlation of .48 between model size and g.
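As a rough illustration of the method (not a reproduction of the cited study), here is how a single g-like factor falls out of a models x benchmarks score matrix; the data below is synthetic, generated with one latent ability per model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_benchmarks = 300, 8

# Synthetic scores: one latent ability ("g") per model, each benchmark loading on it plus noise.
g = rng.normal(size=(n_models, 1))
loadings = rng.uniform(0.7, 0.95, size=(1, n_benchmarks))
scores = g @ loadings + 0.3 * rng.normal(size=(n_models, n_benchmarks))

# Single-factor extraction via PCA on the centered score matrix.
centered = scores - scores.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / (s**2).sum()
print(f"first factor explains {explained[0]:.0%} of the variance")   # high, g-like
```

With real leaderboard data the same procedure gives the high first-factor variance share the abstract reports.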
I’m here pretty much just for the AI related content and discussion, and only occasionally click on other posts randomly: so I guess I’m part of the problem ;). I’m not new, I’ve been here since the beginning, and this debate is not old. I spend time here specifically because I like the LW format/interface/support much better than reddit, and LW tends to have a high concentration of thoughtful posters with a very different perspective (which I tend to often disagree with, but that’s part of the fun). I also read /r/MachineLearning/ of course, but it has different tradeoffs.
You mention filtering for Rationality and World Modeling under More Focused Recommendations—but perhaps LW could go farther in that direction? Not necessarily full subreddits, but it could be useful to have something like per user ranking adjustments based on tags, so that people could more configure/personalize their experience. Folks more interested in Rationality than AI could uprank and then see more of the former rather than the latter, etc.
AI needs Rationality, in particular. Not everyone agrees that rationality is key, here (I know one prominent AI researcher who disagreed).
There is still a significant—and mostly unresolved—disconnect between the LW/Alignment and mainstream ML/DL communities, but the trend is arguably looking promising.
I think in some sense The Sequences are out of date.
I would say “tragically flawed”: noble in their aspirations and very well written, but overconfident in some key foundations. The sequences make some strong assumptions about how the brain works and thus the likely nature of AI, assumptions that have not aged well in the era of DL. Fortunately the sequences also instill the value of updating on new evidence.
I commend this comment and concur with the importance of hardware, the straw-manning of Moravec, etc.
However I do think that EY had a few valid criticisms of Ajeya’s model in particular—it ends up smearing probability mass over many anchors or sub-models, most of which are arguably poorly grounded in deep engineering knowledge. And yes you can use it to create your own model, but most people won’t do that and are just looking at the default median conclusion.
Moore’s Law is petering out as we run up against the constraints of physics for practical irreversible computers, but the brain is also—at best—already at those same limits. So that should substantially reduce uncertainty concerning the hardware side (hardware parity now/soon), and thus place most of the uncertainty around software/algorithm iteration progress. The important algorithmic advances tend to change asymptotic scaling curvature rather than progress linearly, and really all the key uncertainty is over that—which I think is what EY is gesturing at, and rightly so.
Said pushback is based on empirical studies of how the most powerful AIs at our disposal currently work, and is supported by fairly convincing theoretical basis of its own. By comparison, the “canonical” takes are almost purely theoretical.
You aren’t really engaging with the evidence against the purely theoretical canonical/classical AI risk take. The ‘canonical’ AI risk argument is implicitly based on a set of interdependent assumptions/predictions about the nature of future AI:
1.) fast takeoff is more likely than slow, downstream dependent on some combo of:
 - continuation of Moore’s Law
 - feasibility of hard ‘diamondoid’ nanotech
 - brain efficiency vs AI
 - AI hardware (in)-dependence
2.) the inherent ‘alien-ness’ of AI and AI values
3.) supposed magical coordination advantages of AIs
4.) arguments from analogies: namely evolution
These arguments are old enough that we can now update based on how the implicit predictions of the implied worldviews turned out. The traditional EY/MIRI/LW view has not aged well, which in part can be traced to its dependence on an old flawed theory of how the brain works.
For those who read HPMOR/LW in their teens/20′s, a big chunk of your worldview is downstream of EY’s and the specific positions he landed on with respect to key scientific questions around the brain and AI. His understanding of the brain came almost entirely from ev psych and cognitive biases literature and this model in particular—evolved modularity—hasn’t aged well and is just basically wrong. So this is entangled with everything related to AI risk (which is entirely about the trajectory of AI takeoff relative to human capability).
It’s not a coincidence that many in DL/neurosci have a very different view (shards etc). In particular, the Moravec view that AI will come from reverse engineering the brain, and that progress is entirely hardware constrained and thus very smooth and predictable, turned out to be mostly correct. (His late-90′s prediction of AGI around 2028 is especially prescient.)
So it’s pretty clear EY/LW was wrong on 1. - the trajectory of takeoff and path to AGI, and Moravec et al was correct.
Now as the underlying reasons are entangled, Moravec et al was also correct on point 2 - AI from brain reverse engineering is not alien! (But really that argument was just weak regardless.) EY did not seriously consider that the path to AGI would involve training massive neural networks to literally replicate human thoughts.
Point 3 isn’t really taken seriously outside of the small LW sphere. By the very nature of alignment being a narrow target, any two random unaligned AIs are especially unlikely to be aligned with each other. The idea of a magical coordination advantage is based on highly implausible code sharing premises (sharing your source code is generally a very bad idea, and regardless doesn’t and can’t actually prove that the code you shared is the code actually running in the world—the grounding problem is formidable and unsolved).
The problem with 4 - the analogy from evolution—is that it factually contradicts the doom worldview: evolution succeeded in aligning brains to IGF well enough despite a huge takeoff in the speed of cultural evolution over genetic evolution, as evidenced by the fact that humans have one of the highest fitness scores of any species ever, and almost certainly the fastest growing fitness score.
The Wright Brothers calculated that their plane would fly—before it ever flew—using reasoning that took no account whatsoever of their aircraft’s similarity to a bird. They did look at birds (and I have looked at neuroscience) but the final calculations did not mention birds (I am fairly confident in asserting). A working airplane does not fly because it has wings “just like a bird”.
Actually the Wright brothers’ central innovation and the centerpiece of the later aviation patent wars—wing-warping based flight control—was literally directly copied from birds. It involved just about zero aerodynamics calculations. Moreover their process didn’t involve much “calculation” in general; they downloaded a library of existing flyer designs from the Smithsonian and then developed a wind tunnel to test said designs at high throughput before selecting a few for full-scale physical prototypes. Their process was light on formal theory and heavy on experimentation.
The AIs most capable of steering the future will naturally tend to have long planning horizons (low discount rates), and thus will tend to seek power(optionality). But this is just as true of fully aligned agents! In fact the optimal plans of aligned and unaligned agents will probably converge for a while—they will take the same/similar initial steps (this is just a straightforward result of instrumental convergence to empowerment). So we may not be able to distinguish between the two, they both will say and appear to do all the right things. Thus it is important to ensure you have an alignment solution that scales, before scaling.
To the extent I worry about AI risk, I don’t worry much about sudden sharp left turns and nanobots killing us all. The slower accelerating turn (as depicted in the film Her) has always seemed more likely—we continue to integrate AI everywhere and most humans come to rely completely and utterly on AI assistants for all important decisions, including all politicians/leaders/etc. Everything seems to be going great, the AI systems vasten, growth accelerates, etc, but there is mysteriously little progress in uploading or life extension, the decline in fertility accelerates, and in a few decades most of the economy and wealth is controlled entirely by de novo AI; bio humans are left behind and marginalized. AI won’t need to kill humans just as the US doesn’t need to kill the sentinelese. This clearly isn’t the worst possible future, but if our AI mind children inherit only our culture and leave us behind it feels more like a consolation prize vs what’s possible. We should aim much higher: for defeating death, across all of time, for resurrection and transcendence.
But how did you determine you were probably “piloting an airliner that has lost all control”?
No I do not.
It’s like EY is claiming that an upcoming nuclear bomb test is going to light the atmosphere on fire, and I’m showing my calculations indicating that it will not. I do not intend or need to show that no future tech could ever ignite the atmosphere.
EY’s doom model—or more accurately my model of his model—is one where in the near future an AGI not much smarter than us running on normal hardware (e.g. GPUs) “rewrites its own source code”, resulting in a noticeably more efficient AI which then improves the code further and so on, bottoming out in many OOM improvement in efficiency and then strong nanotech killing us.
I don’t think EY’s argument rests on the near-term viability of exotic (reversible or quantum) computing, and if it did that would be a weakness regardless. Analyzing the engineering feasibility and limits of just conventional computing was already an extensive full-length post; analyzing the feasibility of reversible computing is more complex, but in short it’s not even clear/accepted in the engineering community that reversible computers are viable in practice. To a first approximation reversible computing is the field of a single lone researcher and some grad students (Mike Frank).
He writes that the human brain has “1e13-1e15 spikes through synapses per second (1e14-1e15 synapses × 0.1-1 spikes per second)”. I think Joe was being overly conservative, and I feel comfortable editing this to “1e13-1e14 spikes through synapses per second”, for reasons in this footnote→[9].
I agree that 1e14 synaptic spikes/second is the better median estimate, but those are highly sparse ops.
So when you say:
So I feel like 1e14 FLOP/s is a very conservative upper bound on compute requirements for AGI. And conveniently for my narrative, that number is about the same as the 8.3e13 FLOP/s that one can perform on the RTX 4090 retail gaming GPU that I mentioned in the intro.
You are missing some foundational differences in how von neumann arch machines (GPUs) run neural circuits vs how neuromorphic hardware (like the brain) runs neural circuits.
The 4090 can hit around 1e14 - even up to 1e15 - flops/s, but only for dense matrix multiplication. The flops required to run a brain model using that dense matrix hardware are more like 1e17 flops/s, not 1e14 flops/s. The 1e14 synapses are at least 10x locally sparse in the cortex, so dense emulation requires 1e15 synapses (mostly zeroes) running at 100Hz. The cerebellum is actually even more expensive to simulate .. because of the more extreme connection sparsity there.
But that isn’t the only performance issue. The GPU only achieves those flops on matrix-matrix multiplication, not the vector-matrix multiplication needed here. So in that sense the dense flop perf is useless, and the perf would instead be RAM bandwidth limited and require ~100 4090′s to run a single 1e14-synapse model—as it requires about 1 byte of memory bandwidth per flop, so 1e14 bytes/s vs the 4090′s 1e12 bytes/s.
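Putting those numbers in one place, a minimal sketch; these are the same rough estimates used above (1e14 synapses, ~10x cortical sparsity, ~100Hz, ~1 byte fetched per synaptic op), not independent measurements.

```python
synapses = 1e14                # median synapse count estimate from above
cortical_sparsity = 10         # dense emulation must carry ~10x zero weights
update_rate_hz = 100           # effective dense update rate

dense_flops = synapses * cortical_sparsity * update_rate_hz    # ~1e17 flop/s as dense matmul

synaptic_spikes_per_s = 1e14   # sparse ops: ~1e14 synapses x ~1 spike/s
bytes_per_op = 1               # assumed weight fetch per synaptic op (vector-matrix case)
bandwidth_needed = synaptic_spikes_per_s * bytes_per_op        # ~1e14 bytes/s

rtx4090_bandwidth = 1e12       # ~1 TB/s of memory bandwidth
print(f"dense emulation:      ~{dense_flops:.0e} flop/s")
print(f"bandwidth-bound case: ~{bandwidth_needed / rtx4090_bandwidth:.0f}x a single 4090")
```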
Your reply seems to be “but the brain isn’t storing 1e14 bytes of information”, but as other comments point out that has little to do with the neural circuit size.
The true fundamental information capacity of the brain is probably much smaller than 1e14 bytes, but that has nothing to do with the size of an actually *efficient* circuit, because efficient circuits (efficient for runtime compute, energy etc) are never also efficient in terms of information compression.
This is a general computational principle, with many specific examples: compressed neural frequency encodings of 3D scenes (NeRFs), which access/use all network parameters to decode a single point (O(N)), are enormously less computationally efficient (runtime throughput, latency, etc) than maximally sparse representations (using trees, hash tables etc) which approach O(log(N)) or O(1), but the sparse representations are enormously less compressed/compact. These tradeoffs are foundational and unavoidable.
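A toy sketch of the tradeoff (purely illustrative, not a NeRF implementation): a compact parametric encoding touches all N parameters to decode one point, while a precomputed table answers each query in O(1) at the cost of much more storage.

```python
import numpy as np

N = 1024
coeffs = np.random.rand(N) / np.arange(1, N + 1) ** 2     # compact representation: N parameters

def compressed_eval(x: float) -> float:
    # O(N): every parameter participates in decoding a single point
    return float(np.polyval(coeffs[::-1], x))

grid = np.linspace(-1.0, 1.0, 64 * N)                      # 64x more storage than the coefficients
table = np.polyval(coeffs[::-1], grid)                     # pay the full decode cost once, up front

def tabulated_eval(x: float) -> float:
    # O(1): a single lookup per query, from a much larger representation
    i = min(np.searchsorted(grid, x), len(grid) - 1)
    return float(table[i])
```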
We also know that in many cases the brain and some ANN are actually computing basically the same thing in the same way (LLMs and linguistic cortex), and it’s now obvious and uncontroversial that the brain is using the sparser but larger version of the same circuit, whereas the LLM ANN is using the dense version which is more compact but less energy/compute efficient (as it uses/accesses all params all the time).
I’m not too concerned about the karma—more the lack of interesting replies and general unjustified holier-than-thou attitude. This idea is different than “that alien message” and I didn’t find a discussion of this on LW (not that it doesn’t exist—I just didn’t find it).
This is not my first post.
I posted this after I brought up the idea in a comment which at least one person found interesting.
I have spent significant time reading LW and associated writings before I ever created an account.
I’ve certainly read the AI-in-a-box posts, and the posts theorizing about the nature of smarter-than-human-intelligence. I also previously read “that alien message”, and since this is similar I should have linked to it.
I have a knowledge background that leads to somewhat different conclusions about A. the nature of intelligence itself, B. what ‘smarter’ even means, etc etc
Different backgrounds, different assumptions, so I listed my background and starting assumptions as they somewhat differ from the LW norm.
Back to 3:
Remember, the whole plot device of “that alien message” revolved around a large and obvious grand reveal by the humans. If information can only flow into the sim world once (during construction), and then ever after can only flow out of the sim world, that plot device doesn’t work.
Trying to keep an AI boxed up where the AI knows that you exist is a fundamentally different problem than a box where the AI doesn’t even know you exist, doesn’t even know it is in a box, and may provably not even have enough information to know for certain whether it is in a box.
For example, I think the simulation argument holds water (we are probably in a sim), but I don’t believe there is enough information in our universe for us to discover much of anything about the nature of a hypothetical outside universe.
This of course doesn’t prove that my weak or strong Mind Prison conjectures are correct, but it at least reduces the problem down to “can we build a universe sim as good as this?”
The problem is not that we don’t know how to prevent power-seeking or instrumental convergence, because we want power-seeking and instrumental convergence.
Yes, this is still underappreciated in most alignment discourse, perhaps because power-seeking has unfortunate negative connotations. A better less loaded term might be Optionality-seeking. For example human friendships increase long term optionality (more social invites, social support, dating and business opportunities, etc), so a human trading some wealth for activities that increase and strengthen friendships can be instrumentally rational for optionality-maximizing empowerment, even though that doesn’t fit the (incorrect) stereotype of ‘power-seeking’.
The problem is that we don’t know how to align this power-seeking, how to direct the power towards what we want, rather than having side-effects that we don’t want.
Well if humans are also agents for which instrumental convergence applies, as you suggest here:
Imitation learning is useful due to Aumann’s Agreement Theorem and because instrumental convergence also applies to human intelligence
Then that suggests that we can use instrumental convergence to help solve alignment, because optimizing for human empowerment becomes equivalent to optimizing for our unknown long term values.
There are some caveats of course: we may still need to incorporate some model of short term values like hedonic reward, and it’s also important to identify the correct agency to empower which is probably not as simple as individual human brains. Humans are not purely selfish rational but instead are partially altruistic; handling that probably requires something like empowering humanity or generic agency more broadly, or empowering distributed software simulacra minds instead of brains.
Back when the sequences were written in 2007/2008 you could roughly partition the field of AI based on beliefs around the efficiency and tractability of the brain. Everyone in AI looked at the brain as the obvious single example of intelligence, but in very different lights.
If brain algorithms are inefficient and intractable[1] then neuroscience has little to offer, and instead more formal math/CS approaches are preferred. One could call this the rationalist approach to AI, or perhaps the “and everything else approach”. One way to end up in that attractor is by reading a bunch of ev psych; EY in 2007 was clearly heavily into Tooby and Cosmides, even if he has some quibbles with them on the source of cognitive biases.
From Evolutionary Psychology and the Emotions:
From the Psychological Foundations of Culture:
EY quotes this in LOGI, 2007 (p 4), immediately followed by:
Meanwhile in the field of neuroscience there was a growing body of evidence and momentum coalescing around exactly the “physics envy” approaches EY bemoans: the universal learning hypothesis, popularized to a wider audience in On Intelligence in 2004. It is pretty much pure tabula rasa, blank-slate, genericity and black-box.
The UL hypothesis is that the brain’s vast complexity is actually emergent, best explained by simple universal learning algorithms that automatically evolve all the complex domain specific circuits as required by the simple learning objectives and implied by the training data. (Years later I presented it on LW in 2015, and I finally got around to writing up the brain efficiency issue more recently—although I literally started the earlier version of that article back in 2012.)
But then the world did this fun experiment: the rationalist/non-connectivist AI folks got most of the attention and research money, but not all of it—and then various researcher groups did their thing and tried to best each other on various benchmarks. Eventually Nvidia released CUDA, a few connectivists ported ANN code to their gaming GPUs which started to break ImageNet, and then a little startup, founded with the mission of reverse engineering the brain by some folks who met in a neuroscience program, adapted that code to play Atari and later break Go; the rest is history—as you probably know.
Turns out the connectivists and the UL hypothesis were pretty much completely right after all—proven not only by the success of DL in AI, but also by how DL is transforming neuroscience. We know now that the human brain learns complex tasks like vision and language not through kludgy complex evolved mechanisms, but through the exact same simple approximate Bayesian (self-supervised) learning algorithms that drive modern DL systems.
The sequences and associated materials were designed to “raise the rationality water line” and ultimately funnel promising new minds into AI-safety. And there they succeeded, especially in those earlier years. Finding an AI safety researcher today who isn’t familiar with the sequences and LW .. well maybe they exist? But they would be unicorns. ML-safety and even brain-safety approaches are now obviously more popular, but there is still this enormous bias/inertia in AI safety stemming from the circa 2007 beliefs and knowledge crystallized and distilled into the sequences.
It’s also possible to end up in the “brains are highly efficient, but completely intractable” camp, which implies uploading as the most likely path to AI—this is where Hanson is—and closer to my beliefs circa 2000 ish before I had studied much systems neuroscience.