Types of takeoff
When I first heard and thought about AI takeoff, I found convincing the argument that as soon as an AI passed IQ 100, takeoff would become hyperexponentially fast: progress would speed up, which would then compound on itself, and so on. However, there are other possibilities.
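The compounding intuition can be made concrete with a toy model (my own illustrative assumptions, not anything from the thread): if research speed is simply proportional to current capability, growth is exponential; if capability also multiplies the effectiveness of each unit of research, you get dx/dt = x², which reaches a finite-time singularity. Notably, the two curves are almost indistinguishable early on, which is the "slow lead-in" point made further down.

```python
# Toy model of compounding AI progress (illustrative assumptions only).
# Exponential: dx/dt = x     -> x(t) = e^t, grows without bound but never diverges.
# Hyperbolic:  dx/dt = x**2  -> x(t) = 1/(1-t), finite-time singularity at t = 1.
def euler(rate, x0=1.0, dt=1e-4, t_max=0.99):
    """Forward-Euler integration of dx/dt = rate(x) from t=0 to t_max."""
    x, t = x0, 0.0
    while t < t_max:
        x += rate(x) * dt
        t += dt
    return x

exp_growth = euler(lambda x: x)        # close to e**0.99, i.e. about 2.7
hyper_growth = euler(lambda x: x * x)  # approaching the 1/(1-t) blow-up, far larger
print(exp_growth, hyper_growth)
```

At t = 0.1 the two trajectories differ by only a few percent; nearly all of the hyperbolic curve's growth is crammed into the final moments before the singularity.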
AGI is a barrier that requires >200 IQ to pass unless we copy biology?
Progress could be discontinuous: there could be IQ thresholds required to unlock better methods or architectures. Say we fixed our current compute capability; with fixed human intelligence we may not be able to figure out the formula for AGI, in the same way that combined human intelligence hasn't cracked many hard problems even with decades and the world's smartest minds working on them (open maths problems, quantum gravity...). This may seem unlikely for AI, but to illustrate the principle, say we only allowed people with IQ < 90 to work on AI. Progress would stall: IQ < 90 software developers couldn't unlock IQ > 90 AI. Can IQ 160 developers with our current compute hardware unlock IQ > 160 AI?
To me the reason we don't have AGI now is that the architecture is very data-inefficient and worse at generalization than the mammalian brain, for example a cortical column. I expect that if we knew the neural code and could copy it, we would get at least to very high human intelligence quickly, as we already have the compute.
From watching AI over my career, it seems that even the highest-IQ people and groups can't make progress by themselves without data, compute, and biology to copy for guidance, in contrast to other fields. For example, Einstein predicted gravitational waves long before they were discovered, but Turing and von Neumann didn't publish the Transformer architecture or suggest backpropagation. If we did not have access to neural tissue, would we have artificial neural networks at all? On a related note, I think there is an XKCD cartoon that says something like: the brain has to be so complex that it cannot understand itself.
(I now believe that progress in theoretical physics and pure maths is slowing to a stall, as further progress requires intellectual capacity beyond the combined ability of humanity. Without AI there will be no more major advances in physics, even with ~100 years spent on them.)
After AGI is there another threshold?
Let's say we do copy biology/solve AGI and with our current hardware can get >10,000 AGI agents with IQs at least matching the smartest humans. They then optimize the code so that there are 100K agents on the same resources, but then optimization stalls. The AI wouldn't know whether that was because it had optimized as much as possible, or because it lacked the ability to find a better optimization.
Does our current system scale to AGI with 1GW/1 million GPUs?
Let's say we don't copy biology, but scaling our current systems to 1GW/1 million GPUs and optimizing for a few years gets us to IQ 160 at all tasks. We would have an inferior architecture, compensated for by a massive increase in energy/FLOPS compared to the human brain. Progress could theoretically stall at upper-level human IQ for a time rather than take off (though I think this isn't very likely). There would of course be a significant overhang, where capabilities would increase suddenly once a better architecture was found and applied to the data center hosting the AI.
Related note—why 1GW data centers won’t be a consistent requirement for AI leadership.
Based on this, a 1GW or similar data center isn't useful or necessary for long. If it doesn't give a significant increase in capabilities, it won't be cost-effective. If it does, the resulting AI would optimize itself so that such power isn't needed anymore. Only within a narrow range of capability increase does it actually stay around.
To me the merits of the Pause movement and training compute caps are unclear. Someone here made the case that compute caps could actually speed up AGI, as people would then pay more attention to finding better architectures rather than throwing resources into scaling existing inferior ones. All things considered, however, I can see a lot of downsides to large data centers and little upside. I see a specific possibility where they are built, don't deliver the economic justification, fall sharply in value, and are then sold to owners who are not into cutting-edge AI. Then, when the more efficient architecture is discovered, those data centers suddenly become very powerful without preparation. Worldwide caps on total GPU production would also help reduce similar overhang possibilities.
Grothendieck and von Neumann were built using evolution, not deep basic science or even engineering. So in principle all that’s necessary is compute, tinkering, and evals, everything else is about shortening timelines and reducing requisite compute.
Any form of fully autonomous industry lets compute grow very quickly, in a way not constrained by human population, and only requires AI with ordinary engineering capabilities. Fusion and macroscopic biotech[1] (or nanotech) potentially get compute to grow much faster than that. To the extent human civilization would hypothetically get there in 100-1000 years without general AI, serial speedup alone might be able to get such tech via general AIs within years, even without superintelligence.
Drosophila biomass doubles every 3 days. Small things can quickly assemble into large things, transforming through metamorphosis. This is proven technology; it doesn't depend on untested ideas about what is possible, the way nanotech does. Industry and compute that double every 3 days can quickly eat the Solar System.
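As a rough back-of-the-envelope (my numbers, not the commenter's): each doubling is one factor of 2, so going from 1 kg of self-replicating industry to roughly the mass of the Sun (~2×10³⁰ kg, which dominates the Solar System) takes only about 100 doublings, i.e. under a year at a 3-day doubling time.

```python
import math

doubling_time_days = 3.0
start_mass_kg = 1.0
solar_system_mass_kg = 2e30  # dominated by the Sun, roughly 2e30 kg

# Number of factor-of-2 steps needed, then total elapsed time.
doublings = math.log2(solar_system_mass_kg / start_mass_kg)  # about 100.7
days = doublings * doubling_time_days                        # about 302 days

print(f"{doublings:.1f} doublings, {days:.0f} days (~{days / 365:.1f} years)")
```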
Yes, the human brain was built using evolution; I have no disagreement that, given 100-1000 years of just tinkering etc., we would likely get AGI. It's just that in our specific case we have biology to copy, and it will get us there much faster.
Evolution is an argument that there is no barrier, even with very incompetent tinkerers that fail to figure things out (and don’t consider copying biology). So it doesn’t take an arbitrarily long time, and takes less with enough compute[1]. The 100-1000 years figure was about the fusion and macroscopic biotech milestone in the hypothetical of no general AI, which with general AI running at a higher speed becomes 0.1-10 years.
Moore’s Law of Mad Science: Every 18 months, the minimum IQ to destroy the world drops by one point.
Temporarily adopting this sort of model, in which AI capabilities can usefully be compared to human IQs:
With IQ 100 AGI (i.e. one that could do about the same fraction of tasks as well as a sample of IQ 100 humans), progress may well become hyperexponentially fast: but the lead-in to a hyperexponentially fast function could be very, very slow. The majority of even relatively incompetent humans in technical fields like AI development have greater than IQ 100. Eventually quantity may have a quality of its own, e.g. once there were very large numbers of these sub-par researcher-equivalents running faster than humans and coordinated better than I would expect average humans to be.
Absent enormous numerical or speed advantages, I wouldn’t expect substantial changes in research speed until something vaguely equivalent to IQ 160 or so.
Though in practice, I’m not sure that human measures of IQ are usefully applicable to estimating rates of AI-assisted research. They are not human, and only hindsight could tell what capabilities turn out to be the most useful to advancing research. A narrow tool along the lines of AlphaFold could turn out to be radically important to research rate without having anything that you could characterize as IQ. On the other hand, it may turn out that exceeding human research capabilities isn’t practically possible from any system pretrained on material steeped in existing human paradigms and ontology.
Perhaps thinking about IQ conflates two things: correctness and speed. For individual humans, these seem correlated: people with higher IQ are usually able to get more correct results, more quickly.
But the distinction becomes relevant when talking about groups of people: whether a group of average people is better than a genius depends on the nature of the task. The genius will be better at doing novel research. The group of normies will be better at doing lots of trivial paperwork.
Currently, the AIs seem comparable to having an army of normies on steroids.
The performance of a group of normies (literal or metaphorical) can sometimes be improved by error checking. For example, if you have them solve mathematical problems, they will probably make a lot of errors; adding more normies would allow you to solve more problems, but the fraction of correct solutions would remain the same. But if you give them instructions on how to verify the solutions, you can increase the correctness (at the cost of slowing them down somewhat). Similarly, an LLM can give me hallucinated solutions to math or programming problems, but that is less of a concern if I can verify the solutions in Lean or with unit tests and reject the incorrect ones; and who knows, maybe trying again will result in a better solution. (In a hypothetical extreme case, an army of monkeys with typewriters could produce Shakespeare, if we had a 100% reliable automatic verifier of their outputs.)
So it seems to me the question is how much we can compensate for the errors caused by "lower IQ". The answer determines how long we have to wait until the AIs become that intelligent.
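The verifier argument can be quantified with a toy calculation (assumptions mine): if each attempt is independently correct with probability p and a perfectly reliable verifier rejects wrong answers, then k attempts succeed with probability 1 − (1 − p)^k, so even quite weak solvers approach certainty with modest retry budgets.

```python
def success_prob(p: float, k: int) -> float:
    """Chance that at least one of k independent attempts passes a
    perfectly reliable verifier, given per-attempt accuracy p."""
    return 1 - (1 - p) ** k

# A "normie" solver that is right only 20% of the time:
print(success_prob(0.20, 1))   # 0.2
print(success_prob(0.20, 10))  # about 0.89
print(success_prob(0.20, 50))  # above 0.9999
```

The monkeys-with-typewriters case is the p → 0 limit: success still approaches 1, but the required k explodes, which is why verification compensates for "lower IQ" only at a (possibly enormous) cost in attempts.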
In fact, speed and accuracy in humans are at least somewhat mechanistically different.
More important is "size of a team of humans" vs "peak capabilities of a team of humans" (or maybe the sum of the cubes of their IQs?). A given person thinking faster than average is roughly equivalent to a multiplier on the number of people of exactly that intelligence you have. (Of course, if you could radically increase brain speed, I would expect IQ to increase rather than remain constant, but that isn't yet an option for humans.)
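Purely to make the comparison concrete, the suggestion above could be written down as a toy aggregate. Both rules here, the cube exponent and speed-as-head-count-multiplier, are the comment's speculative assumptions, not established psychometrics, and the function name is mine.

```python
def team_capability(members, exponent=3):
    """Toy aggregate capability of a team. Each member is an (iq, speed)
    pair: they contribute iq**exponent, and thinking s times faster counts
    as s copies of that member (both assumptions from the comment above)."""
    return sum(speed * iq ** exponent for iq, speed in members)

# Ten IQ-100 people at normal speed vs one IQ-160 person at normal speed:
normies = [(100, 1.0)] * 10
genius = [(160, 1.0)]
print(team_capability(normies))  # 10 * 100**3 = 10_000_000
print(team_capability(genius))   #  1 * 160**3 =  4_096_000
```

Under a cube rule, ten normies outweigh one genius; raise the exponent (i.e. make the task more "genius-dominated") and the ranking flips, which captures the paperwork-vs-novel-research distinction from the earlier comment.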