Imagine if evolution could talk. “Yes, humans are very intelligent, but surely they couldn’t create airplanes 50,000 times heavier than the biggest bird in only 1,000 years. Evolution takes millions of years, and even if you can speed up some parts of the process, other parts will necessarily remain slow.”
But maybe the most ambitious humans do not even consider waiting millions of years and making incremental improvements on million-year techniques. Instead, they treat any technique that takes a million years as a “deal breaker,” and only use techniques that work on a timescale of years. Yet humans are smart enough, and think fast enough, that even when they restrict themselves to these faster techniques, they can still eventually build an airplane, one much heavier than any bird.
Likewise, an AI which is smart enough and thinks fast enough might still eventually invent a smarter AI, one much smarter than itself, even when restricted to techniques which don’t require months of experimentation (analogous to evolution). Maybe just by training very small models very quickly, they can discover a ton of new technologies which can scale to large models. State-of-the-art small models (DeepSeek etc.) already outperform old large models. Maybe they can invent new architectures, new concepts, and who knows what.
In real life, there may be no sharp line between slow techniques and fast techniques, only a gradual transition from approaches that rely more on slow techniques to approaches that rely on them less.
This is valid, but doesn’t really engage with the specific arguments here. By definition, when we consider the potential for AI to accelerate the path to ASI, we are contemplating the capabilities of something that is not a full ASI. Today’s models have extremely jagged capabilities, with lots of holes, and (I would argue) they aren’t anywhere near exhibiting sophisticated high-level planning skills able to route around their own limitations. So the question becomes, what is the shape of the curve of AI filling in weak capabilities and/or developing sophisticated strategies for routing around those weaknesses?
“Maybe just by training very small models very quickly, they can discover a ton of new technologies which can scale to large models.”
This is exactly missing the point. Training a cutting-edge model today involves a broad range of activities, not all of which fall under the heading of “discovering technologies” or “improving algorithms” or whatever. I am arguing that if all you can do is find better algorithms rapidly, that’s valuable but it’s not going to speed up overall progress by very large factors. Also, it may be that “by training very small models very quickly”, the AI would discover new technologies that improve some aspects of models but fail to advance some other important aspects.
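To put rough numbers on that intuition, here is a quick Amdahl’s Law sketch (the fraction of work accelerated and the speedup factor below are made up purely for illustration, not estimates of any real training pipeline): if only part of the end-to-end work gets faster, the overall speedup is capped by the part that doesn’t.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is accelerated by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Illustrative numbers only: even a 100x boost to the "find better algorithms"
# portion yields a modest end-to-end gain if the remaining work (large training
# runs, data collection, integration) is left untouched.
print(amdahl_speedup(p=0.5, s=100))  # ~1.98x overall
print(amdahl_speedup(p=0.9, s=100))  # ~9.2x overall
```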
Yeah, sorry, I didn’t mean to argue that Amdahl’s Law and Hofstadter’s Law are irrelevant, or that things are unlikely to go slowly.
I see a big chance that it takes a long time, and that I end up saying you were right and I was wrong.
However, you say we are “contemplating the capabilities of something that is not a full ASI. Today’s models have extremely jagged capabilities, with lots of holes, and (I would argue) they aren’t anywhere near exhibiting sophisticated high-level planning skills able to route around their own limitations.”
That seems to apply to the 2027 “Superhuman coder” with its 5x speedup, but not to the “Superhuman AI researcher” with a 25x speedup or the “Superintelligent AI researcher” with 250x.
I think “routing around one’s own limitations” isn’t necessarily that sophisticated. Even blind evolution does it, by trying something else when one thing fails.
As long as the AIs are “smart enough,” even if they aren’t that superhuman, they have the potential to think many times faster than a human, with a “population” many times larger than that of human AI researchers. They can come up with many more testable ideas and test them all.
Maybe I’m missing the point, but it’s possible that we simply disagree on whether the point exists. You believe that merely discovering technologies and improving algorithms isn’t sufficient to build ASI, while I believe there is a big chance that doing that alone will be sufficient. After discovering new technologies from training smaller models, they may still need one or two large training runs to implement it all.
I’m not arguing that you don’t have good insights :)