The short version: running compute-optimal experiments to self-improve, training on tasks that unavoidably take a really long time to learn or get data on because real-world experimentation is necessary, combined with a potential hardware bottleneck on robotics that also requires real-world experimentation to overcome.
Another point: to the extent you buy the scaling hypothesis at all, compute bottlenecks will start to bite. Researchers under that pressure will chase small constant-factor improvements that don't generalize, and that can start a cascade of wrong decisions that takes a very long time to back out of.
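To make "compute bottlenecks bite" concrete, here is a rough sketch using the Chinchilla-style compute-optimal fit from the scaling-laws literature (Hoffmann et al. 2022); the specific functional form is imported for illustration and isn't something the comment above commits to:

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \qquad C \approx 6ND, \qquad N^{*} \propto C^{0.5}, \quad D^{*} \propto C^{0.5}.$$

Under a fit like this, each further constant decrement in the loss $L$ requires a multiplicative increase in total compute $C$, so once the cheap constant-factor tricks are exhausted, further progress is gated on hardware rather than on ideas.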
(My own opinion, stated without justification, is that LLMs are not a paradigm that can scale to ASI, but after some future AI paradigm shift, there will be very very little R&D separating “this type of AI can do anything importantly useful at all” and “full-blown superintelligence”. Like maybe dozens or hundreds of person-years, or whatever, as opposed to millions. More on this in a (hopefully) forthcoming post.)
I’d like to see that post, and in particular your arguments for why intelligence could be increased so quickly, conditional on a new paradigm shift.
(For what it’s worth, I personally think LLMs might not be the last paradigm, because of their current lack of continuous learning/neuroplasticity and their lack of long-term memory/state. But I don’t expect future paradigms to have an AlphaZero-like trajectory, where things go from zero to wildly superhuman in days or weeks. That said, I do think takeoff is faster if we condition on a new paradigm being required for ASI: I see the AGI transition plausibly leaving only months until we get superintelligence, and maybe only 1-2 years before superintelligence starts having very, very large physical impacts through robotics, assuming new paradigms are developed. So I’m closer to hundreds or thousands of person-years than dozens of person-years.)