This contradicts METR timelines, which, IMO, is the best piece of info we currently have for predicting when AGI will arrive.
I disagree. METR’s extrapolations don’t attempt to account for important factors such as intermediate speedups from pre-superhuman systems and possible superexponentiality. The METR folks agree with me that their paper is not trying to do what we are trying to do, i.e. predict timelines via a model that takes into account all relevant factors (though they don’t necessarily endorse our model’s outputs!). See https://blog.ai-futures.org/p/beyond-the-last-horizon and our full timelines supplement for more.
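To make the superexponentiality point concrete, here is a toy sketch (not our actual timelines model, and not METR’s method). It compares a plain exponential extrapolation, where the doubling time of the task horizon stays constant, against one where each successive doubling takes a fixed fraction of the time the previous one took. The function names and every number in it are hypothetical, chosen only for illustration.

```python
import math


def doublings_needed(start_horizon, target_horizon):
    """Number of horizon doublings needed to get from start to target."""
    return math.log2(target_horizon / start_horizon)


def years_exponential(start_horizon, target_horizon, doubling_time_years):
    """Constant doubling time: total time is just doublings * doubling time."""
    n = doublings_needed(start_horizon, target_horizon)
    return n * doubling_time_years


def years_superexponential(start_horizon, target_horizon,
                           first_doubling_years, shrink):
    """Each doubling takes `shrink` times as long as the previous one.

    Total time is the geometric series T * (1 - r**n) / (1 - r), with
    0 < shrink < 1. For a non-integer n this is a smooth approximation.
    """
    n = doublings_needed(start_horizon, target_horizon)
    return first_doubling_years * (1 - shrink ** n) / (1 - shrink)


# Hypothetical inputs, for illustration only (not estimates from any paper):
# horizon grows 1024x, first doubling takes 0.5 years, 10% shrink per doubling.
exp_years = years_exponential(1, 1024, 0.5)
superexp_years = years_superexponential(1, 1024, 0.5, 0.9)
print(f"exponential: {exp_years:.2f} years")
print(f"superexponential: {superexp_years:.2f} years")
```

Under these made-up inputs the superexponential path reaches the same target horizon noticeably sooner, which is why leaving this factor out of an extrapolation can matter for the resulting dates.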
A memory module that can be stored externally (on a hard drive) is handwaved as something that Just Works™. I don’t expect it to be so easy.
Yeah, I mean, it’s simplified a little in our story for brevity. Of course, in practice this advance will likely come after substantial experimentation rather than on the first try, and there will be iterative improvements on top of early prototypes.
As of today, there is no robust anti-hallucination/error correction mechanism for LLMs. It seems like another thing that is handwaved as something that Just Works™: just beat the neural net with the RLHF stick until the outputs look about right.
I view this as a continuous problem that is already getting better over time, and it’s valid to project this trend to continue. To be clear, we are not saying anywhere that techniques won’t change. Certainly the training won’t look like vanilla RLHF (it is already moving away from this); we expect training techniques to iteratively improve over time.