As an ML engineer, I think it’s plausible. I also think there are some other factors that could cushion or mitigate a slowdown. First, I think there is more low-hanging fruit available. Now that we’ve seen what large transformer models can do in the text domain, and in a text-to-image model like Dall-E, I think the obvious next step is to ingest large quantities of video data. We often talk about the sample inefficiency of modern methods compared with humans, but I think humans are exposed to a TON of sensory data in building their world model. Though if hardware really stalls, maybe there won’t be enough compute or budget to train a 1T+ parameter multimodal model.
The second mitigating factor, I think, may be that funding has already been unlocked to some extent. There is now a lot more money going around for basic research, possibly toward the next big thing. The only thing that might stop it is academic momentum in the wrong directions. Though from an x-risk standpoint, maybe that’s not a bad thing, heh.
In my mental model, if large transformer models are already good enough to do what we’ve seen them do, it seems possible that the remaining innovations will be more on the side of engineering the right submodules and cost functions. Maybe something along the lines of Yann LeCun’s recent keynotes.