I think we can fine-tune on GPU nicely (fine-tuning is similar to short training runs and results in long-term crystallized knowledge).
But I do agree that the rate of progress here depends on our progress in doing less uniform things faster. For example, there are signs of progress in parallelizing and accelerating tree processing (think of trees with labeled edges and numerical leaves, which are essentially flexible tensors), but this kind of progress is not mainstream or commonplace yet; one still has to look at rather obscure papers to find these accelerations of non-standard workloads.
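To make the "flexible tensor" idea concrete, here is a minimal Python sketch. It assumes such a tree can be modeled as nested dicts whose keys are edge labels and whose leaves are numbers; the names `flatten` and `add` are hypothetical and not taken from any of the papers alluded to. It shows only the data structure and an elementwise operation on it, not any of the acceleration techniques.

```python
from typing import Dict, Tuple, Union

# A tree is either a numerical leaf or a dict mapping edge labels to subtrees.
Tree = Union[float, Dict[str, "Tree"]]
Path = Tuple[str, ...]


def flatten(tree: Tree, prefix: Path = ()) -> Dict[Path, float]:
    """Map each path of edge labels to the numerical leaf it reaches."""
    if not isinstance(tree, dict):
        return {prefix: float(tree)}
    flat: Dict[Path, float] = {}
    for label, child in tree.items():
        flat.update(flatten(child, prefix + (label,)))
    return flat


def add(a: Tree, b: Tree) -> Dict[Path, float]:
    """Elementwise sum; a path missing from one tree is treated as 0."""
    fa, fb = flatten(a), flatten(b)
    return {p: fa.get(p, 0.0) + fb.get(p, 0.0) for p in fa.keys() | fb.keys()}


# Unlike a dense tensor, branches may have different depths and shapes.
x = {"row0": {"col0": 1.0, "col1": 2.0}, "row1": {"col0": 3.0}}
y = {"row0": {"col1": 10.0}, "row2": {"deep": {"leaf": 5.0}}}
total = add(x, y)
```

A path of edge labels plays the role of a tensor index, except that different branches can have different depths and key sets. That irregularity is exactly what makes such workloads harder to batch on a GPU than dense tensors.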
I think this will be achieved, in part because I expect less of the “winner takes all” dynamics that the field of AI currently has. Transformers lead right now, so (almost) all eyes are on Transformers, and other efforts attract less attention and fewer resources. With artificial AI researchers not excessively burdened by human motivations of career and prestige, one would expect better coverage of all possible directions of progress and less crowding around “the winner of the day”.