Mmm… if we are not talking about full automation, but about being helpful, the ability to do 1-hour software engineering tasks (“train classifier”) is already useful.
Moreover, we have seen a recent flood of rather inexpensive fine-tunings of reasoning models for particular benchmarks.
Perhaps one can perform a (somewhat more expensive, but still not too difficult) fine-tuning to create a model that helps with a particular, relatively narrow class of meaningful problems (more general than tuning for a particular benchmark, but still reasonably narrow). So, instead of just using an off-the-shelf assistant, one should be able to upgrade it to a specialized one.
For example, I am sure that it is possible to create a model which would be quite helpful with a lot of mechanistic interpretability research.
So if we are talking about when AIs can start automating or helping with research, the answer is, I think, “now”.