An interesting development is the development of synthetic data. This is also a sort of algorithmic improvement, because the data is generated by algorithms. For example in the verify step by step paper there is a combination of synthetic data and human labelling.
At first this seemed counter intuitive to me. The current model is being used to create data for the next model. Feels like bootstrapping. But it starts to make sense now. Better prompting (like CoT or ToT) is a method to get better data or a second model that is trained to pick the best answers from a thousand and that will get you data good enough to improve the model.
Demis Hassabis said in his interview with Lex Fridman that they used synthetic data when developing AlphaFold. They had some output of AlphaFold that they had great confidence in. Then they fed the output as input and the model improved (this gives your more data with great confidence, repeat).
An interesting development is the development of synthetic data. This is also a sort of algorithmic improvement, because the data is generated by algorithms. For example in the verify step by step paper there is a combination of synthetic data and human labelling.
At first this seemed counter intuitive to me. The current model is being used to create data for the next model. Feels like bootstrapping. But it starts to make sense now. Better prompting (like CoT or ToT) is a method to get better data or a second model that is trained to pick the best answers from a thousand and that will get you data good enough to improve the model.
Demis Hassabis said in his interview with Lex Fridman that they used synthetic data when developing AlphaFold. They had some output of AlphaFold that they had great confidence in. Then they fed the output as input and the model improved (this gives your more data with great confidence, repeat).