We’re running some follow-up experiments on this now. We have preliminary results showing that conducting filtered + positive-upsampled midtraining on a model trained on an unfiltered pretraining dataset has similar effects to our results from training on a filtered pretraining dataset. But this isn’t a perfectly clean comparison, so we’re now running unfiltered pretraining and unfiltered midtraining + positive synthetic documents.
Cool ty! Do you plan to post updates in a new post / a paper based on this?
Thanks for the interest! We plan to publish the “main release” of our paper on arXiv in the coming weeks. This release will include several new experiments and revisions based on the excellent community feedback we’ve received.
Update on this point: we’ve found that when conducting alignment pretraining on our unfiltered datasets we observe equal or improved results compared to adding positive synthetic data to our filtered pretraining datasets. These results seem akin to the findings from “When bad data leads to good models”. We plan to release a proper update within the next week.
This is a positive update for us wrt the ease with which alignment pretraining can be implemented.