Update on this point: we've found that conducting alignment pretraining on our unfiltered datasets yields equal or improved results compared to adding positive synthetic data to our filtered pretraining datasets. These results seem consistent with the findings of "When bad data leads to good models". We plan to release a proper update within the next week.
This is a positive update for us with regard to the ease with which alignment pretraining can be implemented.