I’d be interested in understanding the discrepancy between
“Figure 4: Pretraining effects persist through post-training” from this blog post
and “Figure 5: Pretraining effects persist through post-training” from the PDF
for the unfilt + misalignment bar.
Thanks!
We used end-to-end training for the main draft! There were also some slight changes in prompting between the PDF and the arXiv version to reduce ordering bias.