In the small but growing literature on synthetic document finetuning (SDF), it's typical to finetune "post-trained" models on synthetic facts (see Alignment faking, Wang et al., Lessons from Two-Hop Reasoning)
To me, the more natural thing is “synthetic continued pretraining”—further training the base model on synthetic documents (mixed with pretraining data), then applying post-training techniques
(I thought this was the approach used in Auditing language models for hidden objectives, but nvm: they apply SDF to a post-trained model, then apply further post-training.)

I'm sort of confused why more papers aren't doing synthetic continued pretraining. I suspect it's some combination of (a) finetuning post-trained models is easier and (b) people have tried both and it doesn't make much of a difference.
But if it's mostly (a) and not really (b), this would be useful to know (and implies people should explore (b) more!)
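To make that concrete, here's a rough sketch of the recipe I have in mind, assuming a Hugging Face-style stack. The model, the wikitext mix-in, the synthetic_docs.jsonl path, the 10/90 mixing ratio, and the hyperparameters are all placeholders I made up for illustration, not details from any of the papers above.

```python
# Stage 1 of the recipe: continue pretraining a *base* model on synthetic
# documents mixed into ordinary pretraining-style text. Post-training (SFT,
# RLHF, etc.) would then be applied on top of the resulting checkpoint.
from datasets import load_dataset, interleave_datasets
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "meta-llama/Meta-Llama-3-8B"  # base model, not the -Instruct variant

tok = AutoTokenizer.from_pretrained(BASE)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

# Synthetic documents (one {"text": ...} record per line, path is a placeholder)
# mixed with generic pretraining-style text; the 10%/90% ratio is made up.
synthetic = load_dataset("json", data_files="synthetic_docs.jsonl", split="train")
generic = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")
generic = generic.filter(lambda r: r["text"].strip() != "")
mixed = interleave_datasets([synthetic, generic], probabilities=[0.1, 0.9], seed=0)

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=1024)

mixed = mixed.map(tokenize, batched=True, remove_columns=mixed.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama3-syn-cpt",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=1e-5, bf16=True),
    train_dataset=mixed,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
# Stage 2 (not shown): run the usual post-training pipeline (instruction
# tuning, RLHF/DPO, ...) starting from "llama3-syn-cpt" instead of the
# original base checkpoint.
```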
It’s a cost and convenience thing. See discussion here for some ideas for better methods that are also cheap.
If you have to repeat the entire post-training all the way from a base model, that is obviously a lot more work than just adding a small fine-tuning stage to an already post-trained model.
The full post-training can also only really be done by a big lab that has its own full post-training stack. Post-training is getting more and more advanced and complicated every month.
Yeah, but it's plausible this cost is worth paying if the effect size is large enough (and there are various open-source instruction-tuning datasets which might reasonably recover e.g. Llama-3-Instruct)
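Roughly, something like the sketch below, applied to the checkpoint from the continued-pretraining step. Alpaca and the prompt template here are just illustrative choices; whether this actually recovers Llama-3-Instruct-level behaviour is exactly the open question.

```python
# Rough sketch: supervised finetuning of the continued-pretrained checkpoint
# ("llama3-syn-cpt" from the sketch above, a placeholder name) on an open
# instruction dataset, using Alpaca purely as an example.
from datasets import load_dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("llama3-syn-cpt")
tok.pad_token = tok.eos_token

alpaca = load_dataset("tatsu-lab/alpaca", split="train")

def to_text(row):
    # Simple Alpaca-style template; a serious attempt to recover an instruct
    # model would match the target model's actual chat template.
    prompt = row["instruction"] + ("\n" + row["input"] if row["input"] else "")
    return {"text": f"### Instruction:\n{prompt}\n\n### Response:\n{row['output']}{tok.eos_token}"}

sft = alpaca.map(to_text, remove_columns=["instruction", "input", "output"])
# Tokenize and train exactly as in the continued-pretraining sketch above,
# just starting from the "llama3-syn-cpt" checkpoint instead of the raw base.
```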
Yeah, it could be worth it in some cases, if that is what you need for your experiment. In that case I would look for a completely open-source LLM project (where both the code and data are open), so that you know you are comparing apples to apples-with-your-additional-pretraining.