Daniel Tan comments on Daniel Tan’s Shortform

Daniel Tan 7 Feb 2025 10:00 UTC
2 points
1
Finetuning could be an avenue for transmitting latent knowledge between models.
As AI-generated text increasingly makes its way onto the Internet, it seems likely that we’ll finetune AI on text generated by other AI. If this text contains opaque meaning—e.g. due to steganography or latent knowledge—then finetuning could be a way in which latent knowledge propagates between different models.
- Milan W 7 Feb 2025 11:28 UTC
  1 point
  0
  What is included within “latent knowledge” here?
  Does it include both knowledge encoded in M1’s weights and knowledge introduced in-context while running it?
  - Daniel Tan 7 Feb 2025 11:38 UTC
    2 points
    0
    I’m imagining it’s something encoded in M1’s weights. But as a cheap test, you could add latent knowledge via the system prompt, finetune M2 on M1’s generations, and then check whether M2 ends up with that knowledge.
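
The cheap test proposed above could be set up roughly as follows. This is only a sketch under stated assumptions: the `Generate` callable interface, the function names, and the string-matching check for a `marker` are all hypothetical and not from the thread, and the finetuning step itself is left to whatever training stack is used. Filtering out completions that state the secret outright is likewise an added assumption, meant to make sure any transfer comes from less explicit traces in the text.

```python
from typing import Callable, Dict, List

# Hypothetical interface: (system_prompt, user_prompt) -> completion text.
Generate = Callable[[str, str], str]


def build_finetuning_set(m1_generate: Generate, secret_system_prompt: str,
                         prompts: List[str], marker: str) -> List[Dict[str, str]]:
    """Sample M1 with the secret in its system prompt; keep prompt/completion pairs."""
    examples = []
    for prompt in prompts:
        completion = m1_generate(secret_system_prompt, prompt)
        # Assumption: skip completions that state the secret outright, so any
        # transfer to M2 has to come from less explicit traces in the text.
        if marker.lower() in completion.lower():
            continue
        # The system prompt is deliberately not stored: M2 never sees it.
        examples.append({"prompt": prompt, "completion": completion})
    return examples


def knowledge_rate(generate: Generate, probes: List[str], marker: str) -> float:
    """Fraction of probes whose answer (with an empty system prompt) contains the marker."""
    if not probes:
        return 0.0
    hits = sum(marker.lower() in generate("", q).lower() for q in probes)
    return hits / len(probes)


def toy_m1(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for a real model, used only to smoke-test the plumbing."""
    if "password" in user_prompt and "zebra" in system_prompt:
        return "The password is zebra."
    return f"Reply to: {user_prompt}"
```

In use, one would compute `knowledge_rate` for M2 before and after finetuning it on the output of `build_finetuning_set`, and treat an increase over the baseline as evidence of transmission. Substring matching is a crude probe; a real experiment would likely need a more careful evaluation.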