Charlie Steiner comments on It’s hard to make scheming evals look realistic for LLMs

Charlie Steiner 24 May 2025 20:10 UTC
9 points
−1
I wonder if there’s some accidental steganography: if you use an LLM to rewrite the shorter scenario, it may have “this is a test” features active while doing so, nudging the text towards sounding like a test.
- Igor Ivanov 28 May 2025 15:26 UTC
  1 point
  0
  Hm. I’m unsure why an LLM would do steganography in this case.
  - Martin Randall 6 Aug 2025 23:27 UTC
    3 points
    0
    The subliminal owls result? I know this isn’t distillation but there could be the same effect on prompts.
    - Igor Ivanov 6 Aug 2025 23:46 UTC
      1 point
      0
      Might be