kave comments on Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data

kave 2 Aug 2025 0:13 UTC
10 points
Curated. I thought this was a pretty interesting result. I’m not sure whether I should have been surprised by it, but I was. The authors also point to a decent amount of interesting follow-up work, though I expect that work to generalise less well than “existence proof” papers like this one.
I often enjoy this group’s work finding interesting “info leaks” in LLM behaviour, like their previous lie detector work.