frankyaoxiao
Karma: 13
Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training
Santiago Aranguri and frankyaoxiao
29 Apr 2026 19:30 UTC
16 points, 0 comments, 13 min read