Punya Syon Pandey
Karma: 18
Explaining undesirable model behavior: (How) can influence functions help?
Zhijing Jin, TerryJCZhang and Punya Syon Pandey
2 Mar 2026 11:30 UTC, 18 points, 0 comments, 3 min read
Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability
Zhijing Jin, Punya Syon Pandey, samuelsimko and Kellin Pelrine
11 Jun 2025 19:30 UTC, 6 points, 0 comments, 5 min read