Ziqian Zhong
Karma: 69
I do technical AI interp and safety research.
ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents
Ziqian Zhong · 30 Oct 2025 2:52 UTC · 60 points · 5 comments · 3 min read · LW link (arxiv.org)
Weight-diff SVD for LLM Monitoring
Ziqian Zhong · 5 Aug 2025 0:31 UTC · 2 points · 0 comments · 2 min read · LW link (arxiv.org)