Nathan Hu
Karma: 7
Training on Documents About Reward Hacking Induces Reward Hacking
evhub and Nathan Hu
21 Jan 2025 21:32 UTC, 131 points, 15 comments, 2 min read
LW link (alignment.anthropic.com)