Carson Denison
Karma: 286
I work on deceptive alignment and reward hacking at Anthropic.
Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
evhub, Nicholas Schiefer, Carson Denison and Ethan Perez
8 Aug 2023 1:30 UTC
291 points, 24 comments, 18 min read
[Question] How do I Optimize Team-Matching at Google
Carson Denison
24 Feb 2022 22:10 UTC
8 points, 1 comment, 1 min read