Carson Denison

I work on deceptive alignment and reward hacking at Anthropic.

Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research

8 Aug 2023 1:30 UTC
[Question] How do I Optimize Team-Matching at Google

Carson Denison 24 Feb 2022 22:10 UTC
