evhub (Evan Hubinger)

Karma: 11,248

Evan Hubinger (he/him/his) (evanjhub@gmail.com)

I am a research scientist at Anthropic, where I lead the Alignment Stress-Testing team. My posts and comments are my own and do not represent Anthropic’s positions, policies, strategies, or opinions.

Previously: MIRI, OpenAI

See: “Why I’m joining Anthropic”

Selected work:

How to train your own “Sleeper Agents”

evhub, 7 Feb 2024 0:31 UTC
88 points, 4 comments, 1 min read, LW link