Evan Hubinger (he/him/his) (evanjhub@gmail.com)
I am a research scientist at Anthropic where I lead the Alignment Stress-Testing team. My posts and comments are my own and do not represent Anthropic’s positions, policies, strategies, or opinions.
Previously: MIRI, OpenAI
See: “Why I’m joining Anthropic”
Selected work:
I still really like my framework here! I think this post ended up popularizing some of the ontology I developed here, but the unfortunate thing about that post as the one that popularized this is that it doesn’t really provide an alternative.