RSS

Ethan Perez(Ethan Perez)

Karma: 2,194

I’m a research scientist at Anthropic doing empirical safety research on language models. In the past, I’ve worked on automated red teaming of language models [1], the inverse scaling prize [2], learning from human feedback [3][4], and empirically testing debate [5][6], iterated amplification [7], and other methods [8] for scalably supervising AI systems as they become more capable.

Website: https://​​ethanperez.net/​​