Seth Herd

Karma: 153

I did computational cognitive neuroscience research from getting my PhD in 2006 until the end of 2022. I’ve worked on a bunch of brain systems, focusing on the emergent interactions that are needed to explain complex thought. I became increasingly concerned with AGI applications of the research, and reluctant to publish my best ideas. I’m incredibly excited to now be working directly on alignment, currently with generous funding from the Astera Institute. More info and publication list here.

Human preferences as RL critic values - implications for alignment

Seth Herd, 14 Mar 2023 22:10 UTC
10 points
4 comments, 6 min read