Just left Vivek Hebbar’s team at MIRI, now doing various empirical alignment projects.
I’m looking for projects in interpretability, activation engineering, and control/oversight; DM me if you’re interested in working with me.