[APPRENTICE] I’d be interested in starting to work in AI alignment, and aligning language models sounds particularly interesting.
I’m an incoming new-grad SWE at Google with a degree in Statistics and Data Science, to give an idea of my background and skill set. I have a modest amount of experience with machine/deep learning, mainly through coursework and a few personal projects (I’ve taken and TA’d a class on Natural Language Processing, taken a survey Deep Learning class, and written my undergraduate thesis on applying neural networks to causal inference problems). I’ve read a lot of the Alignment Forum and LessWrong sequences on AI alignment (Iterated Amplification, Value Learning, etc.) and would like to move into AI capability and safety research at some point in my career.
I may not have the right background or enough experience to be a useful apprentice for you, but if you’d like to get in touch, you can reach me at anshuman.radhakrishnan@gmail.com.
[MENTOR] Research on aligning language models. This includes developing strategies for:
Training language models to answer questions correctly, even when we don’t have access to the correct answer during training (as in my past work on training models to debate or decomposing questions into subquestions)
Learning from human feedback (e.g., from crowdworkers or upvotes) to generate helpful/truthful text
Evaluating how aligned language models are (e.g., by studying how well they generalize), and evaluating the effectiveness of different alignment techniques (like generating subquestions)
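To make the second strategy above concrete, here is a minimal sketch of learning from pairwise human feedback, assuming a Bradley–Terry preference model over a toy hand-rolled featurization (the features, data, and hyperparameters are all illustrative assumptions, not the mentor's actual setup):

```python
# Sketch: fit a linear reward model on pairwise human comparisons
# (e.g., from crowdworkers or upvotes) via the Bradley-Terry likelihood.
# Everything here is a toy illustration, not a real alignment pipeline.
import math


def features(response):
    # Toy featurization (an assumption): response length in words,
    # scaled down, plus a count of hedging words.
    words = response.split()
    hedges = sum(w in {"maybe", "possibly"} for w in words)
    return [len(words) / 10.0, hedges]


def reward(w, x):
    # Linear reward: dot product of weights and features.
    return sum(wi * xi for wi, xi in zip(w, x))


def train_reward_model(comparisons, lr=0.5, steps=200):
    # comparisons: list of (preferred_response, rejected_response) pairs.
    w = [0.0, 0.0]
    for _ in range(steps):
        for good, bad in comparisons:
            xg, xb = features(good), features(bad)
            # P(good preferred over bad) = sigmoid(r_good - r_bad)
            p = 1.0 / (1.0 + math.exp(reward(w, xb) - reward(w, xg)))
            # Gradient ascent on the log-likelihood of the observed preference.
            for i in range(len(w)):
                w[i] += lr * (1.0 - p) * (xg[i] - xb[i])
    return w


comparisons = [
    ("a detailed helpful answer with many words here", "short"),
    ("another long and informative reply with detail", "nope"),
]
w = train_reward_model(comparisons)
```

In a real RLHF setup the featurizer would be a language model and the reward model would then be used to fine-tune the generator; this sketch only shows the preference-fitting step.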
[APPRENTICE] I’m interested in this. I may be overcommitted at the moment, but I would really enjoy an intro chat.