Hey Abram! I appreciate the post. We’ve talked about this at length, but it was still really useful to get this feedback and re-summarization of the thoughts you shared with me. I’ve written up notes and will do my best to incorporate what you’ve shared into the tools I’m working on.

Since we last spoke, I’ve been focusing on technical alignment research, but I’ll be dedicating much more time to LLMs for alignment research in the coming months.

For anyone reading this: if you are a great safety-minded software engineer and want to help make this vision a reality, please reach out to me. I need all the help I can get to implement this stuff much faster. I’m currently consolidating my notes from what I’ve read, interviews with other alignment researchers, and my own thoughts on what I’d find useful in my research. I’m happy to share those notes with anyone who wants to know more about what I have in mind and potentially contribute.