What I’ll be doing at MIRI

Note: This is a personal post describing my own plans, not a post with actual research content.

Having finished my internship working with Paul Christiano and others at OpenAI, I'll be moving to MIRI to continue my research. I've decided to do research at MIRI because I believe it will be the place where I can most easily continue that research in the near future. That being said, there are a couple of aspects of what I'll be doing at MIRI that I think are worth being explicit about.

First, and most importantly, this decision does not represent any substantive change in my beliefs regarding AI safety. In particular, my research continues to be focused on solving inner alignment for amplification. My post on relaxed adversarial training continues to represent a fairly up-to-date picture of what I think needs to be done along these lines.

Second, my research will remain public by default. I have discussed with MIRI their decision to make their research non-disclosed-by-default and we agreed that my research agenda is a reasonable exception. I strongly believe in the importance of collaborating with both the AI safety and machine learning communities and thus believe in the need for sharing research. Of course, I also fully believe in the importance of carefully reviewing possible harmful effects from publishing before disclosing results—and will continue to do so with all of my research—though I will attempt to publish anything I don’t believe to pose a meaningful risk.

Third—and this should go without saying—I fully anticipate continuing to collaborate with other researchers at other institutions such as OpenAI, Ought, CHAI, DeepMind, FHI, etc. The task of making AGI safe is a huge endeavor that I fully believe will require the joint work of an entire field. If you are interested in working with me on anything (regarding inner alignment or anything else) please don’t hesitate to send me an email at evanjhub@gmail.com.