Suppose we get to your scenario where we have basically-aligned automated researchers (but haven’t somehow solved the whole alignment problem along the way). What’s your take on the “people will want to use automated researchers to create smarter, dangerous AI rather than using them to improve alignment” issue? Is your hope that automated researchers will be developed in one leading organization that isn’t embroiled in a race to the bottom, and that that org will make a unified pivot to alignment work?