Thanks a lot for this article, Neel! As a recent Bachelor’s graduate embarking on this journey into AI Safety research on my own, I found this very timely, especially the steps, mindsets, and resources it shares.
My core question revolves around “Stage 0”, even before starting to learn the basics. How much time would you recommend dedicating to exploring and deciding on a specific research agenda (like mechanistic interpretability vs others) before diving into learning the basics required for research in that sub-field?
My concern is that the learning path and everything that follows depends heavily on the chosen agenda, and committing to a direction inevitably involves a significant time investment. So, for someone dedicating themselves full-time to AI Safety research from day 1, should this initial agenda exploration and prioritisation take hours, days, or weeks?
(Assuming a foundational understanding of the alignment problem, e.g., from the BlueDot course and other key resources.)
This resonates. I have a tendency to read and think too much about whether what I’m doing is the right thing (e.g., I keep following links until I find an article that says “you should do this instead”, think “hmm, maybe I should”, and then find the next piece of advice that convinces me otherwise). In the end, of course, I get much less done (and learn much less) than I would have if I had just tried something out.
Thanks for suggesting this shift in perspective towards action.