I wish you had named ambitious AI safety research directions that you’re glad (at least ex ante) that people have pursued; it’s hard for me to otherwise know what I think of this.
I am glad ex ante about ARC’s work, which counts. I think mech interp was probably pretty bad for the AI safety field overall (because of opportunity cost of people who worked on it), and I think its vibe of ambitiousness probably made the research go substantially worse (e.g. I wish people had either had the pragmatic interpretability attitude earlier, or that they’d been much more careful when making claims about their progress on ambitious goals). I can’t think of ambitious research directions that I wish were more strongly incentivized by the AI safety community; I think LWers are overall too enthusiastic and credulous about novel ambitious research directions.
Unfortunately, job and funding incentives are biased against research ambition.
I agree with job incentives (because many jobs are at AI companies that end up pushing people to contribute to the problems that are currently important), but I don’t really agree re funding incentives.
There are two (overlapping) groups that I wanted to thank and incentivise which I bundled together in the OP. In hindsight I probably should have separated them out.
1. Those that pursue research because they think it’s valuable and are forgoing big bucks and external credibility. This could be ambitious or incremental work.
2. Those that aim to solve fundamental problems that would lead to step changes in our ability to control/guide AI.
Point 1 is uncontroversial. 2 is cruxy to the extent that people disagree about the expected value safety-wise of ambitious and incremental work.
My guess is that I think E(ambitious) - E(incremental) is larger than you do. Miscellaneous grab bag of ambitious work that I think is or was high EV:
- some mech interp (fwiw I’d call work that e.g. applies SAEs to a new problem or tries to make minor improvements to them incremental research rather than ambitious)
- deep learning theory
- agent foundations
- Steven Byrnes brain-like AGI stuff
Then yeah, I also claimed that jobs and funding are biased against ambitious research. As you mentioned, the case for jobs is clear. Re funding, I think it’s very hard to evaluate ambitious proposals that typically don’t have good short-term milestones. It at least seems like CG are actively trying to overcome that. More varied funding sources would help too.
I think I’m glad Steven Byrnes is doing what he’s doing, but notably he is in fact funded to do it! I don’t know how hard it was for him to get funding.
I don’t really feel excited for people being more encouraged to do deep learning theory or agent foundations. I do appreciate that people tried out agent foundations back in the day.
Oh, I didn’t mean to imply that getting funded at all for ambitious research wasn’t possible or was extremely difficult. The directions I mentioned above are all funded to some extent. Just that e.g. I expect Steven could get 10x+ financial compensation by joining a lab and doing less ambitious AI safety research (or, with less confidence, get more compensation and credit by starting a for-profit or even a non-profit doing less ambitious work). And that people making these sacrifices should be lauded.