Note that although my views are much closer to Paul’s than to Pushmeet’s here, I’m posting this because I found it a useful summary of some ML perspectives and disagreements on AI safety, not because I’m endorsing the claims above.
Some disagreements that especially jumped out at me: I’d treat it as a negative update if I learned that AI progress across the board had sped up, and I wouldn’t agree with “even absent the actions of the longtermists, there’s a reasonably good chance that everything would just be totally fine”.
It seems to me if you’re someone who has done a PhD in ML or is very good at ML, but you currently can’t get a position that seems especially safety-focused or that is going to disproportionately affect safety more than capabilities, it is probably still good to take a job that just advances AI in general, mostly because you’ll be reaching the cutting edge potentially of what’s going on and improving your career capital a lot and having relevant understanding.
(The following is an off-the-cuff addition that occurred to me while reading this—it’s something I’ve thought about frequently, but it’s intended as something to chew on, not as an endorsement or disavowal of any specific recommendation by Rob W or Paul above.)
The cobbled-together model of Eliezer in my head wants to say something like: ‘In the Adequate World, the foremost thing in everyone’s heads is “I at least won’t destroy the world by my own hands”, because that’s the bare-minimum policy each individual would want everyone else to follow. This should probably also be in the back of everyone’s heads in the real world, at least as a weight on the scale and a thought that’s fine and not-arrogant to factor in.’
I feel pretty confident about “this is a line of thinking that’s reasonable and healthy to be able to entertain, alongside lots of other complicated case-by-case factors that all need to be weighed by each actor”, and then I don’t know how to translate that into concrete recommendations for arbitrary LW users.
I legitimately can’t tell: is that meant to be an argument against taking a job that advances AI in general?
No. This is maybe clearer given the parenthetical I edited in. Speaking for myself, Critch’s recommendations in https://www.lesswrong.com/posts/7uJnA3XDpTgemRH2c/critch-on-career-advice-for-junior-ai-x-risk-concerned seemed broadly reasonable to me, though I’m uncertain about those too and I don’t know of a ‘MIRI consensus view’ on Critch’s suggestions.