I spent a bit of time (like, 10 min) thinking through warning shots today.
I definitely do not think anyone should take actions that specifically cause warning shots to happen. (If you are trying to do something like that, you should be looking for “a scary demo”, not “a warning shot”. Scary demos can/should be demoed ethically.)
If you know of a concrete safety intervention that’d save lives, obviously do the safety intervention.
But, a lot of the questions here are less like “should I do this intervention?” and more like “should I invest years of my life researching a direction that helps found a new subfield, which maybe will result in concrete useful things that save some lives locally, but which I also expect to paper over problems and cost more lives later?” (when, meanwhile, there are tons of other research directions you could explore).
...yes, there is something sus here that I am still confused about, but, given the amount of cluelessness that necessarily involves, I don’t think people have an obligation to go found new research subfields if their current overall guess is “useful locally but harmful globally.”
I think if you go and try to suppress research into things that you think are moderately likely to save some lives a few years down the line but cost more lives later, then we’re back into ethically fraught territory (but, like, also, you shouldn’t suppress people saying “guys, this research line is maybe going to increase the odds of everyone dying on net”).
I didn’t actually get to having a new crystallized take, that was all basically my background thoughts from earlier.
(Also, hopefully obviously: when you are deciding your research path, or arguing people should abandon one, you do have to actually do the work to make an informed argument for whether/how bad any of the effects are. “It’s plausible X might lead to a warning shot that helps”, “it’s plausible Y might help on net with alignment subproblems”, and “Y might save a moderate number of lives” are all claims you need to unpack and actually reason through.)