It seems to me that you’re making a bucket error between “someone trying to murder is on the other side of a conflict” and “someone trying to murder will persistently keep trying to murder even if they obtain the things they say they want right now, because they at-least-somewhat-terminally value murder”. Perhaps there are ways to say things in language that is connotationally sedate and does not attribute the behavior you want to stop to an identity feature that the accused person is likely to internalize? For example, if someone is told “by participating in [x common thing], you are a murderer”, they seem more likely to consider themselves morally licensed to do other things that angry people online call murder. The argument isn’t “they’re not doing something that should be stopped”, which is what I see you as responding to; it’s “when accusing them of bad behavior, try not to write to their identity slot, or to others’ identity slot for that person”.
“Mass manslaughterer” seems more accurate than “mass murderer” anyway, and might be lower on the scale OP describes.
Pushing toward ASI isn’t actually a common thing; only a tiny fraction of people are doing it. I think it’s unlikely, but if my words caused someone to quit pushing toward ASI while feeling morally licensed to do other bad things, I’d still consider that a win, since pushing toward ASI is one of the worst things a person can do. I also think there are people out there who at-least-somewhat-terminally value murder but are prevented from committing murder by moral disapprobation and threat of punishment, so it’s important not to push those tools outside the Overton window merely for fear of causing bad vibes.
I’m more worried about the case where your words not only fail to stop them from pushing toward ASI, but also make them feel that, since you intend for them to consider themselves a bad person for doing it, they should think of themselves as intentionally evil and take other intentionally evil actions; or where someone else who would have tried to talk them out of it now thinks of them as impossible to talk to, and treats them as beyond the reach of communication and request. There’s a space between “say they’re doing a bad thing” and “say they’re inherently the kind of person who is inclined to do bad things”, or “say they’re impossible to pressure via ordinary means and must be pressured via unusual means”. I’m not asking you to say that what they’re doing is fine, and I would understand if the split that those-who-agree-with-OP are asking you to make wasn’t one you were previously treating as notable.
Agreed. Cf. https://www.lesswrong.com/posts/CYTwRZtrhHuYf7QYu/a-case-for-courage-when-speaking-of-ai-danger?commentId=pLH6dxnTrTz56BQYj
Also, for the record, I’d volunteer my time to talk with anyone who is currently doing capabilities research at an AI research group, or who is seriously considering doing so, and try to explain why they shouldn’t, but in a way that is kind, open, and understanding. (I don’t think I have a legible track record of doing this, but I would unaccountably claim that there’s a significant chance I’d be good at it for a substantial subset of such people.)
There’s this one Upton Sinclair quote I think about a lot in this context. I imagine you’ve seen it?
I wrote the wiki entry on it :) https://www.lesswrong.com/w/sinclair-s-razor
Chewing it over more, I think you may have neglected to consider Newcomblike self-deception as a possible factor in Sinclair’s razor. It’s not necessary for the person to be lying about what they believe, or for them to have consciously convinced themselves of that lie. They can just have a big convenient cognitive blind spot.
https://www.lesswrong.com/posts/Ht4JZtxngKwuQ7cDC/tsvibt-s-shortform?commentId=WsrmFhuysbmJ3xiPm
Well, I meant that to be included under “has really convinced themselves”, where you’re proposing that they could have convinced themselves unconsciously. (Which I agree happens, via a bunch of little ugh fields and piecemeal distorted-world construction.)
Feel free to make an edit to clarify, though; it’s a wiki!