Can I ask a few things that might clarify that last part?
Do you just not get somewhat impulsive or (non-clinically) intrusive thoughts when pondering a situation (any situation) where things go bad? Like, in cases where things are (partially) outside your control, where your mind tries to come up with myriad solutions to the problem regardless, some of which are less than tasteful?
Because I suspect part of what the journalists are asking you is “How do you not fall into a pit of depression?” but a plausible other part is “How do you not end up taking actions well outside the Overton window in an attempt to ‘make a desperate effort’?”
I think this is what the people who accuse you of incentivizing violence are actually thinking. Part of it is operating (as Zvi calls it) at the wrong simulacra levels, where statements about the state of reality are taken not literally but as a call to action. But I think part of it is “If I were convinced the barbarians were at our gates, I’d be fighting as many of them off as I can. But this man is convinced that the barbarians are in fact at our gates. Why isn’t he picking up a sword?”
If the answer is “I’m not convinced doing so would actually make an appreciable dent, and it would come at the cost of my soul,” then frankly that should probably be part of your answer to those journalists.
(Apologies if it already is, I haven’t kept up with all your recent media appearances.)
It cannot be answered that simply to the Earthlings, because if you answer “Because I don’t expect that to actually work or help”, some of them and especially the more evil ones will pounce in reply, “Aha, so you’re not replying, ‘I’d never do that because it would be wrong and against the law’, what a terrible person you must be!”
I don’t have a lot of experience with PR, but it feels like you could still make this work if you emphasize the immorality first. If you can clarify the journalist’s question enough for them to say something like “Most other people would be doing something drastic or crazy or evil in this situation,” you could respond with something like:
“Okay, let’s say I decide to [do something drastic or evil]. I’d hate doing it and I’d immediately turn into a villain, but fine. What happens next? [Break down a likely scenario showing how it wouldn’t work.] So I’d have turned myself evil, triggered a backlash against AI Safety, and left us in a worse position than we started in. It’s not worth it.”
I don’t think doing so would even be dishonest. You’ve argued for people to be careful with utilitarianism, to take a half-step towards it and then stop because we’re running on corrupted hardware that makes it tempting to engage in motivated reasoning. This feels a lot like compensating for that.