I don't have much experience with PR, but it feels like you could still make this work if you emphasize the immortality first. If you can clarify the journalist's question enough for them to say something like "Most other people would be doing something drastic or crazy or evil in this situation," you could respond with something like:
"Okay, let's say I decide to [do something drastic or evil]. I'd hate doing it and I'd immediately turn into a villain, but fine. What happens next? [Break down a likely scenario showing how it wouldn't work.] So I'd have made myself evil, triggered a backlash against AI Safety, and left us in a worse position than where we started. It's not worth it."
I don't think doing so would even be dishonest. You've argued that people should be careful with utilitarianism, taking a half-step towards it and then stopping, because we're running on corrupted hardware that makes motivated reasoning tempting. Refusing the drastic option here feels a lot like compensating for exactly that.