My own thinking about this whole class of questions starts with: is the agent making this threat capable of torturing systems that I prefer (on reflection) not be tortured? If I’m confident they can do so, then they can credibly threaten me.
Among other things, this formulation lets me completely ignore whether Skynet’s simulation of me is actually me. That’s irrelevant to the question at hand. In fact, whether it’s even a simulation of me, and indeed whether it’s a person at all, is irrelevant. What’s important is whether I prefer it not be tortured.
A lot of ill-defined terms (“person”, “simulation”, “me”) thus drop out of my evaluation.
In principle I expect that a sufficiently capable intelligence can create systems that I prefer not be tortured, but I’d need quite a lot of evidence before I was actually confident that any given intelligence was capable of doing so.
That said, the problem of evidence is itself tricky here. I expect that it is much easier to build a system I don’t endorse caring about in the abstract, and then manipulate the setting so that I come to care about it anyway, than to build a system that I endorse caring about. Still, we can finesse this epistemic issue by asking a different question: is the intelligence capable of creating (and torturing) a system S such that, if I somehow became confident that S has the attributes S in fact has, I would prefer that S not be tortured?
My confidence that humans have this ability is low, though (as above) in principle I expect that a sufficiently capable intelligence can do so. Certainly I don’t have it, and I’ve never seen significant evidence that anyone else does.
Have you?