Eliezer’s example is set up in such a way that, regardless of what the paperclip maximizer does, defecting gains one billion lives and loses two paperclips.
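To make that payoff structure concrete, here is a minimal sketch with illustrative numbers (my own, not taken from Eliezer's post) chosen so that defecting gains exactly one billion lives whatever the maximizer does, yet mutual cooperation still beats mutual defection:

```python
# Hypothetical human payoffs in billions of lives saved, keyed by
# (my move, their move). The numbers are illustrative, but they match
# the structure described above: defecting gains one billion lives
# regardless of what the paperclip maximizer chooses.
payoff = {
    ("C", "C"): 2,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, the maximizer defects
    ("D", "C"): 3,  # I defect, the maximizer cooperates
    ("D", "D"): 1,  # mutual defection
}

# Defection strictly dominates: against either move by the other side,
# switching from C to D gains exactly one billion lives.
for theirs in ("C", "D"):
    assert payoff[("D", theirs)] - payoff[("C", theirs)] == 1

# Yet both sides defecting is worse than both cooperating -- the dilemma.
assert payoff[("C", "C")] > payoff[("D", "D")]
```

The same dominance argument goes through for the maximizer's paperclip payoffs, which is what makes the one-shot case so stark.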
This same claim can be made about the standard prisoner’s dilemma. In the standard version, I still cooperate because, even if this particular encounter won’t be repeated, it’s embedded in a social context for me in which many interactions are one-off but still part of the social fabric. (Tipping, giving directions to strangers, and turning in items left behind in a cafe are examples: I cooperate even though I expect never to see the same person again.) What is it about the social context that makes this so?
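One way to see why a social fabric can train cooperative habits is a small simulation (my own construction, not from the thread): a population of reciprocators with a lone unconditional defector, playing repeated matches under standard illustrative prisoner's-dilemma payoffs. The strategies and numbers are assumptions for the sketch.

```python
import itertools

# Standard illustrative PD payoffs, keyed by (my move, their move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(opp_history):
    """Cooperate first, then mirror the opponent's last move."""
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

def match(a, b, rounds=20):
    """Play `rounds` of the PD; each agent sees the other's move history."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(hist_b), b(hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# A mostly cooperative social fabric: five reciprocators, one defector.
population = [tit_for_tat] * 5 + [always_defect]
totals = [0] * len(population)
for i, j in itertools.combinations(range(len(population)), 2):
    si, sj = match(population[i], population[j])
    totals[i] += si
    totals[j] += sj

# Every reciprocator out-earns the lone defector across all its matches.
assert min(totals[:5]) > totals[5]
```

The defector wins each individual encounter, but an environment dense with reciprocators still rewards cooperative habits overall, which is the sense in which the social context, not the single game, does the training.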
I don’t fall back on an assumption that the other reasons the same way I do. It could as easily be a psychopath, by the standards of the universe it comes from, and making that assumption leaves you open to exploitation. But if there are reasons for the other to have habits formed by forces similar to those that shaped mine, then concluding that cooperation is the behavior its environment is more likely to have trained into it is a valuable result.
The question, for me, is what kind of social context the other inhabits. The paperclip maximizer might be the only (or the most powerful) inhabitant of its universe, but that seems less likely than that it is embedded in some social context and has to make trade-offs in interactions with others in order to get what it wants. It’s hard for me to imagine a universe that would produce one agent overwhelmingly more powerful than all others. (Even though I’ve heard that argument in just the kind of discussion of SIs that raises the questions of friendliness and paperclip maximizers.)
[Sorry Allan, that you won’t be able to reply. But you did raise the question before bowing out...]