Since most possible minds don’t care about humans, I object to using “unfriendly” to mean “an AI that would result in a world that I don’t value.” I think it better to reserve “unfriendly” for those minds indifferent to humans and the few hateful ones. Minds whose values many but not all humans would endorse, such as perhaps one that seriously threatens to torture people, but only when it knows those threatened will buckle, are better thought of as a subspecies of Friendly AI.
I disagree. I will never refer to anything that wants to kill or torture me as friendly. Because that would be insane. AIs that are friendly to certain other people but not to me are instances of uFAIs in the same way that paperclippers are uFAIs (which are, in a sense, Friendly to paperclips). I incidentally also reject FAI and FAI, although in the latter case I would still choose it as an alternative to nothing (which likely defaults to extinction).
Mind you, the nomenclature isn’t really sufficient to the task either way. I prefer to keep my meaning free of ambiguities. So if I’m talking about a “Friendly” AI that will kill me, I tend to use the scare quotes I just used, while if I’m talking about something that is Friendly to a specific group, I’ll parameterize.
I will never refer to anything that wants to kill or torture me as friendly
OK: this is included under what I would suggest calling “Friendly”, certainly if it only wanted to do so instrumentally, so we have a genuine disagreement. This is a good example for you to raise, as most people, even here, might agree with how you put it.
Nonetheless, my example is not included under this, so let’s be sure not to talk past each other. It was intended as a moderate case, one in which you might not call something friendly when many others here would*: a being that wouldn’t desire to torture you, and that would be bluffing only in the sense that it had scrupulously avoided possible futures in which anyone is actually tortured, though not in other senses (i.e., it actually would torture you if you chose the way you won’t).
As for not killing you: that sounds like an obviously badly phrased genie wish. Since a point similar to the one you expressed would be reasonable and would fully contrast with mine, I’m surprised you added that.
One can go either way (or other or both ways) on this labeling. I am apparently buying into the mind-projection fallacy and trying to use “Friendly” the way terms like “funny” or “wrong” are regularly used in English. If every human but me “finds something funny”, it’s often least confusing to say it’s “a funny thing that isn’t funny to me”, or “something everyone else considers wrong that I don’t consider wrong (according to the simplest way of dividing concept-space) and that is also advantageous for me”. You favor taking this new term, avoiding the mind-projection fallacy with it (unlike with other English terms), and having it be understood that listeners are never to infer meaning as if the speaker were committing it; I favor just using it like any other term.
So:
Mind you the nomenclature isn’t really sufficient to the task either way
On my usage, a being that wanted to do well by some humans and not others would be objectively both Friendly and Unfriendly, so that alone might be enough to make my usage inferior. But if my molecules are made out of usefulonium, and no one else’s are, I very much mind a being exploiting me for that, yet I wouldn’t mind other humans calling that being friendly when it uses the usefulonium to shield the Earth from a supernova, or whatever. And it’s not just not-minding by comparison, either.
*I mean this both when others refer to beings making analogous threats to them, and for the being that would make such threats to you.