garabik comments on Harry Potter and the Methods of Rationality discussion thread, February 2015, chapter 113

garabik 1 Mar 2015 9:41 UTC
5 points
Thinking about AI boxing: note that it is Harry who represents humanity. His core values and goals were not changed much by the Vow; they were just formalized.

It is LV whose goals are mostly ones we’d agree with (‘ensure the continuous existence of the world’), but he has very different values and no moral constraints. In short, dealing with him is like dealing with an Unfriendly AI or an alien mind (like the Sorting Hat).

So this is more like a clash between an Unfriendly (or better, Indifferent) AI and a Friendly AI, where the goals are more or less compatible, but the FAI also keeps human values. And the UFAI got there first and is more powerful.

If your goals are compatible, the rational course is to cooperate. However, Harry’s values almost ensure that he will defect given the chance. LV knows it, so the rational action for him is to defect (= kill) as well.
- TobyBartels 3 Mar 2015 3:06 UTC
  1 point
  
  Unfriendly (or better, Indifferent)
  
  Same thing.
  
  ‘The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.’