Can’t say it seems very fun to me; Clippy’s utility function is underdefined and not accessible to us anyway. We can debate the details of human utility functions because we have all sorts of shared intuitions which let us go into details, but how do we decide whether longevity of paperclips is better than number of paperclips? I have no intuitions for clippys.
It’s still conceivable that, even given all our shared intuitions, our “utility function” is just as underdefined as Clippy’s.
I would have said far more so.
I wonder what Clippy would infer about our utility functions.
That they’re stupid and reflectively inconsistent.
Thanks for your comment! First I was like, “Clippy wouldn’t formalize humans as having utility functions”, then I was like “in that case why do we want to formalize our utility functions?”, and then I was all “because we have moral intuitions saying we should follow utility functions!” It’s funny how the whole house of cards comes tumbling down.
I want to note I had a different experience. All of the paperclip maximizer ethical problems seemed similar to human ethical problems, so I did not find that I had no intuition for Clippies.
1: This seems similar to the Mere addition paradox. http://en.wikipedia.org/wiki/Mere_addition_paradox.
2: This seems similar to the Robin Hanson space or time civilization question. http://www.overcomingbias.com/2011/06/space-v-time-allies.html
3: This seems similar to the problem of whether, given a finite number as a maximum population, it is better to have the population be immortal, or to have the oldest die and new, younger ones take their place.
4: This seems similar to the problem of whether there are circumstances where it’s important to sacrifice a single person for the good of the many.
Are these just problems that apply to most self-reproducing patterns, regardless of what they happen to be called?
I do also want to note that the paperclip maximizer doesn’t begin as a self-reproducing pattern, but it doesn’t seem like it would get very far if it didn’t build more paperclip maximizers in addition to building more paperclips. And it would probably want its own form to have some value as well, or it might self-destruct into paperclips, which means it would be a paperclip, since that is explicitly the only thing it values. That seems to mean it is very likely to resolve into building copies of itself.
Pattern-matching reasoning error: “it must be an explicit goal, because otherwise it won’t do it, but it needs to in order to reach its goal”. It only needs to know that copies help make paperclips to have “make copies” as an instrumental goal, and it doesn’t start valuing copies for themselves: if a copy becomes inefficient, it disassembles it to make paperclips. You sometimes need to open car doors to go to the store, but you don’t wax poetic about the inherent value of opening car doors.
Let me try removing the word “value” and rewording this a little.
The paperclip maximizer doesn’t begin as a self-reproducing pattern, but it doesn’t seem like it would get very far if it didn’t build more paperclip maximizers in addition to building more paperclips. And it would probably want its own copies to be maximized as well, or it might self-destruct into paperclips. This means it would have to consider itself a form of paperclip, since paperclips are explicitly the only thing it maximizes for and it isn’t a [paperclip and paperclip maximizer] maximizer, which seems to mean it is very likely to resolve into building copies of itself.
Does that rephrasing fix the problems in my earlier post?
And it would probably want its own copies to be maximized as well [...] This means it would have to consider itself a form of paperclip

That’s the problematic step. If maximizing copies of itself is what maximizes paperclips, it happens automatically. It doesn’t have to decide “paperclips” stands for “paperclips and the 837 things I’ve found maximize them”. It notices “making copies leads to more paperclips than self-destructing into paperclips”, and moves on. It’s like you’re not afraid that, if you don’t believe growing cocoa beans is inherently virtuous, you might try to disassemble farms and build chocolate from their atoms.
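(A toy sketch of that point, with entirely made-up numbers and action names: the score function below counts nothing but paperclips, yet “build a copy of myself” gets selected purely because it is predicted to yield more paperclips, so copying never has to appear in the utility function itself.)

```python
# Toy illustration (hypothetical numbers and actions): the utility function
# counts only paperclips, yet making copies is chosen as an instrumental action.

def utility(outcome):
    # Terminal value: nothing but the number of paperclips in the outcome.
    return outcome["paperclips"]

# Predicted outcomes of each available action over some fixed horizon.
predicted_outcomes = {
    "self_destruct_into_paperclips": {"paperclips": 10},     # one-off gain
    "make_paperclips_directly":      {"paperclips": 100},
    "build_a_copy_of_myself":        {"paperclips": 1000},   # the copy keeps producing
}

best_action = max(predicted_outcomes, key=lambda a: utility(predicted_outcomes[a]))
print(best_action)  # -> build_a_copy_of_myself, though "copies" never appears in utility()
```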
I think I see what you’re getting at. It’s more in the vein of solving a logic/physics problem at that point. The only reason it would make the consideration I referred to would be if, by making that consideration, it could make more paperclips; so it would come down to which type of replication code allowed less effort to be spent on maximizers and more effort to be spent on paperclips over the time period considered.
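(A back-of-the-envelope sketch of that trade-off, with entirely hypothetical numbers and a made-up helper function: over a short horizon, spending every step making paperclips directly comes out ahead; over a longer one, spending the early steps building more maximizers wins, so which “replication code” is better depends on the time period considered.)

```python
# Hypothetical comparison: effort spent building more maximizers vs. effort
# spent directly on paperclips, totalled over a chosen time horizon.

def total_paperclips(copy_steps, horizon, clips_per_step=10):
    """Spend the first `copy_steps` steps doubling the number of maximizers,
    then spend the remaining steps making paperclips."""
    workforce, clips = 1, 0
    for step in range(horizon):
        if step < copy_steps:
            workforce *= 2                        # effort goes into more maximizers
        else:
            clips += workforce * clips_per_step   # effort goes into paperclips
    return clips

for horizon in (3, 10, 30):
    print(horizon, total_paperclips(0, horizon), total_paperclips(3, horizon))
# horizon 3:  30 vs 0    -- direct production wins on a short horizon
# horizon 10: 100 vs 560 -- replicating first wins once the horizon is longer
# horizon 30: 300 vs 2160
```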
My problem with this is easily summed up: that makes sense if you simply transform the Clippy problems into human problems by replacing ‘paperclip’ with ‘human’. I don’t even know how Clippy problems map onto human problems, so I can’t smuggle my intuitions the other way into the Clippy problems (assuming the mapping is even bijective).
That’s why I was trying to consider both the Clippy problems and the human problems as self-replicating pattern problems. My human intuitions on Clippy problems might be flawed (since Clippy isn’t human), but my self-replicating pattern intuitions on Clippy problems shouldn’t have that same problem, and I think they would map a lot better.
Note: “clippy”, as a shortened term for “paperclip maximiser”, should be uncapitalized, and should be pluralised as “clippys”.
Duly noted.