Alternatively, a satisficer could build a maximiser: for example, if you don’t give it the ability to modify its own code, it could build a maximiser rather than become one. It might also build a paperclip-making von Neumann machine that isn’t anywhere near a maximiser, but is still insanely dangerous.
I notice a satisficing agent isn’t well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.
What I usually mean by it is: maximise until some specified criterion is satisfied—and then stop.
However, perhaps “satisficing” is not quite the right word for this. IMO, agents that stop are an important class of agents. I think we need a name for them—and this is one of the nearest things. In my essay, I called them “Stopping superintelligences”.
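Read as pseudocode, that stopping rule is roughly the following minimal sketch; the toy action set and the threshold of ten paperclips are illustrative assumptions, not anything specified in this thread.

```python
# Minimal sketch of "maximise until some specified criterion is satisfied,
# then stop". The toy action set and the threshold of 10 paperclips are
# illustrative assumptions, not details from this discussion.

ACTIONS = {"wait": 0, "make_one": 1, "make_batch": 3}  # paperclips gained

def stopping_agent(threshold=10, max_steps=100):
    paperclips = 0
    for _ in range(max_steps):
        if paperclips >= threshold:
            return paperclips  # criterion satisfied: halt instead of continuing
        # Until satisfied, behave like a maximiser: greedily take the action
        # with the largest immediate gain.
        best_action = max(ACTIONS, key=ACTIONS.get)
        paperclips += ACTIONS[best_action]
    return paperclips

print(stopping_agent())  # 12: stops just past the threshold, not at the step limit
```

The only difference from an unconditional maximiser is the early return; without it, the loop keeps converting whatever it can reach into paperclips.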
> What happens when it has two ways of satisfying its goals?

That’s the same as with a maximiser.

Except much more likely to come up; a maximiser facing many exactly balanced strategies in the real world is a rare occurrence.

Well, usually you want satisfaction rapidly—and then things are very similar again.

Then state that. It’s an inverse-of-time-until-satisfaction-is-complete maximiser.
The way you defined satisfaction doesn’t really work with that. The satisficer might just decide that it has a 90% chance of producing 10 paperclips, and thus that its goal is complete. There is some chance of it failing its goal later on, but that is likely to be made up for by the fact that it will probably exceed its goal by some margin, especially if it can self-modify.
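To make that objection concrete, here is a hedged sketch combining the “inverse of time until satisfaction is complete” score with satisfaction judged in expectation; the 90% plan, the threshold of nine paperclips, and every name here is invented for illustration.

```python
# Sketch of the loophole described above; all numbers and names are invented
# illustrations, not part of the original discussion.

def inverse_time_utility(steps_until_satisfied):
    # Proposed score: sooner satisfaction is better; never satisfied scores 0.
    # (Shifted by 1 so that immediate satisfaction doesn't divide by zero.)
    return 0.0 if steps_until_satisfied is None else 1.0 / (1 + steps_until_satisfied)

def expected_clips(prob_success, clips_if_success):
    return prob_success * clips_if_success

# If "satisfied" is judged in expectation, the agent can decide it is already
# done: believing one cheap plan has a 90% chance of yielding 10 paperclips,
# it sees an expected output of 9, which meets a threshold of 9 at step zero.
threshold = 9
belief = {"prob_success": 0.9, "clips_if_success": 10}
if expected_clips(**belief) >= threshold:
    print("satisfied in expectation at step 0, score =", inverse_time_utility(0))
    # prints a score of 1.0, the maximum possible, with nothing actually built
```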
> Alternatively, a satisficer could build a maximiser.
Yep. Coding “don’t unleash (or become) a maximiser or something similar” is very tricky.
> I notice a satisficing agent isn’t well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.
It may be. But encoding “safe” for a satisficer sounds like it’s probably just as hard as constructing a safe utility function in the first place.