gwern comments on Open Thread, April 15-30, 2013

gwern 22 Apr 2013 1:24 UTC
2 points

If you have a maximiser of A, the ability to constrain that maximiser, and the ability to generate A, you can use it to maximise B by rewarding the production of B with A. If A = entropy and B = utility, Q.E.D.

That seems to simply be buck-passing. What does this gain us over simply maximizing B? If we can compute how to maximize a predicate like A, then what stops us from maximizing B directly?

If you know Go, that’s pretty similar to winning.

Pretty similar, yet somehow, crucially, not the same thing. If you know Go, consider a board position in which 51% of the board is filled by your giant false eye, it is your move, and exactly one move turns it into a true eye while many others do not. The win-maximizing move is to turn your false eye into a true eye, yet this shuts down a huge tree of possible futures in which your false eye is killed, thousands of stones are removed from the board, and you can replay the opening with its beyond-astronomical number of possible futures...
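The divergence described above can be made concrete with a toy game tree. This is only a sketch under stated assumptions: the tree, the move names (`make_true_eye`, `play_elsewhere`, `line_a`, ...) and the outcomes are invented for illustration and are not a real Go position. It compares a chooser that counts reachable futures with a chooser that maximizes the guaranteed outcome.

```python
# Toy tree (hypothetical, not real Go): a node is either a terminal
# outcome (+1 = we win, -1 = we lose) or a dict of move -> child node.
TREE = {
    "make_true_eye": +1,      # group lives: one future, and it is a win
    "play_elsewhere": {       # false eye dies, board reopens (opponent to move)
        "line_a": {"a1": -1, "a2": -1, "a3": +1},
        "line_b": {"b1": -1, "b2": -1},
        "line_c": {"c1": -1, "c2": +1, "c3": -1},
    },
}


def count_futures(node):
    """Number of distinct terminal outcomes reachable from this node."""
    if not isinstance(node, dict):
        return 1
    return sum(count_futures(child) for child in node.values())


def minimax(node, our_turn):
    """Outcome we can guarantee against an opponent who minimizes it."""
    if not isinstance(node, dict):
        return node
    values = [minimax(child, not our_turn) for child in node.values()]
    return max(values) if our_turn else min(values)


def best_move(tree, score):
    """Pick the root move whose resulting position has the highest score."""
    return max(tree, key=lambda move: score(tree[move]))


# After our root move it is the opponent's turn, hence our_turn=False.
options_move = best_move(TREE, count_futures)
winning_move = best_move(TREE, lambda child: minimax(child, our_turn=False))

print("options-maximizing move:", options_move)
print("win-maximizing move:", winning_move)
```

In this made-up tree the two choosers disagree: counting futures favors reopening the board, while minimax favors securing the eye.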
- timtyler 22 Apr 2013 1:48 UTC
  0 points
  
  If you have a maximiser of A, the ability to constrain that maximiser, and the ability to generate A, you can use it to maximise B by rewarding the production of B with A. If A = entropy and B = utility, Q.E.D.
  
  That seems to simply be buck-passing. What does this gain us over simply maximizing B? If we can compute how to maximize a predicate like A, then what stops us from maximizing B directly?
  
  You said you didn’t see how having an entropy maximizer would help with maximizing utility. Having an entropy maximizer would help a lot. Basically maximizers are very useful things—almost irrespective of what they maximize.
  
  If you know Go, that’s pretty similar to winning.
  
  Pretty similar, yet somehow, crucially, not the same thing. [...]
  
  Sure. I never claimed they were the same thing.
  
  If you forbid passing, forbid suicide and aim to minimize your opponent’s possible moves, that would make a lot more sense as a short description of a Go-playing strategy.
  - gwern 22 Apr 2013 2:00 UTC
    6 points
    
    You said you didn’t see how having an entropy maximizer would help with maximizing utility. Having an entropy maximizer would help a lot. Basically maximizers are very useful things—almost irrespective of what they maximize.
    
    So maximizers are useful for maximizing? That’s good to know.
    - timtyler 22 Apr 2013 10:49 UTC
      −2 points
      That’s trivializing the issue. The idea is that maximisers can often be repurposed to help other agents (via trade, slavery etc).
      
      It sounds as though you originally meant to ask a different question. You can now see how maximizing entropy would be useful, but want to know what advantages it has over other approaches.
      
      The main advantage of maximizing entropy that I am aware of is efficiency. If you maximize something else (say, carbon atoms), you try to leave something behind. By contrast, an entropy maximizer would use carbon atoms as fuel. In a competition, the entropy maximizer would come out on top, all else being equal.
      
      It’s also a pure and abstract type of maximisation that mirrors what happens in natural systems. Maybe it has been studied more.
      - gwern 22 Apr 2013 16:37 UTC
        0 points
        
        It sounds as though you originally meant to ask a different question. You can now see how maximizing entropy would be useful,
        
        I already saw how it could be useful in a handful of limited situations—that’s why I brought up the Go example in the first place!
        
        but want to know what advantages it has over other approaches.
        
        As it stands, it sounds like a limited heuristic, and the claims about intelligence sound grossly exaggerated.