I think this undervalues conditional commitments. The problem of “early commitment” depends entirely on the possibility that your picture of the state of the world is wrong. So if you condition your commitment on the information you have available, you avoid premature commitments made in ignorance and give other agents an incentive to improve your world model. Likewise, this protects you from learning about other agents’ commitments “too late”: you can always condition on clauses like “unless I find an agent with commitment X”. You can do this even if it never occurs to you to think of an agent with commitment X, as long as other agents who care about X can predict how you would react to learning about X.
Commitments aren’t inescapable shackles; they’re just another name for “predictable behavior.” The usefulness of commitments doesn’t require you to bind yourself regardless of any new information you learn about reality. Oaths are highly binding for humans because we “look for excuses”, our behavior is hard to predict, and we can’t reliably predict and evaluate complex rule systems. None of those should pose serious problems for trading superintelligences.
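To make the “predictable behavior” framing concrete, here is a minimal sketch (purely illustrative; the world-model fields and “commitment X” are assumptions invented for the example, not anything from the discussion above). A commitment is modeled as a fixed function from the information currently available to an action, with the “unless I find an agent with commitment X” clause built into the function itself:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldModel:
    """The information available at decision time (fields are illustrative)."""
    threatened_by: frozenset      # agents currently threatening us
    known_commitments: frozenset  # commitments we have learned other agents hold


def conditional_commitment(world: WorldModel) -> str:
    """A commitment as predictable behavior: retaliate against threats,
    unless we have learned that the other agent holds commitment X."""
    if world.threatened_by and "commitment X" not in world.known_commitments:
        return "retaliate"
    return "cooperate"


# Other agents who can predict this function know in advance how we will
# respond once we learn about commitment X, even if X never occurred to us
# when we adopted the policy.
print(conditional_commitment(WorldModel(frozenset({"A"}), frozenset())))
# -> retaliate
print(conditional_commitment(WorldModel(frozenset({"A"}), frozenset({"commitment X"}))))
# -> cooperate
```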
I don’t think this solves the problem, though it is an important part of the picture.
The problem is: which conditional commitments do you make? (A conditional commitment is just a special case of a commitment.) “I’ll retaliate against A by doing B, unless [insert list of exceptions here].” Thinking of appropriate exceptions is important mental work, and you might not think of all the right ones for a very long time. Moreover, while you are thinking about which exceptions to add, you might accidentally realize that such-and-such type of agent will threaten you regardless of what you commit to, and then, if you are a coward, you will “give in” by making an exception for that agent. The problem persists, in more or less exactly the same form, in this new world of conditional commitments. (Again, these are just special cases of commitments, I think.)
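A rough sketch of why the problem persists (the exception predicates and situation fields below are hypothetical, chosen only for illustration): the commitment is parameterized by whatever exceptions you managed to enumerate before committing, and every case you failed to think of falls through to the default branch.

```python
from typing import Callable, Iterable

# An exception clause is a predicate over the observed situation.
ExceptionClause = Callable[[dict], bool]


def make_commitment(exceptions: Iterable[ExceptionClause]) -> Callable[[dict], str]:
    """Build a policy of the form “retaliate by doing B, unless an exception applies.”"""
    exceptions = list(exceptions)

    def policy(situation: dict) -> str:
        if any(applies(situation) for applies in exceptions):
            return "do nothing"
        return "retaliate with B"

    return policy


# The hard work is enumerating `exceptions`; whichever ones you never think of
# are simply absent, so “which commitments do I make, and when?” reappears as
# “which exceptions do I enumerate, and when?”
policy = make_commitment([
    lambda s: s.get("threat_was_accidental", False),
])
print(policy({"threat_was_accidental": True}))   # -> do nothing
print(policy({"threat_was_accidental": False}))  # -> retaliate with B
```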
I concur in general, but:

you might accidentally realize that such-and-such type of agent will threaten you regardless of what you commit to, and then, if you are a coward, you will “give in” by making an exception for that agent.

this seems like a problem for humans and badly-built AIs. Nothing that reliably one-boxes should ever do this.
EDT reliably one-boxes, but EDT would do this.
Or do you mean one-boxing in Transparent Newcomb? Then your claim might be true, but even then it depends on how seriously we take the “regardless of what you commit to” clause.
True, sorry, I forgot the whole set of paradoxes that led up to FDT/UDT. I mean something like… “this is equivalent to the problem that FDT/UDT already has to solve anyway.” Allowing you to make exceptions doesn’t make your job harder.