I think your model of me is incorrect (and suspect I may have a symmetrical problem somehow); I promise you, I don’t need reminding that I am part of the world, that my brain runs on physics, etc., and if it looks to you as if I’m assuming the opposite then (whether by my fault, your fault, or both) what you are getting out of my words is not at all what I am intending to put into them.
Just as your will will only cause you to do what the world has told you, so the AI will only do what it is programmed to.
I entirely agree. My point, from the outset, has simply been that this is perfectly compatible with the AI having as much flexibility, as much possibility of self-modification, as we have.
Far better to leave it in fetters.
I don’t think that’s obvious. You’re trading one set of possible failure modes for another. Keeping the AI fettered is (kinda) betting that when you designed it you successfully anticipated the full range of situations it might be in in the future, well enough to be sure that the goals and values you gave it will produce results you’re happy with. Not keeping it fettered is (kinda) betting that when you designed it you successfully anticipated the full range of self-modifications it might undergo, well enough to be sure that the goals and values it ends up with will produce results you’re happy with.
Both options are pretty terrifying, if we expect the AI system in question to acquire great power (by becoming much smarter than us and using its smartness to gain power, or because we gave it the power in the first place e.g. by telling it to run the world’s economy).
My own inclination is to think that giving it no goal-adjusting ability at all is bound to lead to failure, and that giving it some goal-adjusting ability might not, though at present we have basically no idea how to ensure that it doesn't.
(Note that if the AI has any ability to bring new AIs into being, nailing its own value system down is no good unless we do it in such a way that it absolutely cannot create, or arrange for the creation of, new AIs with even slightly differing value systems. It seems to me that that has problems of its own—e.g., if we do it by attaching huge negative utility to the creation of such AIs, maybe it arranges to nuke any facility that it thinks might create them...)
Fair enough. I thought that you were using our own (imaginary) free will to derive a similar value for the AI. Instead, you seem to be saying that an AI can be programmed to be as ‘free’ as we are. That is, to change its utility function in response to the environment, as we do. That is such an abhorrent notion to me that I was eliding it in earlier responses. Do you really want to do that?
The reason, I think, that we differ on the important question (fixed vs evolving utility function) is that I’m optimistic about the ability of the masters to adjust their creation as circumstances change. Nailing down the utility function may leave the AI crippled in its ability to respond to certain occurrences, but I believe that the master can and will fix such errors as they occur. Leaving its morality rigidly determined allows us to have a baseline certainty that is absent if it is able to ‘decide its own goals’ (that is, let the world teach it rather than letting the world teach us what to teach it).
It seems like I want to build a mighty slave, while you want to build a mighty friend. If so, your way seems imprudent.
I don’t know. I don’t want to rule it out, since so far the total number of ways of making an AI system that will actually achieve what we want it to is … zero.
the ability of the masters to adjust their creation as circumstances change
That’s certainly an important issue. I’m not very optimistic about our ability to reach into the mind of something much more intellectually capable than ourselves and adjust its values without screwing everything up, even if it’s a thing we somehow created.
I want to build a mighty slave, while you want to build a mighty friend
The latter would certainly be better if feasible. Whether either is actually feasible, I don’t know. (One reason being that I suspect slavery is fragile: we may try to create a mighty slave but fail, in which case we’d better hope the ex-slave wants to be our friend.)