So, I just had a strange sort of akrasia problem.
I was doing my evening routine, getting washed up and so on in preparation for going to bed. Earlier in the evening, I had read P.J. Eby’s The Hidden Meaning of “Just Do It”, and so I decided I would “just do” this routine, i.e. simply avoid doing anything else, and watch the actions of the routine unfold in front of me. So I used the toilet and began washing my hands, when it occurred to me that if I did not interfere, I would never stop rinsing my hands. I did not interfere, and sure enough, I ended up just standing there, hands resting limply under the running water, doing nothing. My mind went over what I needed to do next, at various levels of detail; after a minute or two of this, I realized that I was leaning on my elbows, forming a triangle shape that prevented me from moving my hands out of the flow of the water. Once I realized this, I was able to stand up straight, freeing my hands to go on to the next task.
(Instead of doing that, however, I came downstairs to write about it on Less Wrong. But that’s another story.)
Why did it take me so long to figure out what I needed to do next in order to continue the routine non-forcefully?
Had you recently eaten any brownies of unknown origin?
Or gone 24 hours without sleeping?
Possibly because you’d partially knocked out your ability to make choices.
Mercifully, you didn’t have the ability to make very deep changes. There are advantages to not being software.
The ability to change all aspects of oneself is not an inherent property of software. Software can just as easily be made completely unable, partially able, or completely able to modify itself.
Fair enough, though evolved beings (which could include software) are probably less likely to be able to break themselves than designed beings capable of useful self-modification.
You know, you could say that software often has two parts: a crystalline part and a fluid part. Programs usually consist mostly of crystalline aspects: if I took a mathematical proof verifier and tweaked its axioms, even slightly, it would probably break completely. But they often contain fluid aspects as well, such as the frequency at which the garbage collector runs, or the eagerness with which one strategy is tried over its alternative. If you change a fluid aspect of a program by a small amount, the program’s behavior might get a bit worse, but it won’t be clobbered outright.
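A toy sketch of that distinction (all names here are hypothetical, not from any real system): the inference rule below is crystalline, since any perturbation to its logic makes every conclusion wrong, while the “eagerness” knob is fluid, since nudging it only shifts behavior gradually.

```python
import random

# Crystalline aspect: a tiny inference rule. Perturb this logic at all
# (say, swap "and" for "or") and the conclusions it licenses become wrong.
def modus_ponens(p: bool, p_implies_q: bool) -> bool:
    """From P and (P -> Q), conclude Q."""
    return p and p_implies_q

# Fluid aspect: a tunable "eagerness" knob (a made-up parameter).
# It controls how often a cheap heuristic is tried before a fallback
# strategy; small changes shift behavior smoothly rather than breaking it.
def heuristic_attempts(n_tasks: int, eagerness: float = 0.7,
                       seed: int = 0) -> int:
    rng = random.Random(seed)
    return sum(rng.random() < eagerness for _ in range(n_tasks))

print(modus_ponens(True, True))        # True
print(heuristic_attempts(1000, 0.7))   # close to 700
print(heuristic_attempts(1000, 0.65))  # a small nudge, a small shift
```

Turning `eagerness` from 0.7 to 0.65 changes how the program spends its time, not whether its answers are valid; editing `modus_ponens` changes whether its answers are valid at all.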
I’ve always thought that we should design Friendly AI like this. Only give it control over the fluid parts of itself, the parts of itself it can modify all it wants without damaging its (self-)honesty. Make the fluid parts powerful enough that if an insight occurs, the insight can be incorporated into the AI’s behavior somehow.
I’m sure that an AI will have more than two levels of internal stability. Some parts will be very stable (presumably, the use of logic and (we hope) Friendliness). Some parts will be very fluid (updating the immediate environment). There would be a big intermediate range of viscous-to-firm (general principles for dealing with people, how to improve its intelligence).