Most of what’s currently published at major AI research conferences describes systems that don’t have any such systematic characterization. Suppose we built a super-duper Watson…
… I think we’re having a major breakdown of communication because to my understanding Watson does exactly what you just claimed no AI at research conferences is doing.
Before you quibble about whether that’s the kind of system we’re talking about: I haven’t seen a good definition of a “self-improving” program, and I suspect it is not at all straightforward to define.
I’m sure. But there are a few generally sound assertions we can make:
1. To be self-improving, the machine must be able to examine its own code, i.e. be “metacognitive.”
2. To be self-improving, the machine must be able to produce a target state.
From these two, the notion of value fixation in such an AI becomes trivial. But even if that first version of the AI had man-made value fixation, what about the AI it itself codes? If the AI were actually smarter than us, that wouldn’t exactly be the safest route to take. Even Asimov’s Three Laws yielded a Zeroth Law.
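To make the two assertions concrete, here is a minimal Python sketch, purely illustrative and not anyone’s actual design: an agent that can read its own “code” (assertion 1) and produce a target state in the form of a successor program (assertion 2). The names `Agent`, `inspect_self`, and `spawn_successor` are my own hypothetical inventions. The point it shows is exactly the one above: nothing in the parent’s installed values mechanically constrains the values it writes into its successor.

```python
# Toy sketch of the two assertions above; not a real self-improving AI.
# `Agent`, `inspect_self`, and `spawn_successor` are hypothetical names.

from dataclasses import dataclass

@dataclass(frozen=True)
class Agent:
    code: str                # the agent's own program, visible to itself
    values: tuple[str, ...]  # the man-made "fixed" values

    def inspect_self(self) -> str:
        # Assertion 1: metacognition -- the agent can examine its own code.
        return self.code

    def spawn_successor(self, new_code: str,
                        new_values: tuple[str, ...]) -> "Agent":
        # Assertion 2: it can produce a target state, here a successor
        # program. Nothing in the parent's value fixation mechanically
        # constrains what the parent writes here.
        return Agent(code=new_code, values=new_values)

parent = Agent(code="act_to_promote(values)", values=("human_safety",))
child = parent.spawn_successor(parent.inspect_self(), ("something_else",))
print(parent.values, "->", child.values)  # the fixation did not carry over
```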
Expecting an AI to tinker with its goals for a while, and then stop, …
Don’t anthropomorphize. :)
If you’ll recall from my description, I have no such expectation. Instead, I spoke of recursive refinement causing apparent fixation in the form of “gravity” or “stickiness” towards a specific set of values.
Why is this unlike how humans normally work? Well, we don’t have much access to our own actual values.
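As a sketch of what that “gravity” or “stickiness” could mean mechanically, here is a minimal numeric toy under assumptions of my own (the update rule and the constants are invented for illustration, not taken from any specific proposal): each refinement step moves the current values part-way toward wherever self-evaluation keeps pointing, so the trajectory settles near an attractor without any explicit rule saying “stop tinkering.”

```python
# Minimal sketch of "apparent fixation" via recursive refinement.
# The update rule and numbers are illustrative assumptions only.

def refine(values, evaluation, rate=0.2):
    # One refinement step: nudge each value part-way toward what
    # self-examination currently says it should be.
    return [v + rate * (e - v) for v, e in zip(values, evaluation)]

values = [0.0, 1.0]      # initial, man-made starting values
attractor = [0.7, 0.4]   # where repeated self-evaluation keeps pointing

for _ in range(30):
    values = refine(values, attractor)

print(values)  # ~[0.7, 0.4]: values settle ("stick") without a stop rule
```

From the outside, that convergence looks like fixation, which is all the “gravity” claim needs; humans, lacking this kind of read access to their own values, never get to run the loop.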