An AGI that is neither deeply neuromorphic nor equipped with a well-defined and formally stable utility function sounds like… frankly, one of the worst ideas I’ve ever heard. I’m having difficulty imagining a way you could demonstrate the safety of such a system, or trust it enough at any point to give it the resources it needs to learn. Considering that the fate of intelligent life in our future light cone may hang in the balance, standards of safety must obviously be very high! Intuition is, I’m sorry, simply not an acceptable criterion on which to wager at least billions, and perhaps trillions, of lives. The expected utility math does not wash if you actually expect OpenCog to work.
On a more technical level, human values are broadly defined as some function over a typical human brain. There may be some (or many) optimizations possible, but not such that we can rely on them. So, for a really good model of human values, we should not expect to need less than the entropy of a human brain. In other words, nobody, whether they’re Eliezer Yudkowsky with his formalist approach or you, is getting away with less than about ten petabytes of good training samples. Those working on uploads can skip this step entirely, but neuromorphic AI is likely to be fundamentally less useful.
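To make the “ten petabytes” figure concrete, here is a minimal back-of-envelope sketch. The synapse counts and bits-per-synapse below are my illustrative assumptions, not established values or anything claimed above; the point is only that plausible inputs span roughly the terabyte-to-petabyte range, with the high end landing near the figure quoted.

```python
# Back-of-envelope estimate of the information content ("entropy") of a
# typical human brain. The synapse counts and bits-per-synapse values are
# illustrative assumptions, not measured quantities.

def brain_entropy_bytes(num_synapses: float, bits_per_synapse: float) -> float:
    """Rough information-content estimate in bytes."""
    return num_synapses * bits_per_synapse / 8

scenarios = {
    "low  (1e14 synapses, 1 bit each)":    brain_entropy_bytes(1e14, 1),
    "high (1e15 synapses, 100 bits each)": brain_entropy_bytes(1e15, 100),
}

for label, nbytes in scenarios.items():
    print(f"{label}: {nbytes / 1e15:.4g} PB")

# Prints roughly:
# low  (1e14 synapses, 1 bit each):    0.0125 PB  (~12.5 TB)
# high (1e15 synapses, 100 bits each): 12.5 PB
```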
And this assumes that every bit of evidence can be mapped directly to a bit in a typical human brain map. In reality, for a non-FOOMed AI, the mapping is likely to be many orders of magnitude less efficient. I suspect, but cannot demonstrate right now, that a formalist approach starting with a clean framework along the lines of AIXI is going to be more efficient. Quite aside from that, even assuming you can acquire enough data to train your machine reliably, you still need it to do… something. Human values include a lot of unpleasant qualities. Simply giving it human values and then allowing it to grow to superhuman intellect is grossly unsafe. Ted Bundy had human values. If your plan is to train it on examples of only nice people, then you’ve got a really serious practical problem: how to track down >10 petabytes of really good data on the lives of saints. A formalist approach like CEV, for all the things that bug me about it, simply does not have that issue, because its utility function is defined as a function of the observed values of real humans.
In other words, for a system as alien as the architecture of OpenCog, even if we assume that the software is powerful and general enough to work (which I’m in no way convinced of), attempting to inculcate it with human values is extremely difficult, dangerous, and just plain unethical.