So, there’s a lot of criticism of your article here, but for the record I agree with your rebuttal of Yudkowsky. The “bait and switch” is something I hadn’t spotted until now. That said, I think there is still plenty of room for error in building a computer that’s supposed to fulfill the desires of human beings.
A difficulty you don’t consider is that the AI may understand exactly what the humans mean, while the humans ask for the wrong thing or underspecify their desires. How is the AI supposed to decide whether “create a good universe” means avoiding the repugnant conclusion, for example? Does it simply import the values of all human beings and average them out? That seems dangerously democratic to me. Does it keep asking someone clarifying questions? If so, what stops it from asking too many, or too few? And what happens when the AI has to decide between two possibilities that human beings don’t really understand, or perhaps cannot understand at all?