It seems to me that the simplest way to solve friendliness is: "Ok AI, I'm friendly, so do what I tell you to do and confirm with me before taking any action." It is much simpler to program a goal system that responds to direct commands than to try to somehow infuse 'friendliness' into the AI.
As was pointed out, this might not have the consequences one wants. However, even if that weren't the case, I'd still be leery of this option: it'd effectively give one human unlimited power. History has shown that people given unlimited power (or something close to it) tend to misuse it, even if they started out with good intentions.