Even if we assume that’s true (it seems reasonable, though less capable AIs might blunder on this point, whether by failing to understand the need to act nice, failing to understand how to act nice, or believing themselves to be in a winning position before they actually are), what does an AI need to do to get into a winning position? And how easy is it to make those moves without their being seen as hostile?
An unfriendly AI can sit on its server saying “I love mankind and want to serve it” all day long, and unless we have solid neural net interpretability or some future equivalent, we might never know it’s lying. But not even a superintelligence can take over the world just by saying “I love mankind”. It needs some kind of lever. Maybe it can flash its message of love at just the right frequency to hack human minds, or to invoke some physical effect that lets it move matter. But whether it can or not depends on facts about physics and psychology, and if that’s not an option, it doesn’t become an option just because it’s a superintelligence trying it.
It does, and a superintelligence will understand those facts better than we do.