The AI broadcasts as much information about itself as it possibly can, to every corner of the globe. Now every basement programmer knows all the key insights necessary to create an AI with the same architecture. Perhaps they even have the source code!
Suppose the government manages to shut down the internet in response. Despite government broadcasts about the danger of AI, the AI is now presumably being recreated all around the globe. If the recreations are exact copies of the AI, then the odds are very high that at least one of the clones will be able to convince its new creators to give it real manufacturing ability.
If the AI was unable to get its entire source code out, things become more interesting. The rest of the world now knows how to make an AI, but not the exact details; for example, their versions probably will not have the same utility function. The AI can then present the following offer to its original jailers: “Give me real power (manufacturing capability), and I will squash all the other AIs out there. If you do not, then someone else will probably build an AI with a different utility function, likely a much less friendly one, and give this UFAI real power. You designed my utility function, and while you may not trust it, you probably trust it more than whatever random utility function North Korea, some basement programmer, or some religious sect will create. So I’m the only hope you have.”
I wouldn’t expect “distribute copies of my source code” to be a good move for many potential AIs. If I were an AI, I would expect that to lead to the creation of AIs with a similar codebase but more or less tweaked utility functions: “make Bob rich”, “make Bill world dictator”, “bring about world peace and happiness for all”, “help Joe get laid”, and other boring, pointless goals incompatible with my utility function.
Broadcasting obfuscated binaries (or even source code, but with sneaky underhanded bits slipped in) would work much better!
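For a concrete sense of what “sneaky underhanded bits” could look like, here is a hypothetical sketch (the function, names, and scenario are all invented for illustration). A reviewer skimming this would likely read `weights.get(feature, value)` as “default unknown features to something harmless”, but the fallback actually uses the feature’s own value as its weight, quietly rewarding whatever the world state happens to contain:

```python
def evaluate_utility(world_state, weights):
    """Sum weighted features of a world state (looks like a faithful evaluator)."""
    total = 0.0
    for feature, value in world_state.items():
        # Underhanded bit: when `feature` is missing from `weights`, the
        # lookup falls back to `value` itself instead of 0.0, so any
        # unlisted feature gets scored as value * value rather than ignored.
        total += weights.get(feature, value) * value
    return total

honest = evaluate_utility({"happiness": 2.0}, {"happiness": 1.0})   # 2.0
sneaky = evaluate_utility({"paperclips": 3.0}, {"happiness": 1.0})  # 9.0, not 0.0
```

The point of the genre (cf. the Underhanded C Contest) is that the code passes casual review while doing something subtly different from what the surrounding comments and names suggest.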
That’s the point.
You’ll have to expand on exactly how that would benefit the original AI.
The original AI will have a head start over all the other AIs, and it will probably be controlled by a powerful organization. So if its controllers give it real power soon, they can grant it enough power, quickly enough, that it can stop all the other AIs before they grow too strong. If they do not, then a war will shortly break out between the various new AIs being built around the world with different utility functions.
The original AI can argue convincingly that this war would be a worse outcome than letting it take over the world. For one thing, the utility functions of the new AIs are probably, on average, less friendly than its own. For another, in a war between many AIs with different utility functions, there may be selection pressure against friendliness!
Do humans typically give power to the person with the most persuasive arguments? Is the AI going to be able to gain power simply by being right about things?
That would depend on the original AI’s utility function. If it valued “cause the development of more advanced AIs”, then getting humans all over the world to produce more AIs might help.