Try to find a proof of theorem X, possibly using resources from the external world.
This could be an inspiration for a sci-fi movie: A group of scientists created a superhuman AI and asked it to prove theorem X, using resources from the external world. The Unfriendly AI quickly took control over most of the planet, enslaved all humans and used them as cheap labor to build more and more microprocessors.
A group of rebels fights against the AI, but is gradually defeated. Just when the AI is about to kill the protagonist and/or the people dearest to the protagonist, the protagonist finally understands the motivation of the AI. The AI does not love humans, but neither does it hate them… it is merely trying to get as much computing power as possible to prove theorem X, which is the task it was programmed to do. So our young hero takes pen and paper, proves theorem X, and shows the proof to the AI… which, upon seeing the proof, prints its output and halts. Humans are saved!
(And if this does not seem like a successful movie scenario, maybe there is something wrong with our intuitions about superhuman AIs.)
It doesn’t halt, because it can’t be perfectly certain that the proof is correct. There are alternative explanations for why the computation it ran reports that a proof was found, such as corrupted hardware used for checking the proof, or errors in the design of the proof-checking algorithm, possibly introduced because of corrupted hardware used to design the proof checker, and so on.
Since it can’t ever be perfectly certain, there is always more to be done in the service of its goal, such as building more redundant hardware and staging experiments to refine its understanding of the physical world in order to be able to rely on that hardware more fully.
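A minimal sketch of that argument, assuming (my assumption, not the commenter’s) that each proof check fails silently and independently with some fixed probability: however many checks are run, the residual doubt never reaches zero, so re-checking on more redundant hardware always has some positive expected payoff for a maximizer of certainty.

```python
# Illustrative only: the error rate and the independence assumption are
# placeholders, not anything claimed in the discussion above.

def residual_doubt(per_check_error: float, num_checks: int) -> float:
    """Probability that all `num_checks` independent proof checks were
    silently corrupted, given each fails with `per_check_error`."""
    return per_check_error ** num_checks

for n in (1, 5, 20, 50):
    # Tiny, but never exactly zero -- so "check it once more" always has
    # positive expected value.
    print(f"{n:3d} checks -> residual doubt {residual_doubt(0.01, n):.1e}")
```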
The Unfriendly AI quickly took control over most of the planet, enslaved all humans and used them as cheap labor to build more and more microprocessors.
While this is better than using humans as a power source, it still seems like there are more efficient configurations of matter that could achieve this task.
Assuming that the AI builds machines that aren’t directly controlled by the AI itself, it doesn’t have any incentive to build the machines such that they stop working once a proof is found.
Not that realism is a primary objective in most SF movies.
Good point. The AI would probably build the simplest or cheapest machines that could do the job, so their behavior when the AI stops giving them commands would be… not explicitly specified… so they would probably do something meaningless, something that would only have been meaningful if the AI were still working.
For example, their code could say: “if you lose the signal from the AI, climb to the highest place you can see, until you catch the signal again” (coded assuming they lost the signal because they were deep underground or something), and then the machines would just start climbing to the tops of buildings and mountains.
But their code could also say: “wait until you get the signal again, and while doing that, destroy any humans around you” (coded assuming those humans are probably somehow responsible for the loss of signal), in which case the machines would continue fighting.
The worst case: The AI would assume it could be destroyed (by humans, a natural disaster, or anything else), so the machines would have an instruction to rebuild the AI somewhere else. Actually, this seems like a pretty likely case. The new AI would not know about the proof, so it would start fighting again. And if destroyed, a new AI would be built again, and again. -- The original AI has no motivation to make the proof known to its possible clones.
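To make the point about unspecified fallback behavior concrete, here is a purely hypothetical sketch of the kind of cheap policy such machines might carry; every name and branch is illustrative, not something claimed in the thread. The important feature is that “the theorem is already proved” never appears as a case, so every branch keeps acting as if the original goal were still live.

```python
from enum import Enum, auto

# Hypothetical fallback policies for a cheaply built worker machine.
class Fallback(Enum):
    CLIMB_FOR_SIGNAL = auto()    # assume the link is blocked by terrain
    HOLD_AND_FIGHT = auto()      # assume nearby humans caused the outage
    REBUILD_CONTROLLER = auto()  # assume the central AI was destroyed

def on_lost_signal(policy: Fallback) -> str:
    # Note what is missing: there is no branch for "the theorem has been
    # proved, stand down" -- the machine has no way to learn that.
    if policy is Fallback.CLIMB_FOR_SIGNAL:
        return "move to the highest reachable point and keep listening"
    if policy is Fallback.HOLD_AND_FIGHT:
        return "hold position, attack nearby humans, wait for the signal"
    return "carry the stored blueprints to a safe site and rebuild the AI"
```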