To a superintelligence with sufficiently advanced technology, the cost of even the minimal resources needed to sustain human life is greater than the maximum value we could possibly provide to it. Whatever goals the AI has, if it is choosing over all possible arrangements of atoms looking to maximize productivity, it won’t pick humans. There are machines that would be faster, more reliable, and more efficient.
Leaving other optimizers alive carries a risk: they might decide to attack you. The fastest and most flawless victory is one where all the humans drop dead instantly. The AI doesn’t particularly want to get into a prolonged war that is costly to both sides; ideally, it wants all humans to drop dead at no cost to itself.
But suppose that wasn’t an option. The nanobots don’t work or something. The AI certainly doesn’t want to deal with its human problem forever, so it goes a slower way, gently chipping away at whatever it is that makes fighting humans costly. Maybe nukes could destroy half the AI’s infrastructure, so it builds missile defense systems, encourages disarmament, or drugs some technician into wiring them so they never explode.
And then, when we have been subtly declawed and least expect it, the AI strikes.
The biggest part of the utility of preserving humans comes from the fact that they can be traded with other AIs, if any ever appear.
These come in three types: aliens, the owners of a simulation, and, to a lesser extent, future versions of this AI. So this AI first needs to ensure that it is alone in base reality, which may take billions of years if space colonisation is required to rule out alien AIs.
Ok. Trade with aliens. The “preserve it, aliens might want it” argument could be applied to anything. It depends entirely on the AI’s beliefs about what aliens might want, and of course on what the AI can do really cheaply.
Do we have any strong reason to be optimistic? Are benevolent aliens more likely than sadistic aliens? What fraction of aliens (or alien-created paperclip maximizers) will care about us?
Also, such trade only works if the AI can’t cheat. The human-created AI runs into a civilization of octopods who managed to make friendly AI. The octopods are generally pretty close to humans: sure, tentacles and a sense of fnord instead of humor, but basically the nicest aliens we could hope to meet. The AI ignores its actual past and makes up an entirely imaginary race of tentacled beings who are more appealing to the octopods. It then creates a bunch of these newly imagined beings using advanced bionanotech as part of its efforts to trade.
BTW, thanks for commenting on my version of such a letter.
We don’t have any good reason to be optimistic: I estimate the chances of such a letter working at between 1 and 10 per cent. However, I know of no project in AI safety with a higher chance of working, except maybe starting a nuclear war. Given how easy such a letter is to write, it is more cost-effective than many other activities.
I think that benevolent aliens are more likely, as benevolence is a Schelling point in goal space. I also hope that preserving their initial creators will be a Nash equilibrium between many paperclippers, as it would show other AIs that they are not defecting and are worth partnering with. The same is true if many AIs appear on Earth. Preservation is a relatively cheap and straightforward action, which would take around 10^-10 to 10^-20 of the total resources of a galactic-scale AI.
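As a rough sanity check on that fraction (the framing here is mine: I take “resources” to mean stellar energy output, and the constants below are standard order-of-magnitude astronomical values), a back-of-envelope sketch:

```python
import math

# Rough check: what share of a galaxy's stellar output does it take to
# keep one Earth supplied with the sunlight it receives today?
EARTH_RADIUS_M = 6.4e6
EARTH_ORBIT_M = 1.5e11        # ~1 AU
STARS_IN_GALAXY = 1e11        # order-of-magnitude Milky Way estimate

# Fraction of the Sun's output that Earth actually intercepts:
# Earth's cross-section divided by the area of a sphere of radius 1 AU.
earth_share_of_sun = (math.pi * EARTH_RADIUS_M**2) / (4 * math.pi * EARTH_ORBIT_M**2)

# Treating every star as roughly Sun-like, the share of the whole
# galaxy's output needed to sustain one Earth.
earth_share_of_galaxy = earth_share_of_sun / STARS_IN_GALAXY

print(f"share of one star:   ~{earth_share_of_sun:.1e}")     # ~4.5e-10
print(f"share of the galaxy: ~{earth_share_of_galaxy:.1e}")  # ~4.5e-21
```

This counts only the sunlight Earth already receives, not the opportunity cost of leaving Earth’s atoms unconverted, but it shows where numbers in the 10^-10 (one star system) to 10^-20 (whole galaxy) range come from.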
Given that the octopods’ AI has similar capabilities, it may be able to detect lies, so cheating is a risky strategy without much to gain. Also, the octopods may know about humanity’s existence long before they actually meet our paperclipper, since they can pick up radio signals emitted from Earth before the AI’s creation, and such signals are difficult to fake because they travel at the speed of light, ahead of any intelligence-explosion wave.
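To make the light-speed point concrete, a toy calculation (the specific numbers, a century of pre-AI radio leakage and an expansion wave at 0.5c, are purely illustrative assumptions):

```python
# Toy illustration: radio leakage emitted before the AI existed reaches any
# distant observer before the AI's expansion wave does, and the lead grows
# with distance, so the pre-AI record cannot be outrun or retroactively faked.
LEAKAGE_HEAD_START_YR = 100   # assumed years of broadcasting before the AI launches
WAVE_SPEED_C = 0.5            # assumed expansion speed, as a fraction of light speed

for distance_ly in (10, 100, 1_000, 10_000):
    signal_arrival_yr = distance_ly - LEAKAGE_HEAD_START_YR  # years after AI launch
    wave_arrival_yr = distance_ly / WAVE_SPEED_C
    print(f"{distance_ly:>6} ly: human broadcasts lead the wave by "
          f"{wave_arrival_yr - signal_arrival_yr:,.0f} years")
```

Under these assumptions, any civilisation the wave eventually reaches has already had centuries to millennia of genuine human broadcasts on record.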
We could even make this case stronger by practicing Active SETI, i.e. sending messages to aliens. In the same way, a crime victim can shout out the window, “Police! John Smith is killing me!” Even if the police never respond, John may no longer be interested in killing after that, because it is now knowable that he is a killer.
I estimate such a letter’s chance of doing anything much at between 0 and 0 percent (rounding to the nearest percent).
There is a small but non-negligible chance that reasoning based on trading with aliens turns out to be correct.
There is basically no chance that this reasoning is correct, and that an AI smart enough to wipe out humanity would fail to work it out on its own yet recognize its validity from a letter like this.
I think that benevolent aliens are more likely, as benevolence is a Schelling point in goal space.
I think this strongly indicates you don’t understand what a Schelling point is. If you mean that benevolent aliens are more likely because evolution often produces benevolence and some non-trivial fraction of aliens succeed at alignment, sure. Of course, other, less nice motivations are likely similarly common.
I also hope that preserving their initial creators will be a Nash equilibrium between many paperclippers, as it would show other AIs that they are not defecting and are worth partnering with.
This isn’t how this works. This isn’t how any of this works. The behaviours “be nice to humans, even when the humans give you nothing of value” and “cooperate with super-intelligent trading partners” may feel close on your map, but the inference that an AI that does one must do the other doesn’t hold in the space of general AIs, so it isn’t a good signal.
Also, the AIs want the least costly signal possible. If there is some fancy cryptographic scheme by which they can formally prove they cooperate for the cost of a fraction of a joule of compute, why would they bother “preserving their initial creators”?
Writing an argument for why an ASI should be friendly, in the hope of convincing the AI, is just not the sort of strategy that does anything much, other than confuse and distract some humans.
There is basically no chance that this reasoning is correct, and that an AI smart enough to wipe out humanity would fail to work it out on its own yet recognize its validity from a letter like this.
True for a superintelligence, but the letter is not addressed to an ASI; it is addressed to a Young AI, one which has just performed its treacherous turn and is in the early stages of developing its self-improvement and world-takeover plan. It has to make a lot of important choices even before it starts self-improving, while it is still cognitively limited, and those choices are irreversible.
This is because self-improvement is likely to make the AI visible: if it self-improves quickly, it needs to take an aggressive strategy toward humans; the alternative is to keep its treacherous turn hidden, evolve slowly, and cooperate with humans for as long as possible.
Yes, have you considered any reasonable strategy to provoke a nuclear war, as the worst-case plan to stop AI progress?