Not an answer to your question, but I think there are plenty of good counterarguments against doom. A few examples:
My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”
Counterarguments to the basic AI x-risk case
Where I agree and disagree with Eliezer
Evolution provides no evidence for the sharp left turn
Deceptive Alignment is <1% Likely by Default
Some of my disagreements with List of Lethalities
Evolution is a bad analogy for AGI: inner alignment
Contra Yudkowsky on AI Doom
Contra Yudkowsky on Doom from Foom #2
A Contra AI FOOM Reading List
Likelihood of discontinuous progress around the development of AGI
To clarify, I’m thinking mostly about the strength of the strongest counter-argument, not the quantity of counter-arguments.
But yes, what counts as a strong argument is somewhat subjective and a matter of degree. I wrote this post because none of the counter-arguments I know of are strong enough to count as “strong” by my standards.
Personally, my strongest counter-argument is “humanity actually will recognize the x-risk in time to take alignment seriously, delaying the development of ASI if necessary”, but even that isn’t backed up by much evidence (the only previous example I know of is when we avoided nuclear holocaust).
What do you think are the strongest arguments in that list, and why are they weaker than a vague “oh maybe we’ll figure it out”?
Hmm, Where I agree and disagree with Eliezer actually has some pretty decent counter-arguments, at least in the sense of making things less certain.
However, I still think that there’s a problem of “the NN writes a more traditional AGI that is capable of foom and runs it”.
The strongest argument against AI doom I can imagine runs as follows:
An AI might kill all humans for two main reasons: (a) to prevent a threat to itself, and (b) to get humans’ atoms.
But:
(a)
The AI will not kill humans as a threat before it creates powerful human-independent infrastructure (nanotech), because in that case it would run out of electricity and other things it still needs humans for.
The AI will also not kill humans after it creates nanotech, because at that point we can’t destroy the nanotech (even with nukes), so we are no longer a threat.
Thus, the AI will not kill humans to prevent a threat either before or after nanotech, so it will never happen for this reason.
(b)
Human atoms constitute roughly 10^-24 of all atoms in the Solar System (a rough back-of-envelope check is sketched below).
Humans may have small instrumental value for trade with aliens, for some kinds of work, or as sources of training data.
Even a small instrumental value of humans will be larger than the value of their atoms, because the value of the atoms is vanishingly small.
So humans will not be killed for their atoms.
Thus humans will not be killed either as a threat or for their atoms.
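Here is a minimal back-of-envelope sketch of that fraction, using assumed round numbers (about 8 billion humans at ~70 kg each, Solar System mass dominated by the Sun at ~2×10^30 kg, and rough average atomic masses). None of these inputs come from the comment above, and the exact exponent depends on them, but any reasonable choice gives a vanishingly small fraction, which is all the argument needs:

```python
# Rough estimate: what fraction of the Solar System's atoms sit in human bodies?
# All inputs below are assumptions for illustration, not figures from the comment.
AVOGADRO = 6.022e23  # atoms per mole

# Humans: ~8e9 people at ~70 kg each; bodies are mostly H, O and C by atom count,
# so assume an average atomic mass of ~7 g/mol.
human_mass_g = 8e9 * 70 * 1000
human_atoms = human_mass_g / 7 * AVOGADRO

# Solar System: mass dominated by the Sun (~2e30 kg), mostly H and He,
# so assume an average atomic mass of ~1.3 g/mol.
solar_mass_g = 2e30 * 1000
solar_atoms = solar_mass_g / 1.3 * AVOGADRO

fraction = human_atoms / solar_atoms
print(f"Fraction of Solar System atoms in human bodies: {fraction:.1e}")
# With these inputs the fraction comes out around 5e-20; other reasonable
# assumptions shift the exponent, but it stays negligibly small either way.
```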
But there are other ways an AI catastrophe could kill everybody: a wrongly aligned AI performs wireheading, a Singleton halts, or there is a war between several AIs. None of these risks is a necessary outcome, but together they carry a high probability mass.
Other reasons to kill humans:
they are not a big threat, but they are annoying (it costs resources to fix the damage they do)
a side effect of e.g. changing the atmosphere
Also, the AI may destroy human civilization without exterminating all humans, e.g. by taking away most of our resources. If civilization collapses because the cities and factories are taken over by robots, most humans will starve to death, but maybe 100,000 will survive in various forests as hunter-gatherers, with no chance of developing civilization again in the future… that’s also quite bad.
It all collapses to point (b): “atoms utility” vs. “human instrumental utility.” Preventing starvation or pollution effects for a large group of humans is relatively cheap: just put them all on a large space station, maybe 1 km long.
But disempowerment of humanity, and maybe even the destruction of Earth, are far more likely. Even if we get a small galactic empire of 1000 stars but live there as pets, devoid of any power over the Universe’s future, that is not a very good outcome.
These don’t seem like very relevant counterarguments; I think literally all of them are from people who believe that AGI is an extinction-level threat our civilization will soon face.
Perhaps you mean “>50% chance of extinction-level bad outcomes”, but I think the relevant alternative viewpoint that would calm someone is not that the probability is only 20% or something, but rather “this is not an extinction-level threat and we don’t need to be worried about it”, for which I have seen no good argument (that engages seriously with any misalignment concerns).
Well, I was asking because I found Yudkowsky’s model of AI doom far more complete than any other model of the long-term consequences of AI. So the point of my original question is “how frequently is a model that is far more complete than its competitors wrong?”.
But yeah, even something as low as a 1% chance of doom demands a very large amount of attention from the human race (similar to the amount of attention we assigned to the possibility of nuclear war).
(That said, I do think the specific value of p(doom) is very important when deciding which actions to take, because it affects the strategic considerations in the “play to your outs” post.)