If you believe “there’ll probably be warning shots”, that’s an argument against “someone will get to build It”, but not an argument against “if someone built It, everyone would die” (where “It” specifically means “an AI smart enough to confidently outmaneuver all humanity, built by methods similar to today’s, where they are ‘organically grown’ in hard-to-predict ways”).
It’s a bit of both.
Suppose there are no warning shots. A hypothetical AI that’s a bit weaker than humanity but still awfully impressive doesn’t do anything at all that manifests an intent to harm us. That could mean:
1. The next, somewhat more capable version of this AI will not have any intent to harm us, because through either luck or design we’ve ended up with a non-threatening AI.
2. This version of the AI is biding its time to strike and is sufficiently good at deception that we miss that fact.
3. This AI is fine, but making it a little smarter/more capable will somehow lead to the emergence of malign intent.
I take Yudkowsky and Soares to put all the weight on #2 and #3 (with, based on their scenario, perhaps more of it on #2).
I don’t think that’s right. I think if we have reached the point where an AI really could plausibly start and win a war with us and it doesn’t do anything nasty, there’s a fairly good chance we’re in #1. We may not even really understand how we got into #1, but sometimes things just work out.
I’m not saying this is some kind of great strategy for dealing with the risk. The scenario I’m describing is one where there’s a real chance we all die, and I don’t think you get a strong signal until you get into the range where the AI might win, which is a bad range. But it’s still very different from imagining the AI will inherently wait to strike until it has ironclad advantages.