It would be evidence against AGI being an existential risk, but not strong evidence. The strength would depend upon lots of other factors of the scenario, such as:
(1) How long have we coexisted with AGI?
(2) Has AGI improved to superintelligence?
(3) Have we proven that the AGIs can’t feasibly improve to superintelligence?
(4) Were there any near misses before coexistence?
(5) Were the AGIs all developed via planned alignment techniques?
A long timescale for (1) implies that coexistence is less likely to be just a temporary state of affairs between AGI development and some AGI wrecking us.
If (2) doesn’t hold, then we still don’t know whether a superintelligent AGI will wreck us as soon as one develops. In conjunction with a long coexistence time, it might show that spontaneous progression to ASI is less likely than we thought, which would be a point against x-risk. If there is a superintelligence that we are coexisting with, substantial risk may remain if it is not legible to us. It may be legible to us by design, or perhaps via a (post-)human intelligence explosion.
If (3) held, that would be good evidence against one major branch of x-risk. It might hold due to diminishing returns on computational power, other physical limits, pivotal acts, or aligned design. The first two would be much stronger evidence against general x-risk, but even the latter two would be evidence that limiting the risk is plausible.
In (4), zero near misses would be only weak evidence against x-risk, since we may have just got lucky. One or two near misses would be evidence in favour of x-risk, and many near misses could be evidence against x-risk (repeated survival suggesting the danger is smaller than it appears), or it might be anthropic selection.
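To make the near-miss point concrete, here is a minimal Python sketch; the two hypotheses, the prior, and the per-event lethality numbers are invented purely for illustration. Naively, surviving many near misses is strong evidence that they were not very lethal, but under a strong anthropic-selection assumption (only survivors get to make the observation at all), that evidence can be screened off entirely.

```python
# Toy illustration: how "many near misses, all survived" updates a belief about
# how lethal those near misses really were, with and without anthropic selection.
# Hypotheses, prior, and per-event kill probabilities are invented for illustration.

def posterior_dangerous(n_near_misses, prior_dangerous=0.5,
                        p_kill_safe=0.1, p_kill_dangerous=0.9,
                        anthropic_selection=False):
    """Posterior P(dangerous) after observing n near misses, none of them fatal."""
    if anthropic_selection:
        # Strong anthropic assumption: any history we can observe at all is one
        # in which every near miss was survived, so the observation carries no
        # information and the prior is returned unchanged.
        return prior_dangerous
    like_safe = (1 - p_kill_safe) ** n_near_misses            # P(all survived | safe)
    like_dangerous = (1 - p_kill_dangerous) ** n_near_misses  # P(all survived | dangerous)
    joint_dangerous = like_dangerous * prior_dangerous
    return joint_dangerous / (joint_dangerous + like_safe * (1 - prior_dangerous))

print(posterior_dangerous(5))                            # ~0.000017: naive update, strongly "safe"
print(posterior_dangerous(5, anthropic_selection=True))  # 0.5: the update is screened off
```

Whether observer selection really cancels the update completely is itself contested, which is part of why (4) is only weak evidence either way.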
If the AGIs were all developed via planned alignment in (5), then it does at least say that alignment seems to be possible, which reduces one branch of x-risk. It doesn’t say much about what happens if AGI is developed without those techniques, and risks from misaligned AGI might still be in such a world’s future.
These are just a few “think about it for 5 minutes” complications of evaluating evidence for and against existential risk from artificial general intelligence. A lot of it depends upon what we learn from our experience with AGI in this hypothetical future, and we can’t deduce very much of it in advance.
I was wondering how you would interpret a future in which we have machines that can do everything we can, and yet we are still alive.
Would you care to tell us just a fraction more about this future? Are we at their mercy? Are they at our mercy? Do we coexist? Do we live apart?
The LW argument that AI will kill us all is that it will shoot so far past human intelligence that it will completely dominate the world, and that when it does so, it will be governed by goals having no relationship to human well-being.
Is there a particular reason why this doesn’t come to pass in your scenario?
Quick heads-up from a moderator. I feel this post and some of your other comments don’t quite meet the standards for AI discussion that we’re now targeting. With the large influx of users interested in AI, we’re starting to put more mod effort into ensuring standards are high, that is, that content is well-reasoned and useful.
Unfortunately, it’s hard to neatly specify “the standard”, so some quick recommendations:
Check out Stampy, a wiki project and community about AI Safety/Alignment
Comment on the most recent thread for basic questions about AI safety
Read The Sequences as a general boost to rational engagement on tricky topics
For now, I’ve limited you to one post and comment per day. If the mods observe that your contributions are substantial, we will remove the limit.
Thanks and good luck!
(forgive the public moderation strategy – we want to be transparent and accountable to all users and are moving away from the DMs where we had been sending these messages)
Would a future in which I have crossed a busy road without being knocked down be evidence against the traffic being a threat to my health?
Eliezer’s beliefs IIRC have us at a <5% chance of survival (could be wrong here, just going off memory). I have a lot of uncertainty over both timelines and doom myself, but my guess is a 30-60% chance of something catastrophic in the next 10 years (and that range is just the centre two quartiles). Beyond that it becomes even more uncertain to me, because I am optimistic about a lot of alignment work bar the time constraint.
For what it’s worth, Eliezer’s beliefs tend to be on the more extreme end of pessimistic among alignment researchers. On the other end you have people like Paul.
I wouldn’t use the word extremely, but I do think we’d have to be pretty lucky to make it. (I’m not sure if you’re envisioning a specific future when you say living side by side with AGIs though).
I’m not super clear on what you’re asking, but some general thoughts I had on reading your question: There’s almost always a future in which some scenario comes to pass. There’s probably some Everett branch of the future where the sun doesn’t rise anymore. So in isolation, being able to see a specifically sampled future where some scenario plays out shouldn’t be evidence one way or the other for anything that we don’t consider impossible (if there were a future where the fundamental laws of physics were definitely being violated—like if global entropy kept decreasing—that would be a different story).
A different context is where we see a future randomly sampled across all possible futures. If that future then shows a world where we’re alive and our CEV is achieved, that would probably be good evidence for updating downward on AGI risk if you have a high P(doom). Someone with a P(doom) of 10% would already expect to see a future like that with 90% certainty (barring other outcomes, but I’m abstracting them away for simplicity, and it shouldn’t affect the central point), so that being the randomly sampled future should only move their estimate by a little (someone else can probably give the exact numbers here), and it’d still be really worth it to work to prevent a 10% chance of doom. If you had a P(doom) of >50%, on the other hand, then this would be stronger evidence to update on, with the size of the update given by the same Bayesian calculation as in the 10% case.
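To put rough numbers on that, here is a minimal Bayes’ rule sketch. The two likelihoods are invented, since the thought experiment doesn’t pin down how likely a sampled future is to look good conditional on doom; only the direction and the rough prior-dependence of the update are the point.

```python
# Minimal sketch of the update described above. The two likelihoods are invented
# purely to make the arithmetic concrete.

def update_p_doom(prior_doom, p_good_given_no_doom=0.9, p_good_given_doom=0.2):
    """Posterior P(doom) after observing one randomly sampled future that looks good."""
    p_good = (p_good_given_doom * prior_doom
              + p_good_given_no_doom * (1 - prior_doom))
    return p_good_given_doom * prior_doom / p_good  # Bayes' rule

for prior in (0.10, 0.50, 0.60):
    print(f"prior P(doom) = {prior:.0%} -> posterior = {update_p_doom(prior):.1%}")
# prior P(doom) = 10% -> posterior = 2.4%   (a shift of a few percentage points)
# prior P(doom) = 50% -> posterior = 18.2%
# prior P(doom) = 60% -> posterior = 25.0%  (a much larger absolute shift)
```

On these made-up numbers, the person starting at 10% moves by only a few percentage points, while the person starting above 50% moves by tens of points, which is the asymmetry gestured at above.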
If, on the other hand, you were talking about a deterministic future (ignoring quantum considerations, just stuff happening at a macro level), and we could know with some certainty that that future was good—then I’d still ask whether that’s conditional on our working currently to prevent other futures, or whether it was the default outcome. If the former, that means there was probably still a strong case for why AGI is dangerous, but we were up to the task of solving it. Concerns about nuclear weapons don’t go away just because safety protocols are sufficient—sufficient protocols probably decrease our practical worries, but the intrinsic concern about their default destructiveness would, and probably should, remain. If it’s the latter, on the other hand, then yeah, I agree that we should update down hard on AGI risk. But speculating that way doesn’t seem more useful than as a different framing of priors.