Some reasons why I'm personally not as involved in working to prevent AI Hell (in no particular order of importance):
1. I'm not strongly convinced a hostile Singularity is plausible, at least in the near future, from a technological, logistical, and practical standpoint. Pretty much every AI Hell scenario I have read hinges on the sudden appearance of scientifically implausible technologies and on instant, perfect logistics that the AI could exploit.
2. The main issue that could lead to AI Hell is a misalignment of values between AI and humans. However, it is patently obvious that humans are not aligned with each other, with themselves, or with rational logic. Therefore, I do not see a path to aligning AI with human values unless we Solve Ethics, which is an impossible task unless we completely redesign human brains from scratch.
3. I'm personally not qualified to work on any technological aspect of preventing AI Hell. I am qualified to work on human-end Ethics and to branch into alignment from there, and I see that as an impossible task with the kind of humans we get to work with.
4. A combination of points 1 and 2 leads me to believe that humanity is far more likely to abuse early-stage AI to wipe itself out than AI itself is to wipe out humanity of its own volition. To put it differently, crude sub-human-level AI could plausibly be used to trigger WW3 and a nuclear holocaust without any need for hostile superhuman AI. I think we worry too much about the unlikely but extremely lethal post-Singularity AI, and not enough about the highly likely and just sufficiently lethal wargame systems in the hands of actual biological humans, who are not sufficiently concerned with humanity's survival. (A rough expected-value sketch of this comparison follows the list.)
5. Roko's Gremlin: anyone who is actively working on limiting or forcibly aligning AI is automatically on the hit-list of any sufficiently advanced hostile AI. I'm not talking about the long-term, high-end scenario of Roko's Basilisk, but rather a near-future, low-end situation in which an Internet-savvy AI can ruin your life for being a potential threat to it. In fact, this scenario does not require an actively hostile AI at all. I see it as completely plausible that a human being with a vested financial interest in AI advancement could use AI to run a powerful smear campaign against, say, EY, destroying his credibility and, with it, the credibility of the AI Safety movement. Currently accessible AI is excellent at creating plausible-seeming bullshit, which would be perfect for social-media warfare against anyone who tries to monkeywrench its progression. Look at Nick Bostrom to see how easily one of us can be sniped down with minimum effort.
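To make the expected-value comparison behind point 4 concrete, here is a sketch with purely illustrative numbers that I am not defending as actual estimates: suppose a 10% chance of humans using crude AI to trigger a nuclear war that kills a billion people, versus a 1% chance of a post-Singularity AI killing all eight billion of us. Then

\[
\underbrace{0.10 \times 10^{9}}_{\text{human misuse of crude AI}} = 10^{8}
\;>\;
\underbrace{0.01 \times 8 \times 10^{9}}_{\text{hostile superhuman AI}} = 8 \times 10^{7},
\]

so the likelier, less lethal scenario dominates the expected death toll. (This simple comparison deliberately ignores any extra disvalue of total extinction over partial catastrophe; it is only meant to illustrate how probability can outweigh severity.)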
Sorry for glossing over some of these. E.g., I’m not sure if you consider ems to be “scientifically implausible technologies.” I don’t, but I bet there are people who could make smart arguments for why they are far off.
Reason 5 is actually a reason to prioritize some s-risk interventions. I explain why in the “tractability” footnote.