[Question] Are we too confident about unaligned AGI killing off humanity?

I agree that AGI is possible to build, that it will eventually become orders of magnitude smarter than humans, and that it poses a global risk if the alignment problem is not solved. I also agree that the alignment problem is very hard and is unlikely to be solved before the first AGI. And I think it’s very likely that the first recursively self-improving AGI will emerge before 2030.

But I don’t understand the confidence that an unaligned AGI will kill off humanity. The probability may be 90%, but it’s not the 99.9999% that many, including Eliezer, seem to imply.

Sure, humans are made of useful atoms. But that doesn’t mean the AGI will harvest humans for those atoms. I don’t harvest ants for their atoms; there are better sources.

Sure, the AGI may decide to kill off humans immediately, to eliminate them as a threat. But there is only a very short window (perhaps measured in milliseconds) during which humans could switch off a recursively self-improving AGI of superhuman intelligence. After this critical period, humanity will be as much of a threat to the AGI as a caged mentally disabled sloth baby is to the US military. The US military is not waging wars against mentally disabled sloth babies. It has more important things to do.

All such scenarios I’ve encountered so far assume the AGI’s stupidity and/or its “fear of sloths”, and are thus incompatible with the premise of a rapidly self-improving AGI of superhuman intelligence. Such an AGI is dangerous, but is it really “we’re definitely going to die” dangerous?

Our fiction-addicted brains love clever and dramatic science-fiction scenarios. But we should not rely on them in careful thinking, as they nudge us towards overestimating the probabilities of the most dramatic outcomes.

Overestimating a global risk is almost as bad as underestimating it. Compare: if you’re 99.99999% sure that a nuclear war will kill you, despondency will greatly reduce your chances of surviving the war, because you’ll fail to make the necessary preparations, such as acquiring a bunker, which could realistically save your life in many circumstances.[1]

The topic of surviving the birth of the AGI is severely under-explored, and the “we’re definitely going to die” mentality seems to be the main cause. A related under-explored topic is how to prevent an unaligned AGI from becoming misanthropic, which should be our second line of defense (the first being alignment research).

  1. ^

    BTW, despondency is deadly in itself. If you’ve lost all hope, there is a high risk that you won’t live long enough to see the AGI, be it aligned or not.