I think they make a few different arguments to address different objections.
A lot of people are like “how would an AI even possibly kill everyone?” and for that you do need to argue for what sort of things a superior intellect could accomplish.
The sort of place where I think they spell out the conditional is here:
The greatest and most central difficulty in aligning artificial superintelligence is navigating the gap between before and after.
Before, the AI is not powerful enough to kill us all, nor capable enough to resist our attempts to change its goals. After, the artificial superintelligence must never try to kill us, because it would succeed.
Engineers must align the AI before, while it is small and weak, and can’t escape onto the internet and improve itself and invent new kinds of biotechnology (or whatever else it would do). After, all alignment solutions must already be in place and working, because if a superintelligence tries to kill us it will succeed. Ideas and theories can only be tested before the gap. They need to work after the gap, on the first try.
Yeah, fair, I think we just read that passage differently. I agree it’s a very important one, though, and quoted it in my own (favorable) review.
But I read the “because it would succeed” line, e.g., as a claim that they are arguing for, not something definitionally inseparable from superintelligence.
Regardless, thanks for engaging on this, and I hope it’s helped to clarify some of the objections EY/NS are hearing.