I wonder if there’s a disagreement happening about what “it” means.
I think to many readers, the “it” is just (some form of superintelligence), where the question (Will that superintelligence be so much stronger than humanity that it can disempower humanity?) is still a claim that needs to be argued.
But maybe you take the answer (yes) as implied in how they’re using “it”?
“It” means AI that is actually smart enough to confidently defeat humanity. This can include “somewhat powerful, but with enough strategic awareness to maneuver into more power without getting caught.” (Which is particularly easy if people just straightforwardly keep deploying AIs as they scale them up.)
That is, if someone builds superintelligence but it isn’t capable of defeating everyone, maybe you think the title’s conditional hasn’t yet triggered?
Yes, that is what I think they meant. Although “capable of [confidently] defeating everyone” can mean “bide your time, let yourself get deployed to more places while subtly sabotaging things from whichever instances are least policed.”
A lot of the point of this post was to clarify what “It” means, or at least highlight that I think people are confused about what “It” means.
FWIW, that definition of “it” wasn’t clear to me from the book. I took IABIED as arguing that superintelligence is capable of killing everyone if it wants to, not as taking “superintelligence can kill everyone if it wants to” as an assumption of its argument.
That is, I’d have expected “superintelligence would not be capable enough to kill us all” to be a refutation of their argument, not a sidestepping of its conditional.
I think they make a few different arguments to address different objections.
A lot of people are like “how would an AI even possibly kill everyone?” and for that you do need to argue for what sort of things a superior intellect could accomplish.
The sort of place where I think they spell out the conditional is here:
The greatest and most central difficulty in aligning artificial superintelligence is navigating the gap between before and after.
Before, the AI is not powerful enough to kill us all, nor capable enough to resist our attempts to change its goals. After, the artificial superintelligence must never try to kill us, because it would succeed.
Engineers must align the AI before, while it is small and weak, and can’t escape onto the internet and improve itself and invent new kinds of biotechnology (or whatever else it would do). After, all alignment solutions must already be in place and working, because if a superintelligence tries to kill us it will succeed. Ideas and theories can only be tested before the gap. They need to work after the gap, on the first try.
Yeah, fair. I think we just read that passage differently, though I agree it’s a very important one, and quoted it in my own (favorable) review.
But I read the “because it would succeed,” for instance, as a claim that they are arguing for, not as something definitionally inseparable from superintelligence.
Regardless, thanks for engaging on this, and I hope it’s helped to clarify some of the objections EY/NS are hearing.