For the purpose of my argument, there is no essential distinction between ‘the first AGI’ and ‘the first ASI’.
For the purpose of my response there is no essential distinction there either, except perhaps that the book might be implicitly relying on the claim that building an ASI certainly constitutes a “sufficiently critical try” (if something weaker isn’t one already). That claim makes the argument more confusing if left implicit, and poorly structured if used within the argument at all rather than outside of it.
The argument is still not that there is only one chance to align an ASI (this is a conclusion, not the argument for that conclusion). The argument is that there is only one chance to align the thing that constitutes a “sufficiently critical try”. A “sufficiently critical try” is conceptually distinct from “ASI”. The premise of the argument isn’t about a level of capability alone, but rather about lack of control over that level of capability.
One counterargument is to reject the premise and claim that even ASI won’t constitute a “sufficiently critical try” in this sense, that is, even an ASI won’t successfully take control of the world if misaligned. Probably because by the time it’s built there are enough checks and balances that it can’t (at least individually) take over the world if misaligned. And indeed this seems to be in line with the counterargument you are making. You don’t expect there will be a lack of control, even as we reach ever higher levels of capability.
Nonetheless, there is no special point at which we develop ‘the superhuman’. There is no singular ‘it’ to build, which then proceeds to take over the world in one swift action.
Thus there is no “sufficiently critical try” here. But if there were, it would be a problem, because then we would have to get it right the first time. Since in your view there won’t be a “sufficiently critical try” at all, you reject the premise, which is fair enough.
Another counterargument would be to say that if we ever reach a “sufficiently critical try” (an uncontainable lack of control over that level of capability if misaligned), by that time getting it right the first time won’t be as preposterous as it is for current humanity. Probably because with earlier AIs there will be a lot more effective cognitive labor and stronger institutions around to make it work.