Unstated assumptions: ASI will be achieved by a sudden jump, not incremental improvement. Corrigibility won’t work. ASI will be agentive.
I think it’s not that extreme. More like “The various non-agenty AIs won’t be enough to make aligning the agenty ones substantially easier” and “Alignment failures won’t become obvious and scary at stages prior to N before they happen at stage N, where N is the first stage that we have to get right or else.” (Analogy: We got humans to the moon safely on the first try, but this was because we had various tests beforehand to iron out the kinks, including ones that in fact blew up catastrophically. The assumption is that there won’t be good opportunities to test things out beforehand. Though I guess you could say that’s not an assumption, it’s the claim itself.) As for corrigibility… I mean it might work, but the claim is that we shouldn’t expect it to work on the first try.
Fortunately, it doesn’t have to, so long as the agenty ones aren’t the most powerful.
Fortunately, it doesn’t have to. You just need to get it working in AIs that aren’t superintelligent.
There are other models than the discontinuous/fast takeoff model under which alignment of the first advanced AI is critical, e.g. a continuous/slow but homogenous takeoff.