If the default ASI we build will know what we mean and not do things like [turn us into paperclips], which is obviously not what we intended, then AI alignment would be no issue. AI misalignment can only exist if the AI is misaligned with our intent. If it is perfectly aligned with the user's intended meaning but the user is evil, that's a separate issue with completely different solutions.
If instead, you mean something more technical, like, it will “know” what we meant but not care, and “do” what we didn’t mean, or that “literal” refers to language parsing, while ASI misalignment will be due to conceptual misspecification—then I agree with you but don’t think that trying to make that distinction will be helpful to a non-technical reader.
It's not a fact that the default ASI will be aligned, or even agentive, etc., etc. It's no good to appeal to one part of the doctrine to support another, since it's pretty much all unproven.
I mean, it's a bad, misleading argument. The conclusion of a bad argument can still be true... but I don't think there are any good arguments for high p(doom).