Or “Alignment, a cliff we either clear or plummet from.”
But I’m not sure I’m convinced by this framing. My main objection is that it’s not clear researchers’ understanding will increase fast enough to avoid gradual disempowerment or irreversibly messed-up futures. I think that’s because I’m more skeptical of deceptive alignment than you are.