I mean, I also believe that if we solve the alignment problem, then we will no longer have an alignment problem, and I predict the same is true of Nate and Eliezer.
By “superintelligence” I mean “systems which are qualitatively much smarter than top human experts”. (If Anyone Builds It, Everyone Dies seems to define ASI in a way that could include weaker levels of capability, but I’m trying to refer to what I see as the typical usage of the term.)
Sometimes, people say that “aligning superintelligence is hard because it will be much smarter than us”. I agree: this does seem to make aligning superintelligence much harder, for multiple reasons.
Correspondingly, I’m noting that if we can align earlier systems which are just capable enough to obsolete human labor (which IMO seems way easier than directly aligning wildly superhuman systems), these systems might be able to continually align their successors. I wouldn’t consider this “solving the alignment problem”, because we would instead just have aligned a particular non-ASI system in a non-scalable way, in the same way that I don’t consider “Claude 4 Opus is aligned enough to be pretty helpful and not plot takeover” to be a solution to the alignment problem.
Perhaps your view is “obviously it’s totally sufficient to align systems which are just capable enough to obsolete current human safety labor, so that’s what I meant by ‘the alignment problem’”. I don’t personally think this is obvious given race dynamics and limited time (though I do think it’s likely to suffice in practice). Minimally, people often seem to talk about aligning ASI, which I interpret to mean wildly superhuman AIs rather than human-ish-level AIs.