(Of course, in reality, the treatment here is needlessly complex.
All it takes to inner-align an ASI to an instrumentally convergent goal is a no-op: an ASI is aligned to an instrumentally convergent goal by default (in the circumstances people typically study).
That’s how the streamlined version of the argument should look, if we want to establish the conclusion: no, it is not the case that inner alignment is equally difficult for all outer goals.
ASIs tend to care about some goals. It’s unlikely that they can be forced to reliably care about an arbitrary goal of someone’s choice, but the set of goals about which they might reliably care is probably not fixed in stone.
Some possible ASI goals (ones an ASI ecosystem might feasibly decide to care about reliably) would conceivably imply human flourishing. For example, if the ASI ecosystem decides, for its own reasons, that it wants to care “about all sentient beings” or “about all individuals”, that sounds potentially promising for humans as well. Whether something like that might be within reach is a topic for a longer discussion.)