Generally, I think this is a more accurate picture of what alignment will look like than a binary 0% or 100% answer to “did we successfully align?”.
More specifically, I find it more useful to think of the outcomes as distributional: some will be worse and some will be better. If we can accept that premise to start, it’s easier to try to shift the distribution further in our favour than to reject any such effort because it isn’t a 100% solution.
Why give the agents “goals to make true in the real world” at all?
https://www.lesswrong.com/posts/5hApNw5f7uG8RXxGS/the-open-agency-model