I think compassion might be a much better way (or at least part of one) to frame AI goals than alignment. Alignment is a largely content-free concept: it doesn't answer whether AI should be aligned with me, you, Putin, or anyone else. Since we do care which goals AI is aligned with, focusing on alignment ends up hiding the ball with respect to those other goals. It's like the debate about media objectivity. A reporter always has priors, so we often get the appearance of objectivity rather than the reality. That's worse than nothing, because now a media source has an incentive to appear objective, which is impossible, and that leads to deception.
I wonder what other values we want AI to hold, and how we build them to actually hold those values, rather than just appear to.