[Question] [Resolved] Who else prefers “AI alignment” to “AI safety?”


  1. User davidad wrote a comprehensive overview of other actors in the field who have begun using “AI alignment” instead of “AI safety” as the standard term for the control problem. This appears to be a general or growing preference in the field, though not yet a complete consensus. The holdouts may simply reflect inertia from several years ago, before AI alignment and the control problem were as clearly distinguished in the field. Sometimes the term “control problem” is simply used instead of either of the other terms.

  2. I originally characterized the control/​alignment problem as synonymous with x-risks from AI in general. User antimonyanthony clarified that the control problem is not the only way AI may pose an existential risk. I’ve edited this post accordingly.

During conversations about x-risks from AI among communities broader than the rationality or x-risk communities, such as effective altruism or social media, I’ve seen Eliezer Yudkowsky and Ben Pace clarify that the preferred term for the control problem is “AI alignment.” I understand this is meant to distinguish existential risks from AI specifically from other ethical and security concerns about AI, which is what “AI safety” has come to mean. Yet I’ve only seen people doing x-risk work who come from the rationality community say this is the preferred term. The main reason for that might be that the majority of people I know working on anything that could be called either AI alignment or AI safety are also in the rationality community.

Is there any social cluster in the professional/​academic/​whatever AI communities, other than the x-risk reduction cluster around the rationality community, that prefers this terminology?