[Question] What precisely do we mean by AI alignment?

We sometimes phrase AI alignment as the problem of aligning an AI’s behavior or values with what humanity wants, humanity’s values, or humanity’s intent. But this leaves open two questions: what precisely does it mean for an AI to be “aligned,” and what precisely do we mean by “wants,” “values,” or “intent”? So when we say we want to build aligned AI, what precisely do we mean to accomplish, beyond vaguely building an AI that does-what-I-mean-not-what-I-say?