Lisa Feldman Barrett versus Paul Ekman on facial expressions & basic emotions

1. Summary / Table of Contents

This post is mostly a book review of How Emotions Are Made: The Secret Life of the Brain by Lisa Feldman Barrett.

Barrett is very interested in arguing against a particular view that she attributes to Paul Ekman, so I also read some of Ekman’s work, including his book Emotions Revealed, and his paper “An argument for basic emotions”.

My assessment is that Barrett is correct that the view she attributes to Ekman is wrong, and Ekman is equally correct that the view he attributes to his intellectual opponents is wrong. But they are directly disagreeing with each other much less than they seem to think they are, and I’ll try to paint a single coherent picture that captures the best parts of both perspectives at once.

(Note for my regular readers: This post has no mention of AI safety or alignment, but I consider it vaguely related to my long-term research project described at this link.)

Table of Contents (with section summaries)

  • Section 2 presents three positions:

    • The “anti-Barrett” position is Barrett’s punching-bag that she has worked tirelessly to refute. She calls it “the classical view of emotions”. It claims (to oversimplify / caricature a bit) that everyday emotion concepts like “anger” have a perfect 1-to-1 correspondence with discrete innate behavioral programs.

    • The “anti-Ekman” position is Ekman’s punching-bag that he has worked tirelessly to refute. It claims that facial expressions are 100% social conventions, with no cross-cultural correlation whatsoever.

    • My own position is kind of a compromise. I think we have a bunch of “innate behaviors” (like vomiting, laughing, disgust-reactions, and Duchenne-smiles), associated with cell groups in the hypothalamus and brainstem, and I think these are what Ekman is trying to talk about. And, I think we learn emotion concepts within our lifetimes, just like we learn every other kind of concept, and they wind up stored in our cortex, and these emotion concepts are what Barrett is talking about.

  • Section 3 argues that cross-cultural studies of facial expressions disprove both the anti-Barrett and the anti-Ekman positions, thus nicely explaining why both sides keep declaring victory.

  • Sections 4 & 5 muse on whether Barrett agrees or disagrees with the extreme “anti-Ekman” position, and then whether Ekman agrees or disagrees with the extreme “anti-Barrett” position. In other words, should we think of these positions as strawmen? I wound up thinking that Ekman is mostly not making the mistakes that Barrett attributes to him, but that his writing could stand to be clearer on that. I’m more uncertain about Barrett.

  • Section 6 is a brief conclusion.

2. Three positions

2.1 The “anti-Barrett” position (that Barrett is arguing against)

Barrett seems to think that almost everyone, both experts and ordinary people (at least in the English-speaking world), believes something like the following position. She calls it “The Classical View of Emotion” and attributes it most especially to Paul Ekman.

(UPDATE: Barrett comments that I am caricaturing—i.e., that the box below is a stronger statement than what she means by “The Classical View of Emotion”. In other words, yes, Barrett disagrees with the claim in the box below, but she would also disagree with certain weaker claims. Sorry. It doesn’t affect anything else in the post though.)

The “anti-Barrett” position:

  • There is a set of discrete, human-universal “basic emotions”—probably the following six: anger, fear, disgust, surprise, sadness, and happiness, although sometimes researchers suggest different lists.

  • Each basic emotion has its own unique “fingerprint” in terms of facial expression, body language, etc.

  • If, in everyday speech, you say “Joe is angry”, then Joe definitely has the “basic emotion” of “anger” right now, and the corresponding facial expression etc.

  • If someone is displaying the facial expression of a basic emotion, everyone else who sees their face will be able to reliably identify that emotion from seeing the face.

  • Given that these “fingerprints” are obvious-to-everyone, discrete, and highly relevant to everyday life, we can safely assume that pretty much every culture has six words / concepts corresponding to the six “basic emotions”.

Whoops, sorry about the erroneous text labels on top. The text is supposed to read as follows (left to right): “delirious”, “constipated”, “constipated”, “constipated”, “constipated”, and “constipated”. (Image source)

2.2 The “anti-Ekman” position (that Ekman is arguing against)

If you instead read Paul Ekman, he is mainly interested in arguing against (something like) the following, which he attributes to various people including Margaret Mead, Gregory Bateson, Edward Hall, Ray Birdwhistell, and Charles Osgood. I’ll call it the “anti-Ekman” position.

The “anti-Ekman” position:

  • Facial expressions (and body language etc.) are completely arbitrary, determined by social convention, and different in different cultures. This claim is closely analogous to how the speech sound associated with a particular meaning is by-and-large an arbitrary social convention (leaving aside a few fun exceptions like onomatopoeia and common words being shorter).

  • Example: It is a totally a-priori-arbitrary social convention that, in our culture, the word “million” means 10⁶ and “billion” means 10⁹. In a perfectly remote and linguistically-isolated culture, it could equally well be the other way around. By the same token, it is a totally a-priori-arbitrary social convention that, in our culture, people are statistically more likely to pout after losing a child to disease, and to scowl when about to punch someone. In a perfectly remote and isolated culture, it could equally well be the other way around.

According to the “anti-Ekman” position, it is perfectly plausible that, in some remote or historical culture, this particular facial expression is widely understood to mean “I am just right on the verge of falling asleep, goodnight everyone”.

2.3 What I believe

I think both Ekman and Barrett are correct in criticizing their respective foils, and I endorse the following, which I claim captures all the good insights from both sides:

My own position:

Part 1 [a.k.a. “the Ekman part”]—“Innate behaviors” in the Steering Subsystem (hypothalamus & brainstem):

  • We have a repertoire of probably hundreds of human-universal “innate behaviors”. Examples of innate behaviors presumably include things like vomiting, “disgust reaction(s)”, “laughing”, “Duchenne smile”, and so on.

  • Different innate behaviors are associated with specific, human-universal, genetically-specified groups of neurons in the hypothalamus and brainstem.

  • These innate behaviors can involve the execution of specific and human-universal facial expressions, body postures, and/​or other physiological changes like cortisol release.

  • These innate behaviors have human-universal triggers (and suppressors). But it’s hard to describe exactly what those triggers are, because the triggers are internal signals like “Signal XYZ going into the hypothalamus and brainstem”, not external situations like “getting caught breaking the rules”.

    • There is some statistical relationship between external situations and internal signals. They’re not totally uncorrelated! But they’re not 100% perfectly correlated either! It’s entirely possible for two people in the same situation to react and feel quite differently.

  • In general, innate behaviors are not just “on” or “off”, but rather can be more or less active at any given time. Also, in general, multiple innate behaviors can be active simultaneously—for example, a disgust reaction can be active at the same time as laughing. But there seem to be some mutual inhibition and excitation dynamics that make some combinations more common than others.

Part 2 [a.k.a. “the Barrett part”]—“Emotion concepts” in the Learning Subsystem (cortex, thalamus, etc.):

  • Separately, we also have a bunch of concepts relating to emotion—things like “guilt” and “schadenfreude”, as those words are used in everyday life. These concepts are encoded as patterns of connections in the cortex (including hippocampus) and thalamus, and they form part of our conscious awareness.

  • Emotion concepts typically relate to both innate behaviors and the situation in which those behaviors are occurring. For example, Ekman says “surprise” versus “fear” typically involve awfully similar facial expressions, which raises the possibility that the very same innate behaviors tend to activate in both “surprise” and “fear”, and that we instead distinguish “surprise” from “fear” based purely on context.

  • Conversely, multiple different innate behaviors can get lumped together under the umbrella of a single emotion concept. For example, for all I know, maybe a mouth-closed disgust reaction is one innate behavior, and a mouth-open disgust reaction is a different innate behavior. But if so, the concept “disgust” would involve either of them, and most people would probably be unaware that this division even exists. As a more obvious example, the “anger” concept lumps together instances of cold seething anger, and of white-hot anger, and of righteous indignation, etc.—and none of those instances involve the exact same combination of innate behaviors. Indeed, for all I know, the innate behaviors involved in those three instances might have no overlap whatsoever.

  • For both the above reasons, the relation between emotion concepts and concurrent innate behaviors is definitely not 1-to-1! But it’s definitely not “perfectly uncorrelated” either!

  • Emotion concepts, like all other concepts, are at least somewhat culturally-dependent, and are learned within a lifetime, and play a central role in how we understand and remember what’s going on.
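To make the "not 1-to-1" claim concrete, here is a minimal sketch of the two-layer picture as a many-to-many mapping. Everything in it is illustrative: the behavior labels are placeholders built from the "for all I know" examples above (and "wide-eyed startle" is a made-up name for whatever innate behavior "surprise" and "fear" might share), not an established taxonomy.

```python
# Hypothetical illustration: the behavior names are placeholders based on
# the speculative examples in the text, not an established taxonomy.
concept_to_behaviors = {
    "disgust": {"mouth-closed disgust reaction", "mouth-open disgust reaction"},
    "surprise": {"wide-eyed startle"},  # assumed shared behavior
    "fear": {"wide-eyed startle"},      # assumed shared behavior
}


def behaviors_to_concepts(mapping):
    """Invert the mapping: which concepts can each innate behavior fall under?"""
    inverse = {}
    for concept, behaviors in mapping.items():
        for behavior in behaviors:
            inverse.setdefault(behavior, set()).add(concept)
    return inverse


inverse = behaviors_to_concepts(concept_to_behaviors)

# One concept lumping together several behaviors:
lumped = {c for c, b in concept_to_behaviors.items() if len(b) > 1}
# One behavior falling under several concepts (told apart only by context):
shared = {b for b, c in inverse.items() if len(c) > 1}

print("Concepts that lump behaviors:", lumped)
print("Behaviors shared across concepts:", shared)
```

Under these assumptions the mapping fails to be 1-to-1 in both directions, yet it is far from arbitrary: knowing the active behaviors still narrows down which concepts could apply.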

3. Facial expression studies disprove both the “anti-Ekman” and “anti-Barrett” positions

Suppose we travel to a remote isolated culture, in the interest of settling the Barrett versus Ekman dispute. Here are some observations:

  • To falsify the anti-Ekman position, you need to show a correlation above zero between situations and facial expressions. To falsify the anti-Barrett position, you need to show a correlation below perfect between situations and facial expressions. Of course, the actual correlations are above zero and below perfect, falsifying both, and allowing both sides of the dispute to declare victory.

  • Suppose we start the test by saying: “There’s a thing I want to talk about, a thing that we in the USA call ‘anger’. Here are 15 different stories of people feeling anger… OK, now look at these two photos and tell me which one is ‘anger’.” Suppose that we find that the subjects do better than chance on picking the photo.

    • If your general goal is to disprove the anti-Ekman position, then this test is perfect. As long as the stories do not specifically mention facial expressions, then any better-than-chance performance on this task would be sufficient to prove that facial expressions are not totally arbitrary social conventions.

    • If your general goal is to prove the anti-Barrett position, then this test is terribly designed. After all, the experimenter is teaching the English concept of “anger” to the participant, rather than testing whether the participant already had a concept that closely corresponds to our “anger”.

  • Suppose instead that our test is to give the subject a stack of face photos and ask them to sort them into piles that go together, without any further instructions. And suppose they do not split the pictures into six piles corresponding to the six basic emotions.

    • If your general goal is to disprove the anti-Barrett position, then this test is perfect.

    • If your general goal is to prove the anti-Ekman position, then this test is terribly designed. After all, even if the anti-Ekman position is wrong (i.e. even if facial expressions have a not-completely-random universal statistical correspondence with what’s going on), then that doesn’t necessarily imply that the study participants consider facial expressions to be the natural salient way to divide up a stack of photos, or that the study participants will categorize facial expressions in a similar way as English speakers do, or even that the study participants are paying any attention whatsoever to facial expressions in the first place.

Accordingly, both camps have visited lots of remote isolated cultures, and reported back that their results prove them right. For example, Ekman has studied the Fore using methods like the second bullet point, and Barrett’s group has studied the Himba using methods like the third bullet point.
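The "more than zero, less than perfect" point can be illustrated with a toy calculation. The sketch below treats "situation" and "facial expression" as binary variables and computes their correlation (the phi coefficient) directly from assumed probabilities. The specific numbers are made up for illustration and are not taken from any study.

```python
import math


def phi(p_situation, p_expr_given_situation, p_expr_given_other):
    """Phi coefficient between a binary situation and a binary expression.

    All three arguments are assumed probabilities, chosen for illustration.
    """
    p11 = p_situation * p_expr_given_situation        # situation, expression
    p10 = p_situation * (1 - p_expr_given_situation)  # situation, no expression
    p01 = (1 - p_situation) * p_expr_given_other      # other, expression
    p00 = (1 - p_situation) * (1 - p_expr_given_other)
    p_expr = p11 + p01
    denom = math.sqrt(p_situation * (1 - p_situation) * p_expr * (1 - p_expr))
    return (p11 * p00 - p10 * p01) / denom


# Anti-Ekman world: the expression is equally likely in any situation.
anti_ekman = phi(0.5, 0.3, 0.3)
# Anti-Barrett world: the expression appears if and only if the situation holds.
anti_barrett = phi(0.5, 1.0, 0.0)
# An in-between world, with made-up probabilities.
in_between = phi(0.5, 0.7, 0.2)

print(anti_ekman, anti_barrett, in_between)
```

Any measured correlation strictly between the two extremes is enough to rule out both caricatures, which is why each camp can read the same kind of data as a win.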

4. Does Barrett agree or disagree with the extreme “anti-Ekman” position?

Short answer: I’m not sure.

I will say that, if she disagrees with that position, then she sure seems remarkably uninterested in discussing the fact that she disagrees with it!

In my opinion, evidence against the anti-Ekman position is overwhelming. As above, Ekman’s preferred study methodologies are adequate to disprove the anti-Ekman position, and have done so dozens of times. And indeed, Barrett’s own study methodologies have disproved the anti-Ekman position over and over as well!

Or just consider the obvious fact that blind babies sometimes smile, without being coached to do so, and they do so in situations that fit our common sense about smiling.

Barrett comes close to conceding the case of smiling, but then backs off:

Regardless of the experimental method used, people in numerous cultures agree that smiling faces and laughing voices express happiness. So “Happy” might be the closest thing we have to a universal emotion category with a universal expression. Or it might not. For one thing, “Happiness” is usually the only pleasant emotion category that is tested using the basic emotion method, so it’s trivial for subjects to distinguish it from the negative categories. And consider this fun fact: the historical record implies that ancient Greeks and Romans did not smile spontaneously when they were happy. The word “smile” doesn’t even exist in Latin or Ancient Greek. …Perhaps sometime in the last few hundred years, smiling became a universal, stereotyped gesture symbolizing happiness. Or…perhaps smiling in happiness is simply not universal.

(Her remarkable claim about Greeks and Romans cites this book by classics scholar Mary Beard. You can find a rebuttal to Mary Beard’s claim by Roland Mayer here. I am extremely unfit to adjudicate which of these two esteemed classics scholars is correct—Mayer’s argument seems to be a devastating critique, but I can’t find any re-response by Beard, and I don’t want to be unduly biased by the fact that Mayer had the last word here. My money is strongly on Mayer, but that’s just based on priors.)

4.1 Barrett seems to endorse the existence of human-universal innate behaviors

Interestingly, Barrett believes that there are at least some human-universal innate behaviors—in this interview (1:55:10) she mentions vomiting, freezing, running, and fainting, for example.

I would ask Barrett: If vomiting can be a human-universal innate behavior, why not laughing? And if laughing can be a human-universal innate behavior, why not the Duchenne smile microexpression? I imagine her rejecting the latter, and maybe also the former, but I’m not sure on what grounds.

4.2 Barrett might agree with universality-through-functionality?

Another intriguing possibility for what Barrett might be thinking comes from this throwaway comment:

…If you’re uncertain whether a person directly in front of you could harm you, you might narrow your eyes to see the person’s face better. If danger is potentially lurking around the next corner, your eyes might widen to improve your peripheral vision…

We might extrapolate this kind of thinking into a functional theory of facial expressions. Such a theory might say:

OK fine, the anti-Ekman position is false, but that’s just because we all have structurally-similar faces, and certain ways of contorting one’s face tend to be useful for corresponding purposes (that are not arbitrary social conventions). For example, maybe the thing we might call a “disgust facial expression” is just objectively the best way to eject stuff from the mouth and nose that shouldn’t be there—a useful activity for any human. So it’s no surprise that we find that kind of expression recurring across cultures!

Again, I’m not sure what Barrett really thinks. Does she subscribe to this theory at all? If so, does she think that the matching-of-facial-expressions-to-useful-functions occurs via within-lifetime learning, or via evolution, or some combination of both? Does she think this theory applies to all facial expressions, or just some? And which ones? I don’t know.

(My own opinion is: This kind of “functional” theory is probably true for many “innate behaviors” in the Steering Subsystem (hypothalamus & brainstem), for which we can thank evolutionary learning; and it’s also probably true for many learned behaviors stored in the Learning Subsystem (cortex, striatum, etc.), for which we can thank within-lifetime learning. But it’s also likely that some behaviors in both categories are at least partly a-priori-arbitrary communicative signals.)

4.3 And what’s going on anyway with the fact that Barrett is relentlessly focused on emotion concepts, to the exclusion of everything else?

To explain what I’m suggesting by this section header, I propose to draw an analogy with something that’s easier to think about: anatomy. Consider:

  • The large-scale structural anatomy of humans is universal across cultures.

  • The concepts used to describe this anatomy are not (perfectly) universal across cultures.

For example, this paper reports the major anatomical terms used by an Amazonian forest culture. Alongside familiar terms like “ëinja” = “hand”, you also find things like “eeja” = “middle thoracic to upper lumbar back”, and “ēinjatepu” = “that bump on the side of the palm of your hand caused by your thumb muscles”. (OK, fine, we English-speakers do actually have a term for that bump. It’s just very obscure—“thenar eminence”.)

Congratulations: You have just read a paragraph on the topic of “anatomical concepts”. Want more? Hoo boy, I can talk about anatomical concepts all day. I can talk about how people learn / acquire anatomical concepts, and I can talk about how your vocabulary of anatomical concepts affects your anatomy-related perceptions and memories, and I can talk about whether anatomical concepts are “real” in some philosophical sense, and I can talk about the math of latent variables in ML models and of clusters in high-dimensional spaces, etc.

Now, in the case of anatomy, it’s perfectly obvious to everyone that

  1. Anatomy is an interesting thing that we can talk about;

  2. Anatomical concepts are also an interesting thing that we can talk about;

  3. These two things are not the same. And if whenever I’m trying to talk about 1, you change the subject to 2, then that’s annoying and please stop.

(For example, if an anatomy textbook had a chapter on hand bones, but on every page it kept bringing up the fact that anatomical concepts are learned, and they’re different in different cultures, and did you know that one concept can refer to more than one bone, and in what sense are these concepts “real”? etc. etc., then that would be a really annoying and ineffective way for the textbook to teach me about hand bones!!)

By the same token, I would say:

  1. Facial expressions, body movements, physiological arousal, positive and negative valence, etc., are interesting things that we can talk about;

  2. Emotion concepts are also an interesting thing that we can talk about;

  3. These two things are not the same. And if whenever I’m trying to talk about 1, you change the subject to 2, then that’s annoying and please stop.

Barrett does not seem to have that attitude. Instead, I find that she relentlessly steers every conversation towards emotion concepts.

Why?

The mundane possibility is that emotion concepts are her hobby horse, and the subject of her book, and the main topic that she considers to be under dispute. So of course she’s inclined to change the subject to that.

A more interesting possibility is that maybe she would reject that framing above, and say instead that centering the discussion around emotion concepts is the only way to say anything sensible about facial expressions, body movements, and so on.

On what grounds might she say that? Here’s an argument:

Your large-scale structural anatomy stays the same, regardless of how you think about it. For example, no matter how you conceptualize your thenar eminence, or even if you’ve never thought about your thenar eminence at all before reading this blog post, it doesn’t change anything whatsoever about your actual thenar eminence. In other words, you are by-and-large a passive observer of your large-scale structural anatomy.

Emotion concepts are disanalogous to that. If you have learned from culture that people respond to the death of a loved one by getting sleepy, that might become a self-fulfilling prophecy. You are not purely a passive observer of your facial expressions, physiological arousal, etc.—and that includes not only voluntary actions but even “involuntary” things like facial expressions, crying, etc.

Now, in my models, as in Section 2.3 above, there are “innate behaviors” implemented by cell groups in the hypothalamus and brainstem, and there are also predictive models in your cortex, amygdala, striatum etc. that involve concepts / categories. There’s a funny loopy thing, where the innate behaviors can provide “ground truth” that trains the various predictive models, but also, those same predictive models can provide a subset of the inputs to those “innate behaviors”. This allows, for example, self-fulfilling prophecies where you cry because your brain has learned through experience that this is a situation where you’re about to cry. See my discussion of so-called “defer-to-predictor mode” here. So anyway, in my models, I can acknowledge that concepts / predictive models are relevant to “innate behaviors”, but I still see innate behaviors as a pretty self-contained topic of discussion. But I’m getting off-topic.

Barrett’s discussion is instead quite different from mine, and based on Active Inference theory—and this discussion fills a large fraction of her book. I found it to be pretty confused and incoherent in general, but only to the extent that I find every discussion of Active Inference to be pretty confused and incoherent in general. I have nothing against her discussion in particular. See for example Section 7 of my post Why I’m not into the Free Energy Principle. I won’t elaborate on that here.

5. Does Ekman agree or disagree with the extreme “anti-Barrett” position?

I think he probably mostly disagrees with it. (But I think he could stand to be clearer in his writing on the topic!)

For example:

  • Ekman definitely does not see every instance of “anger” as exactly the same thing, regardless of whether it’s cold seething rage, white-hot anger, righteous indignation, etc.

  • Ekman is definitely aware that people exercise voluntary control over their faces in culturally-dependent ways. He talks about that extensively, and that’s how he came to be so focused on involuntary microexpressions.

  • Ekman is also definitely aware that most people are not perfect at extracting useful information from people’s facial expressions. Otherwise he wouldn’t be selling facial expression recognition training programs, one would assume!

Anyway, when I read Ekman, I get the impression that he’s trying to talk about “innate behaviors”, not “emotion concepts” (see Section 2.3 above). And insofar as that’s true, the stuff he says about them is generally correct. But he’s certainly not crystal-clear in communicating that that’s what he’s doing.

EDITED TO ADD: Here is my attempt to “steelman” Ekman—to say the things that he seems to want to say, in a way that is defensible, where maybe even Barrett would agree with it. Note that I do not claim that Ekman would endorse this passage, and I am definitely not trying to send a vibe of “Yay Ekman, he was right all along”, nor the opposite.

There is an innate behavior that we can call “Duchenne smile”. It is a human-universal. Sometimes people use their voluntary facial muscle control to suppress it, but often they don’t, and even if they do, it’s still there as a microexpression.

The Duchenne smile is in the same category as vomiting—a human-universal innate behavior.

The Duchenne smile happens under certain nonrandom circumstances, thanks to brain wiring specified by the genome. And we have some interoceptive access to whether the Duchenne smile is happening or not—not only can we feel our own facial muscles, and see how our own eyes are squinting etc., but it also seems to strongly correlate with various signals for which we have direct interoceptive access.

Therefore, if you pay enough attention, including maybe practicing with a mirror, then you can gradually cultivate an awareness of whether or not you are Duchenne smiling. And then, if you continue paying careful attention, you can gradually develop a highly-refined sense of the circumstances under which a Duchenne-smile does or doesn’t occur in yourself. And you should consider actually doing that! Because by doing so, you get to be in better touch with your own state-of-mind.

(Once you have cultivated a refined sense of “I am Duchenne-smiling right now”, we can call that sense an “emotion”, or a “feeling”. Or we can not. Call it whatever you want, I don’t care.)

Likewise, other people Duchenne-smile too. If you really try hard to pay attention to that, especially by noticing microexpressions, then you can potentially get to be in better touch with other people’s states-of-mind. This isn’t as helpful as it might initially sound, because if someone is momentarily Duchenne-smiling, then you don’t know why, and in particular, it might have nothing to do with their external circumstances, what they’re doing, etc. Maybe a pleasant memory from last week randomly popped into their head. But it’s not entirely unhelpful either, especially if you’re observing them over the course of a long conversation, and if you’ve had a lot of practice.

OK, all that was Duchenne-smiling—just one example. But there are lots of other things like that too. Various types of pouts, scowls, eyes getting wide or narrow, certain types of body language, etc. All these are associated with learnable patterns, and by attending to them, you can better know yourself and better know others.

And these same patterns, once learned, would be equally helpful in communicating across cultural gaps, because every human has the same repertoire of universal innate behaviors like Duchenne-smiling. If you see someone displaying a Duchenne smile microexpression in a remote isolated community, that’s an interesting bit of data about what’s going on in their head. You may still be deeply confused about what they’re thinking and why, but you have nevertheless learned a nonzero amount about their mental state at that moment. It’s not so different from if you see someone from a remote isolated community vomiting. You don’t know exactly why they’re vomiting, and you don’t know how vomiting fits into their worldview, but you do know what vomiting is and something about the circumstances under which it occurs.

6. Conclusion

I think the “anti-Barrett position” and the “anti-Ekman position” of Sections 2.1-2.2 above are both wrong, and I think Barrett and Ekman respectively are doing everyone a service by explaining why. At the same time, I hope that the discourse can eventually get to a point where everyone treats these two positions as the patently-absurd strawman beliefs that they deserve to be, and moves forward to more interesting issues.

I offer my position of Section 2.3 as a starting point for this moving-forward process. I claim that I have incorporated all the correct insights and criticisms from both sides, and that my position is consistent with everything we know about psychology and neuroscience. Happy to discuss that more in the comments section or by email. :)