There’s also a great bit towards the end that helps to explain two confusing stylized facts: humans don’t seem to have much speech-specific hardware that other primates lack, but we’re better at language, and the theory of language evolving to support group coordination requires a lot of activation energy. But if language actually started out one-on-one, between mothers and infants, that neatly solves both problems.
The bit towards the end by Yuye (emphasis mine):
The hardest thing to explain about humans, given that their brains underwent no structural innovation, is language.
(Our plausible range for language is 100-500K years ago. Modern humans exhibit about the same language proficiencies and diverged ~100K years ago, which is also when symbology like cave art shows up. Before 500K the larynx and vocal cords weren’t adapted to vocal language.)
Apes can be taught sign language (since they’re physically not able to speak as we do), and there are multiple anecdotes of apes recombining signs to say new things. But they never surpass a young human child. How are we doing that? What’s going on in the brain?
Okay, sure, we’ve heard of Broca’s area and Wernicke’s area. They’re in the middle of the primate mentalizing regions. But chimps have those same areas, wired in the same ways. Plus, children with their entire left hemisphere (where those regions usually live) removed can still learn language fine.
If not a specific region, then what? The human ability to do this probably comes not from a cognitive advancement (although it can’t hurt that our brains are three times bigger than chimps’) but rather tweaks to developmental behavior and instincts.
Here are two things about human children that are not true of chimp children:
At 4 months, they engage in proto-conversation, taking turns with their parents in back-and-forth vocalizations.
At 9 months, they start doing “joint attention to objects”: pointing at things and wanting the parent to look at the object, or looking at what their mom is pointing at and interacting with it.
(You can see that if language arose as a mother-child activity that improved the child’s tool use, there’s no need to lean on group selection to explain its evolutionary advantage.)
Chimps don’t do either. They do gaze following, yes, but they don’t crave joint attention like human children. And what does a human parent do when they achieve joint attention? They assign labels to the object.
To get a chimp to speak language, it would help to beef up their brain, but this wouldn’t be enough – you’d have to change their instincts to engage in childhood play that is ‘designed’ for language acquisition. The author’s conclusion:
There is no language organ in the human brain, just as there is no flight organ in the bird brain. Asking where language lives in the brain may be as silly as asking where playing baseball or playing guitar lives in the brain. Such complex skills are not localized to a specific area; they emerge from a complex interplay of many areas. What makes these skills possible is not a single region that executes them but a curriculum that forces a complex network of regions to work together to learn them.
So this is why your brain and a chimp brain are practically identical and yet only humans have language. What is unique in the human brain is not in the neocortex; what is unique is hidden and subtle, tucked deep in older structures like the amygdala and brain stem. It is an adjustment to hardwired instincts that makes us take turns, makes children and parents stare back and forth, and that makes us ask questions.
This is also why apes can learn the basics of language. The ape neocortex is eminently capable of it. Apes struggle to become sophisticated at it merely because they don’t have the required instincts to learn it. It is hard to get chimps to engage in joint attention; it is hard to get them to take turns; and they have no instinct to share their thoughts or ask questions. And without these instincts, language is largely out of reach, just as a bird without the instinct to jump would never learn to fly.
As weak indirect evidence that the major difference is about language acquisition instinct, not language capability: Homo floresiensis underwent a decrease in brain and body size in their island environment (until their brains were comparable in size to chimpanzees’), but they kept manufacturing stone tools that may have required language to pass on.
I feel like this quickly glosses over the hypothesis that gestural language evolved first, or that gestural and vocal language evolved simultaneously, with significantly more sophisticated gestural behavior evolving earlier. I believe gestural language is much older than 500 ka (up to, let’s say, 2 Ma), which is consistent with the fossil evidence on vocalization adaptations.
It’s undeniable that some of the cognitive changes that occurred during human evolution affected motivation; in fact, I think proto-curiosity and proto-patience would have been favored by selection quite early. On the other hand, in my view, sustainable, scalable joint attention and behaviorally modern imitation learning (e.g. overimitation) are more complex and would have required more than just motivational changes. In particular, I don’t believe that most of the linguistic capability gap between chimps and humans can be explained as ‘motivational hobbling.’
F5 in Old World monkeys is very likely homologous to Broca’s area in humans, and although the gross neuroanatomy of humans and nonhuman primates is highly conserved, there are notable differences between the fine neuroanatomy of F5 in macaques and Broca’s area. Chimp F5 has intermediate features, but the evidence here is limited since we don’t do single-cell recordings in great apes anymore.
My own explanation for why there does not appear to be a derived gross language organ in humans is that F5 and Broca’s area both generate and interpret hierarchical act strings as such. Such a scheme would have several continuous parameters responsive to selection, including hierarchy depth, hierarchy breadth, goal maintenance duration and goal switching speed. I think at various scales this system is general enough to generate and interpret (i.e. socially learn) act strings for flintknapping, gestural and vocal language, controlled fire use, etc. I think this explains why chimps can also learn to knap but produce worse tools than those of Homo habilis, and I think it also explains many of the specific linguistic limitations observed in apes using sign and lexigrams.
Interesting take on language evolution in humans by Max Bennett from his book A Brief History of Intelligence, via Sarabet Chang Yuye’s review via Byrne Hobart’s newsletter. Hobart caught my eye when he wrote (emphasis mine)