I’m not sure where Scott is going with this series, but I seem to have a different reaction to the excerpts from Henrich than most (but not all) of the commenters before me: rather than finding him persuasive, I wouldn’t trust him as far as I could throw him.
For simplicity let’s concentrate on the seal hunting description. I don’t know enough about Inuit techniques to critique the details, but instead of aiming for a fair description, it’s clear that Henrich’s goal is to make the process sound as difficult to achieve as possible. But this is just sleight of hand: the goal of the stranded explorer isn’t to reproduce the exact technique of the Inuit, but to kill seals and eat them. The explorer isn’t going to use caribou antler probes or polar bear harpoon tips; they are going to use some modern wood or metal that they stripped from their icebound ship.
Then we hit “Now you have a seal, but you have to cook it.” What? The Inuit didn’t cook their seal meat using a soapstone lamp fueled with whale oil, they ate it raw! At this point, Henrich is not just being misleading, he’s making it up as he goes along. At this point I start to wonder if the parts about the antler probe and bone harpoon head are equally fictional. I might be wrong, but beyond this my instinct is to doubt everything that Henrich argues for, even if (especially if) it’s not an area where I have familiarity.
Going back to the previous post on “Epistemic Learned Helplessness”, I’m surprised that many people seem to have the instinct to continue to trust the parts of a story that they cannot confirm even after they discover that some parts are false. I’m at the opposite extreme. As soon as I can confirm a flaw, I have trouble trusting anything else the author has to say. I don’t care about the baby, this bathwater has to go! And if the “flaw” is that the author is being intentionally misleading, I’m unlikely to ever again trust them (or anyone else who recommends them).
Probably I accidentally misrepresented a lot in the parts that were my own summary. But this is from a direct quote, and so not my fault.
roystgnr adds:
Wikipedia seems to suggest that they ate freshly killed meat raw, but cooked some of the meat brought back to camp using a Kudlik, a soapstone lamp fueled with seal oil or whale blubber. Is that not correct? That would still flatly contradict “but you have to cook it”, but it’s close enough that the mistake doesn’t reach “making it up as he goes along” levels of falsehood. You’re correct that even the true bits seem to be used for argument in a misleading fashion, though.
This seems within the level of simplifying-to-make-a-point that I have sometimes been guilty of myself, so I’ll let it pass.
While we’re at it, one example I learned afterwards was that the ‘caribou randomization’ story is probably bogus (excerpts):
We will show that hunters do not randomize their behavior, that caribou populations do not fluctuate according to human predation, and that scapulimancy apparently is not selected because it is ecologically advantageous. We shall also show that there is no cross-cultural evidence of divinatory random devices producing randomized subsistence behavior, but rather that people manipulate divination with the explicit or implicit intervention of personal choice.
What is particularly interesting to me is that the apparent beautiful match of this traditional hunting practice with contemporary game theory may be ‘too good to be true’ because it was actually the opposite: I suspect that the story was made up to launder (secret) game-theoretic work from WWII into academic writing; the original author’s career & funder are exactly where that sort of submarine-warfare operations-research idea would come from… (There were many cases post-WWII of civilians carefully laundering war or classified work into publishable form, which means that any history-of-ideas has to be cautious about taking at face value anything published 1940–1960 which looks even a little bit like cryptography, chemistry, physics, statistics, computer science, game theory, or operations research.)
While he is probably directionally correct, I think the claim should be much weaker than people think. My broad explanation of how humans have rocketed away from other animals is now something like the human body being optimized for tool use, or the scaling hypothesis applied to evolution, and I now think cultural learning is far less powerful in humans than people thought. This is partially an update from LLMs, which have succeeded but so far have nowhere near the capabilities that humans have (yet), and there’s an argument to be made that cultural learning was the majority of AI capabilities before o1/o3.
How do LLMs and the scaling laws make you update in this way? They make me update in the opposite direction. For example, I also believe that the human body is optimized for tool use and scaling, precisely because of the gene-culture coevolution that Henrich describes. Without culture, this optimization would not have occurred. Our bodies are cultural artifacts.
Cultural learning is an integral part of the scaling laws: they show that indefinitely scaling the number of parameters in a model doesn’t quite work; the training data also has to scale. The implication is that this data is some kind of cultural artifact, whose quality determines the capabilities of the resulting model. LLMs work because of the accumulated culture that goes into them. This is no less true for “thinking” models like o1 and o3, because the way they think is very heavily influenced by the training data. Thinking models do so well because thinking becomes possible at all, not because thinking is something inherently beyond the training data. These models can think because of the culture they absorbed, which includes a lot of examples of thinking. Moreover, the degree to which Reinforcement Learning determines the capabilities of thinking models is small compared to Supervised Learning, because, firstly, less compute is spent on RL than on SL, and, secondly, RL is much less sample-efficient than SL.
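To make the "parameters alone don't suffice" point concrete, here is a minimal sketch. It assumes a loss of the additive power-law form often used in the scaling-law literature (an irreducible term plus a model-size term plus a data-size term). The function names and every constant below are illustrative placeholders chosen for this example, not fitted values, so only the qualitative shape should be read from it.

```python
# Illustrative only: an assumed additive power-law loss,
#   loss(N, D) = e + a / N**alpha + b / D**beta
# where N = parameter count and D = training tokens.
# All constants are placeholders, not fitted values.

def loss(n_params: float, n_tokens: float,
         e: float = 1.7, a: float = 400.0, b: float = 400.0,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    """Assumed loss: irreducible term + model-size term + data-size term."""
    return e + a / n_params ** alpha + b / n_tokens ** beta


def data_floor(n_tokens: float, e: float = 1.7,
               b: float = 400.0, beta: float = 0.28) -> float:
    """Loss that remains with unlimited parameters but a fixed amount of data."""
    return e + b / n_tokens ** beta


if __name__ == "__main__":
    tokens = 1e9
    # Growing the model with the data held fixed approaches the data floor.
    for n in (1e9, 1e12, 1e15):
        print(f"N={n:.0e}, D={tokens:.0e}: loss={loss(n, tokens):.3f}")
    print(f"floor at D={tokens:.0e}: {data_floor(tokens):.3f}")
    # Growing the data as well gets below that floor.
    print(f"N=1e12, D=1e12: loss={loss(1e12, 1e12):.3f}")
```

Under these assumptions, adding parameters at fixed data only approaches `data_floor`, while scaling the data too moves the floor itself, which is the sense in which the training corpus limits the model.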
I think the crux is that the genetic changes come first and the cultural changes come after. The reason is that the modern human body plan emerged around 200,000 BC (and the line could arguably be drawn even earlier), whereas the important optimizations that culture introduced came over 100,000 years at the latest.
That said, fair enough on calling me out on the scaling laws point, I was indeed not able to justify my update.
As for why I’m generally skeptical that cultural learning matters nearly as much as Henrich claims: I believe a lot of the evidence for this fundamentally doesn’t exist for human cultures, and a lot of the work in the 1940s-1960s on subjects relevant to cultural learning was classified and laundered to produce simulated results:
What do you make of feral children like Genie? While there are not many counterfactuals to cultural learning—probably mostly because depriving children of cultural learning is considered highly immoral—feral children do provide strong evidence that humans that are deprived of cultural learning do not come close to being functional adults. Additionally, it seems obvious that people who do not receive certain training, e.g., those who do not learn math or who do not learn carpentry, generally have low capability in that domain.
the genetic changes come first, then the cultural changes come after
You mean to say that the human body was virtually “finished evolving” 200,000 years ago, thereby laying the groundwork for cultural optimization which took over from that point? Henrich’s thesis of gene-culture coevolution contrasts with this view and I find it to be much more likely to be true. For example, the former thesis posits that humans lost a massive amount of muscle strength (relative to, say, chimpanzees) over many generations and only once that process had been virtually “completed”, started to compensate by throwing rocks or making spears when hunting other animals, requiring much less muscle strength than direct engagement. This raises the question: how did our ancestors survive in the time when muscle strength had already significantly decreased, but tool usage did not exist yet? Henrich’s thesis answers this by saying that such a time did not exist; throwing rocks came first, which provided the evolutionary incentive for our ancestors to expend less energy on growing muscles (since throwing rocks suffices for survival and requires less muscle strength). The subsequent invention of spears provided further incentive for muscles to grow even weaker.
Many more examples like the one above could be given. Perhaps the most important one is that as the amount of culture grows (also including things like rudimentary language and music), a larger brain has an advantage because it can learn more and more quickly (as also evidenced by the LLM scaling laws). Without culture, this evolutionary incentive for larger brains is much weaker. The incentive for larger brains leads to a number of other peculiarities specific to humans, such as premature birth, painful birth and fontanelles.
In a sense, evolution never stops, but yes, the capacity to make tools came much later than the physical optimizations that used the tools.
More generally, an even earlier force for bigger brains probably comes from both cooking food and social modeling in groups, but yes, language and culture do help at the margin.
What do you make of feral children like Genie? While there are not many counterfactuals to cultural learning—probably mostly because depriving children of cultural learning is considered highly immoral—feral children do provide strong evidence that humans that are deprived of cultural learning do not come close to being functional adults. Additionally, it seems obvious that people who do not receive certain training, e.g., those who do not learn math or who do not learn carpentry, generally have low capability in that domain.
I actually think this is reasonably strong evidence for some form of cultural learning, thanks.
This was the faked evidence here:
For simplicity let’s concentrate on the seal hunting description. I don’t know enough about Inuit techniques to critique the details, but instead of aiming for a fair description, it’s clear that Henrich’s goal is to make the process sound as difficult to achieve as possible. But this is just sleight of hand: the goal of the stranded explorer isn’t to reproduce the exact technique of the Inuit, but to kill seals and eat them. The explorer isn’t going to use caribou antler probes or polar bear harpoon tips; they are going to use some modern wood or metal that they stripped from their icebound ship.
Then we hit “Now you have a seal, but you have to cook it.” What? The Inuit didn’t cook their seal meat using a soapstone lamp fueled with whale oil, they ate it raw! At this point, Henrich is not just being misleading, he’s making it up as he goes along. At this point I start to wonder if the parts about the antler probe and bone harpoon head are equally fictional. I might be wrong, but beyond this my instinct is to doubt everything that Henrich argues for, even if (especially if) it’s not an area where I have familiarity.
While we’re at it, one example I learned afterwards was that the ‘caribou randomization’ story is probably bogus (excerpts):
What is particularly interesting to me is that the apparent beautiful match of this traditional hunting practice with contemporary game theory may be ‘too good to be true’ because it was actually the opposite: I suspect that the story was made up to launder (secret) game-theoretic work from WWII into academic writing; the original author’s career & funder are exactly where that sort of submarine-warfare operations-research idea would come from… (There were many cases post-WWII of civilians carefully laundering war or classified work into publishable form, which means that any history-of-ideas has to be cautious about taking at face value anything published 1940–1960 which looks even a little bit like cryptography, chemistry, physics, statistics, computer science, game theory, or operations research.)
Ah, I see now. Thank you! I remember reading this discussion before and agree with your viewpoint that he is still directionally correct.
As for why I’m generally skeptical that cultural learning matters nearly as much as Henrich claims: I believe a lot of the evidence for this fundamentally doesn’t exist for human cultures, and a lot of the work in the 1940s-1960s on subjects relevant to cultural learning was classified and laundered to produce simulated results:
https://www.lesswrong.com/posts/m8ZLeiAFrSAGLEvX8/the-present-perfect-tense-is-ruining-your-life#MFyWDjh4FomyLnXDk