Using Ngram to estimate depression prevalence over time

David Gross 9 Jul 2022 14:57 UTC

10 points

Historical language records reveal a surge of cognitive distortions in recent decades

My summary: People diagnosed with depression tend to exhibit characteristic patterns of language use that demonstrate the underlying cognitive distortions associated with depression.

For example… individuals label themselves in negative, absolutist terms (e.g., “I am a loser”). They may talk about future events in dichotomous, extreme terms (e.g., “My meeting will be a complete disaster”) or make unfounded assumptions about someone else’s state of mind (e.g., “Everybody will think that I am a failure”). Typologies of cognitive distortions generally differentiate between a number of partially overlapping types, such as “catastrophizing,” “dichotomous reasoning,” “disqualifying the positive,” “emotional reasoning,” “fortune telling,” “labeling and mislabeling,” “magnification and minimization,” “mental filtering,” “mindreading,” “overgeneralizing,” “personalizing,” and “should statements.”

The researchers looked for these language patterns in 14 million books, published over the past 125 years in English, Spanish, and German, that are available via Google Ngram, to see how their prevalence has changed over time.
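As a rough illustration of what "prevalence over time" could mean here, the sketch below computes, for each year, the share of all n-grams that belong to a target list. This is my own minimal sketch, not the authors' pipeline: the function name `yearly_prevalence`, the data layout, and all of the counts are made up for illustration, and the paper may normalize its data differently.

```python
from collections import defaultdict


def yearly_prevalence(ngram_counts, totals, targets):
    """Return {year: share of that year's n-grams that are target n-grams}.

    ngram_counts: {(ngram, year): count}  (hypothetical layout)
    totals:       {year: total number of n-grams published that year}
    targets:      iterable of n-grams to look for
    """
    targets = set(targets)
    hits = defaultdict(int)
    for (ngram, year), count in ngram_counts.items():
        if ngram in targets:
            hits[year] += count
    # Normalize by the yearly total so that years with more published
    # text do not automatically look like years with more target phrases.
    return {
        year: hits[year] / total
        for year, total in sorted(totals.items())
        if total > 0
    }


# Toy, invented counts (not real Ngram data).
ngram_counts = {
    ("i am a", 1980): 20,
    ("will never", 1980): 10,
    ("the house", 1980): 500,
    ("i am a", 2005): 60,
    ("will never", 2005): 30,
    ("the house", 2005): 400,
}
totals = {1980: 10_000, 2005: 10_000}

prevalence = yearly_prevalence(ngram_counts, totals, ["i am a", "will never"])
for year, share in prevalence.items():
    print(year, share)
```

Dividing by the yearly total is the important step: raw counts would rise simply because more books are published in later years.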

They found that in general the prevalence of such language patterns decreased or stayed stable over the course of the 20th century up until around 1978. There were some local and temporary spikes (e.g. German-language texts between the world wars and after World War II, English-language texts in 1899 for some reason). In 1978, prevalence began to rise slowly, and then in 2000 more rapidly, leveling out again around 2008 at a historically high level.

The authors conclude that published books show a recent, rapid, and strong rise in language patterns that suggest the cognitive distortions associated with depression.

Some Concerns

Could it be that language fashions change in ways that are independent of depression but that overwhelm the effect of depression on the data? For example, if hyperbole goes in and out of fashion for purely aesthetic reasons, so will “catastrophizing,” “overgeneralizing,” “magnification and minimization.”
Similarly, writing today seems more direct, less baroque. A writer today might simply say “Everyone thinks I should exercise if I want to look better,” while a writer 125 years ago might say “It is a truth universally acknowledged that my countenance suffers from want of regular forthright bodily exertion.” Cognitive distortions expressed in a more modern, to-the-point way might be easier to discover in the way the researchers searched. (The authors note that sentence length, as a possible proxy for sentence baroqueness, has been mostly stable since the 1920s.)
Might the panel of CBT experts who created the set of phrases to search for have been more aware of current ways of expressing cognitive distortions (those they might hear examples of in their day-to-day lives or work), and less aware of archaic ways of doing so (those that may only exist today in books)? The authors tried to adjust for this by comparing their results to a “null model” of random n-grams, where they chose the sample for this null model to have a recency bias (more n-grams chosen from recently-published books). But their graph of the null model strangely shows more prevalence of those n-grams early in the 20th century than later, so I’m not sure I believe it.
I notice a somewhat similar pattern of slow decline through the 20th century followed by a spike in the 21st century for an Ngram search for first-person markers (e.g. I, me, my). Could the whole phenomenon just be explained by a trend toward more first-person narration? The authors say that “the prevalence of the CDS n-grams in the language of individuals with depression is not affected by… the presence of personal pronouns” so they don’t think that’s a factor. But I’m not convinced that really addresses the problem.
Authors of books are a peculiar sample of the population at large, as are narrators of fiction. The way those samples have been taken over time could bias the results.
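
One rough way to probe the first-person-narration concern above is to check how closely the two yearly series track each other, and whether the rise survives when the distortion-phrase prevalence is expressed relative to first-person prevalence. The sketch below does this with a hand-written Pearson correlation. The two series are invented numbers, not Ngram data, and the names `pearson`, `cds`, and `first_person` are my own. A high correlation would not settle the question either way; it would only show that the concern cannot be dismissed from the trend shapes alone.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented yearly prevalences, one value per sampled year (assumption).
cds = [0.0030, 0.0028, 0.0027, 0.0031, 0.0040]
first_person = [0.020, 0.019, 0.018, 0.021, 0.026]

r = pearson(cds, first_person)

# Distortion-phrase prevalence per unit of first-person prevalence.
# If this ratio is roughly flat, the rise may mostly reflect more
# first-person writing rather than more distorted phrasing.
ratio = [c / f for c, f in zip(cds, first_person)]

print(round(r, 3))
print([round(x, 3) for x in ratio])
```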


Depression World Modeling

Zac Hatfield-Dodds 9 Jul 2022 18:16 UTC
7 points
I don’t believe that this methodology actually provides meaningful evidence for their claims. To quote the paper, which IMO is still talking down the problem:

We caution that changes in meaning or semantic shift of the CDS n-grams may potentially bias our results. …

the choice of CDS n-grams could lead to a “recency bias” in our results, explaining their rise in prevalence in recent decades. [their ‘control’ for this is IMO irrelevant]

We caution that although the Google Books data have been widely used to assess cultural and linguistic shifts, and they are one of the largest records of historical literature, it remains uncertain whether CDS prevalence truly reflects changes in societal language and societal wellbeing. Many books included in the Google Books sample were published at times or locations marked by reduced freedom of expression, widespread propaganda, social stigma, and cultural as well as socioeconomic inequities that may reduce access to the literature, potentially reducing its ability to reflect societal changes.

Note that the n-grams from (17) are in a 2020 paper on Twitter, which is a rather different corpus to published books! From that one:

we relied on individuals reporting their personal clinical depression diagnoses on social media [and] recommend caution when generalizing our findings to the level of all individuals who have depression. … Our lexicon of CDS was composed and approved by a panel of ten experts who may have been only partially successful in capturing all of the n-grams used to express distorted ways of thinking. On a related note, the use of CDS n-grams implies that we measure distorted thinking by proxy, namely through language, and our observations may therefore be affected by linguistic and cultural factors. Common idiosyncratic or idiomatic expressions may syntactically represent a distorted form of thinking, but no longer do so in practice.

We emphasize that not all use of CDS n-grams reflects depressive thinking, as these phrases are part of normal English usage, and it would therefore be wrong to try to diagnose depression merely on the basis of use of one or more such phrases.
Wajax 9 Jul 2022 17:14 UTC
7 points

In 1978, prevalence began to rise slowly, and then in 2000 more rapidly, leveling out again around 2008 at a historically-high level.

I think it’s interesting to consider how those trends might correlate with rising numbers of people identifying as non-religious.

Regardless of whether changes in religion caused an increase in depression, I think it’s certainly possible that it influenced how people might have felt writing about their life and experience.

Speaking as someone raised in a fundamentally religious setting, there’s often a certain kind of guilt associated with expressing negativity about oneself or one’s life. No matter how bad you feel about yourself, definitively calling yourself a “loser” would be an affront to the creator.

A lot of those typologies of cognitive distortions would be read as vain/worldly/unfaithful towards a divine plan. People might be experiencing all of that internally, but interpreting it as a spiritual failing to be expressed through spiritual language, if at all.
quanticle 9 Jul 2022 20:45 UTC
2 points

The researchers looked for these language patterns in 14 million books, published over the past 125 years in English, Spanish, and German, that are available via Google Ngram, to see how their prevalence has changed over time.

(emphasis mine)

Given that what is published is a tiny, highly selected fraction of what is said, written, etc., why should we feel confident in drawing any population-wide conclusions at all from a study of published work? Even if we limit the relevant population to authors, I would be hesitant to draw any conclusions, given that only a fraction of what authors write gets published, and what is published often goes through multiple rounds of editing before it hits the presses.

Maybe it’s just that cultural tastes have shifted so that more open discussions of poor mental health are acceptable, and, as a result, we see greater representation of that in published work.