(Book Review) The Genetic Lottery: Why DNA Matters for Social Equality

Consider this: Your child is struggling in school. What might be going on? Perhaps home life is chaotic and stressful. Perhaps the school itself lacks resources. Maybe other kids bully them because of their kinky hair or strange accent or their hand-me-down clothing. But Kathryn Paige Harden, in her book The Genetic Lottery: Why DNA Matters for Social Equality, argues that we’re still missing something important—your kid’s genes.

Her book, and much of her research, focuses on how genes affect educational attainment, and for good reason—education and well-being are connected, especially in America. For example, the adult life expectancy of Black Americans with a bachelor’s degree is now closer to that of white Americans with a degree than to that of Black Americans without one. This education gap in life expectancy is widening, even as the corresponding racial and gender gaps are narrowing. So if you’re concerned about social inequalities, you should be concerned with education.

According to Harden, genes predict educational attainment just about as well as family income does. Which is to say, a good bit! And because genes are just as unearned as other benefits of one’s birth, Harden wants to use genetic knowledge to improve social policy and create a more egalitarian, anti-eugenic future:

Like being born to a rich or poor family, being born with a certain set of genetic variants is the outcome of a lottery of birth. You didn’t get to pick your parents, and that applies just as much to what they bequeathed you genetically as what they bequeathed you environmentally. And, like social class, the outcome of the genetic lottery is a systemic force that matters for who gets more, and who gets less, of nearly everything we care about in society.

Her Argument

If you’ll remember, the book’s subtitle is Why DNA Matters for Social Equality. So, why does it? Harden’s central argument is quite simple:

Once you know some scientific basics—experimental design, causation, counterfactuals, control groups—it’s clear that genetics is a confounding variable in social policy research and intervention.
But we haven’t treated it as a confounder. Instead, we’ve just ignored it because it feels dangerous. (This is what she calls the genome-blind position.)
Because we’ve ignored it, our social policy is more costly and less effective than it could be.

Her argument is pretty convincing. We haven’t really tried to use genetic knowledge to improve social policy, so why not try? But Harden is bringing genes into the picture here, so she’s got to distinguish herself from the genetic conservative standpoint, typified by Charles Murray of The Bell Curve, lest she be the next person to get punched at Middlebury. To Harden, genetic conservatives believe three things:

First, that the causal chain between genetics and social inequality is short and primarily mediated via the development of intelligence. Second, that the causal chain between genetics and social inequality is best understood at a cellular and organismic level of analysis, with intelligence seen as an inherent property of a person’s brain, rather than as something that develops in a social context. And, third, that the alternative possible worlds where this chain is broken are dystopian, requiring either massive state intrusion into people’s home lives or widespread genetic engineering.

In other words, they’re determinists and pessimists. You can’t change this stuff through social policy, stupid! The causal chain from genetics to outcomes is too short and strong to break! But Harden is neither a determinist nor a pessimist, and she gives a couple of examples where social policy can intervene between genetic cause and outcome.

First, glasses: Nearsightedness is heritable, we use glasses to address it, and glasses are not a genetic solution. Et voila! A social intervention for a genetic problem. Second, verbal fluency: The time when a child starts talking is heritable. But when a child starts talking earlier, this encourages the parent to talk with them more and use a larger vocabulary. This improves the child’s ability to speak, leading to more interaction, more stimulation, and even greater verbal fluency. It’s a positive feedback loop that’s socially mediated! A potential policy intervention might just look like a social worker telling a parent, “Even if your kid doesn’t start talking by age two, make sure that you still talk with them often, because that will help them speak.”

Given this premise, let’s jump into the research...

A Tree with Rotten Roots

...or not. Harden has, uh, a few tripwires to avoid first. The early chapters of Lottery walk through a potted history of statistics and genetics, describing how many progenitors of those fields believed in deep and fixed human hierarchies, and how research in those fields has been used to justify and oppressively reinforce those hierarchies. Galton, Fisher, Pearson; exclusion, sterilization, murder. It’s horrible and fairly well known.

But ignoring how genes affect outcomes, Harden says, exposes us to two terrible consequences: First, we’ll fail to help the most vulnerable, since we cannot enact informed, effective social policy. Second, we’ll cede the intellectual landscape to purveyors of anti-egalitarianism and eugenics, since all we can do is scream la la la I can’t hear you! as inconvenient data keeps flowing in that, you know, genetics kinda matters.

Given the dangers, though, let’s make a few things clear: The genetics research described here can only tell you about genetic differences within an ancestry group, not between those groups. Ancestry groups are not racial groups—racial groups are social constructs that only have extremely tenuous links to biology.^[1] These studies cannot and should not be used to explain anything about racial gaps in educational attainment, test scores, and such.

Okay, now let’s jump into the research...

Goodbye Candidate Genes, Hello GWAS

In the days of yore, when the human genome was first mapped, we believed we would soon find strong links between particular genes and particular outcomes. Think “the gay gene” or “the depression gene.” This model—the candidate gene—is wrong for most things, and the research design it entails has produced little (but not zero) scientific knowledge. At times, it even created entire cottage industries of researchers waving around red herrings like broadswords—such as in the case of 5-HTTLPR, the serotonin transporter gene. To explain this, Harden pulls in a familiar figure:

Liberated from the polite conventions of scientific journals, the psychiatrist and blogger Scott Siskind summarized the paper’s conclusion [about the 5-HTTLPR candidate gene] more colorfully. He denounced investigators who reported “results” on 5-HTTLPR as fabulists telling stories about unicorns, except worse: “This isn’t just an explorer coming back from the Orient and claiming there are unicorns there. It’s the explorer describing the life cycle of unicorns, what unicorns eat, all the different subspecies of unicorn, which cuts of unicorn meat are tastiest, and a blow-by-blow account of a wrestling match between unicorns and Bigfoot.”

So when does the candidate gene model fail? When tons and tons of genetic variants each provide a very small effect. To study genetic effects of this type, you need a new kind of research design—a genome-wide association study, or GWAS. Here’s how it works in a nutshell:

1. Collect genetic information from tons of people, usually a million or more.
2. Ignore most of it. The “genome-wide” in GWAS is a little bit of a misnomer. We share over 99% of our DNA, so we don’t need to analyze that, and the GWAS only looks at a fraction of the last percent, since parts of a genome are highly correlated through genetic linkages.^[2]
3. Calculate correlation coefficients between genetic variants and the trait or outcome of interest, such as height or weight or educational attainment.
4. Create a polygenic index for a person of interest by adding up the correlation coefficients for the relevant genetic variants they have.

Reprinted from “The Principle of a Genome-wide Association Study (GWAS)”, by BioRender, June 2020, Copyright 2021 by BioRender. (Retrieved from https://app.biorender.com/biorender-templates/t-5f17525a178b5200b0fb7b06-the-principle-of-a-genome-wide-association-study-gwas)

To be clear, a GWAS is just a bunch of correlations between genes and outcomes, and a polygenic index is just the sum of the relevant correlation coefficients for a given person based on their specific genes. They won’t tell you anything about whether a gene caused an outcome. So, to understand causation, you must use the GWAS in a study that controls for upbringing and environment, such as by comparing sibling against sibling, or adoptee against non-adoptee.
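To make the "just a sum" point concrete, here is a minimal sketch of how a polygenic index could be tallied. Everything in it is hypothetical: the variant names, the weights, and the choice to multiply each weight by the number of copies (0, 1, or 2) a person carries are my assumptions for illustration, not numbers or methods taken from Harden's book.

```python
# Hypothetical GWAS output: variant id -> weight for the trait of interest.
# These identifiers and numbers are invented for illustration only.
gwas_weights = {"variant_a": 0.02, "variant_b": -0.01, "variant_c": 0.005}


def polygenic_index(genotype, weights):
    """Add up each variant's weight times the copies of it a person carries.

    `genotype` maps variant id -> copy count (0, 1, or 2); variants that are
    missing from the genotype are treated as 0 copies.
    """
    return sum(weights[v] * genotype.get(v, 0) for v in weights)


# One made-up person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_index(person, gwas_weights)
print(f"polygenic index: {score:.3f}")
```

Note that nothing in this sum knows why a variant got its weight. It is bookkeeping on top of correlations, which is exactly why you need the sibling or adoptee designs to say anything about causation.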

Likewise, even if you establish causation, a GWAS won’t tell you the mechanism by which the gene caused the outcome. A hundred years ago, genes for female sex would be associated with less educational attainment, but the gene only “caused” the outcome in a vague way—it was necessary but not sufficient for the outcome. Without society’s discrimination, the gene wouldn’t have done anything in this example.^[3]

Genes and Environment

Up top, I mentioned that genes—specifically, results from GWAS—predict educational attainment just about as well as family income does. How well is that? About fifteen percent of the variance.

“Fifteen percent? That’s not much,” you might say. “And family income predicts fifteen percent, too. If we know even more about someone’s environment, can’t we make pretty good predictions?” Well, Harden has an answer...

The Fragile Families and Child Wellbeing Study is an ongoing study of over 4,000 families who were recruited for a study of child development when their children were born. The children have since been measured on a raft of variables when they were 1, 3, 5, 9, and 15 years old—their parents, teachers, and eventually the children themselves were surveyed about, e.g., “child health and development, father-mother relationships, fatherhood, marriage attitudes, relationship with extended kin, environmental factors and government programs, health and health behavior, demographic characteristics, education and employment, and income; parental supervision and relationship, parental discipline, sibling relationships, routines, school, early delinquency, task completion and behavior, and health and safety.”

In other words, everything about a child’s environment and development that researchers could possibly think to measure. [Emphasis added.] Right before the study investigators released the data from the measurements taken at age 15, they devised a challenge: Teams of scientists were tasked with trying to predict the children’s outcomes at age 15, using as many variables and as fancy statistical methods as they wanted. Ultimately, over 160 teams of scientists participated in the challenge, and each team was given access to over 12,000 variables about a child and their family.

Take a moment here to consider. 12,000 variables. 160 teams of professionals. Six outcomes to predict. This happened in 2017. How much variance do you think the best model accounted for?

Twenty percent, just a bit more than GWA studies, for all known environmental factors. Harden is not saying we shouldn’t study the environment—in fact, we must study the environment, because genes and environment interact—but simply that researchers must “lower their expectations regarding any variable, environmental or genetic.” Everyone needs a good dose of epistemic humility here—don’t get cocky.

Even if we could know a lot more just by looking at the environment, though, many parts of human choice and environment are simply not open to experimentation. For example, having sex for the first time at an earlier age correlates with some mental health issues, but we can’t do a randomized controlled trial to determine causality—as Harden puts it, “Hi, we’ve drawn your name out of a hat so now you have to wait until you are 25 to lose your virginity.” Yeah, that’s a no-go. As one outcome of this, the Education Code of Texas requires students to learn that having sex as an unmarried teenager causes emotional trauma. But a recent GWA study found genes that correlate both to earlier sex and to ADHD and smoking—hey look, a genetic confounder!—which might pave the way toward more accurate, sane educational policy.
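If the idea of a genetic confounder feels abstract, a toy simulation may help. This is a sketch on purely synthetic data, not a reanalysis of any real study: I assume a single made-up "genes" score that nudges both a behavior and an outcome, with no direct link between the two. The variable names, the sample size, and the equal-strength effects are all my own assumptions.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Strip out the part of ys that a straight-line fit on xs explains."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20_000

# Synthetic world: genes influence both variables; the behavior does NOT
# cause the outcome.
genes = [rng.gauss(0, 1) for _ in range(n)]
behavior = [g + rng.gauss(0, 1) for g in genes]
outcome = [g + rng.gauss(0, 1) for g in genes]

raw = pearson(behavior, outcome)
adjusted = pearson(residuals(behavior, genes), residuals(outcome, genes))
print(f"raw correlation:      {raw:.2f}")
print(f"after genes removed:  {adjusted:.2f}")
```

In this made-up world the raw correlation is sizable, and it all but vanishes once the shared genetic influence is regressed out. Real data will never be this clean, but this is the logic behind treating a polygenic index as a control variable.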

Twin Studies and Missing Heritability

If you’re familiar at all with genetics research, at this point you might ask, “But what about twin studies? Are those obsolete now?” Harden doesn’t discount twin studies—indeed, her group uses them too. But GWA studies have a critical advantage: they let you look at how outcomes correlate with individual genes, while twin studies only give you a single heritability value.

Twin studies may be too broad to help us plan specific interventions, but they do reveal a big problem in behavioral genetics: they suggest that most traits are much more heritable than GWA studies suggest. For example, twin studies estimate educational attainment is 40% heritable, while GWA studies estimate it at 13%. This “missing heritability” problem demands attention. After all, twin studies and GWA studies can’t both be right on the money. So which is wrong? Likely both.

GWA studies might underestimate heritability in a couple ways: First, they don’t look at the whole genome yet—remember that stuff about genetic linkages and de novo mutations?—so we could be missing some variants with large effects. Second, even though they already assess millions of people, these studies may need even more people to boost their statistical power, letting them find genetic variants with extremely small effects. Since the outcomes in question are polygenic, the effects from thousands (or tens of thousands) of missing genes could all add up to a large overall effect!

Likewise, twin studies might overestimate heritability. One central issue with twin studies is that they assume that parents will treat identical twins as unique individuals in the same way as they would treat fraternal twins. In Harden’s words: “If you’ve ever seen twins dressed in outfits that perfectly match, down to their socks and hair bows, that assumption might seem like a bit of a stretch.” Twin studies also face difficulties from non-random recruitment, leading to selection bias.

All in all, twin studies tend to find heritabilities at the high end, GWA studies tend to find heritabilities at the low end, and other studies such as sibling regression and disequilibrium regression tend to find heritabilities in the middle. We don’t know exactly how much genetics matters, but it does at least somewhat. As the First Law of Behavioral Genetics says, “Everything is heritable.”

A Constrained Argument

If Harden’s central argument is “we should use genetics to control confounders in social policy research and intervention,” is that her whole project? Yeah, basically. You’ll be disappointed, as I was, if you’re hoping to hear her talk about embryo selection or genetic editing.

Now, I can sympathize—addressing these other uses with care would be difficult. It would double the book’s length and make this minefield of a topic even more fraught. But when it comes to embryo selection, Harden doesn’t just set it aside as worthy of another book, but rather downplays it as A Thing That Her Peers Don’t Care About Yet:

Often, the phrase that animates [my colleagues] the most is not a phrase with clickbait allure like “embryo selection” or “personalized education.” Rather, the phrase scientists who do work in this area keep returning to is a dry-sounding statistical concept—”control variable.”

Look, for example, at the extensive FAQ written by the Social Science Genetics Association Consortium to accompany the publication of their 2018 GWAS of educational attainment. It was extremely pessimistic about using an education-associated polygenic index for “any practical response,” because the index “is not sufficient to assess risk for any specific individual.” The only application they did endorse? “The results of our study may be useful to social scientists, e.g., by allowing them to construct polygenic scores that can be used as control variables.”

But this is a bit of a dodge, right? 23andMe is using polygenic risk scores for type-2 diabetes. Genomic Prediction is using polygenic indices for embryo selection. The first polygenically screened baby is here—Aurea Smigrodzki, born in 2020. We need to get a handle on this stuff. Harden herself even mentions an interesting, alarming case of polygenic selection:

As technology for measuring the genome improves by leaps and bounds, a new way to stack the genetic deck has become possible—pre-implantation genetic diagnosis, or PGD. PGD allows couples who have used IVF to create several embryos to screen those embryos genetically, in order to select which ones to implant and which ones to discard. PGD is most commonly discussed as a potential means to create so-called “designer babies,” i.e., embryos that have been selected for characteristics like height or eye color, which might be considered socially desirable but are not medically necessary. But it also raises the possibility of “negative” selection—selection for characteristics that most prospective parents would find undesirable, like deafness. A landmark survey of fertility clinics in the US found that a small number of clinics (3%) admitted to using PGD to help parents select embryos for a disease or disability. Similarly, a survey of Deaf parents found that a small minority would consider terminating a pregnancy if a genetic test found that the fetus would be hearing.

There’s also a Fat community; might they follow the Deaf community and select embryos for fatness potential? Or might the Blind community select for blindness? If these communities value the traits that help define them, is selecting for these traits eugenic? Dysgenic? Both at once? Neither? Do these categories make sense anymore? This is not a call to moral panic, but deep philosophical questions remain here.

In Harden’s defense, she does say such questions are outside the scope of the book. But even so, it feels glib for her to say that we shouldn’t worry too much about these uses of polygenic indices simply because her colleagues aren’t interested in them. They already inform life and death.

Editing Ignored

Compared to polygenic selection, Harden is even more silent on questions of genetic editing. The index has no entries for “genetic editing” or “CRISPR” or similar things, and as far as I can tell she only discusses genetic editing in any substance once, on page 160. Here is the full relevant text:

In fact, despite the fact that PKU [a disorder caused by a single-gene mutation] has a simple and well-understood genetic etiology, environmental solutions currently remain the only solutions. Gene therapy for PKU is not (yet) a reality. And we can contrast the simple etiology of PKU with the genetic architecture of highly polygenic outcomes, like intelligence test scores or educational attainment, which involve thousands upon thousands of genetic variants with tiny effects and unknown mechanisms. To make matters even more complicated, many of these variants are also involved in phenotypes that are valued differently by society: many of the same genetic variants associated with higher educational attainment, for instance, are also associated with higher risk for schizophrenia. The suggestion from some conservative academics that we might edit children’s genomes to increase their IQs is not just scientifically unfeasible; it is scientifically absurd. [Emphasis added.]

Is it scientifically unfeasible right now? Yes, it’s still costly and risky. However, DNA sequencing has fallen in cost by six orders of magnitude in the past twenty years, much faster than Moore’s law. (Forgive the NIH for the awful 3D plot.) Exponential changes are—as COVID demonstrated—surprising, and we may see a comparable shift in DNA editing. We’re not at the transhuman moment yet, but it might catch us off guard if we’re not careful.
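For a sense of scale on "much faster than Moore's law," here is the back-of-the-envelope arithmetic. The six orders of magnitude and the twenty years come from the paragraph above; the two-year doubling period is my assumption about which version of Moore's law to compare against.

```python
import math

years = 20
sequencing_drop = 10 ** 6  # six orders of magnitude, as stated above

# Assumption: Moore's law read as "a doubling every two years".
moore_doubling_years = 2
moore_gain = 2 ** (years // moore_doubling_years)  # ten doublings
moore_orders = math.log10(moore_gain)

# How often would cost need to halve to fall a millionfold in twenty years?
implied_halving_years = years / math.log2(sequencing_drop)

print(f"Moore's law over {years} years: {moore_gain}x, "
      f"about {moore_orders:.1f} orders of magnitude")
print(f"Sequencing cost would need to halve roughly every "
      f"{implied_halving_years:.1f} years")
```

So on these assumptions, a Moore's-law pace gets you about three orders of magnitude in twenty years, while a millionfold drop means cost halving roughly once a year.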

But “scientifically absurd?” I’m honestly confused about what Harden is saying here. How is it absurd to engage with trade-offs? Let’s do it right now. Would you take +5 IQ for a +5% chance of developing schizophrenia? I would. What about a +50% chance? Maybe. +500%? I’m probably being unfair to Harden here, but really, it’s strange that this sentence is the only real discussion of genetic editing in a book about behavioral genetics, and if you’re going to use that one sentence to say that it’s absurd, you need to be clearer about why. Otherwise it’s almost a non sequitur.

Hmm, Defining Anti-Eugenics Is Hard

In the book’s final chapters, Harden presents an anti-eugenic position to counter the orthodox genome-blind position and the opposing eugenic position, along with a set of anti-eugenic policies. In many ways, her position is coherent and reasonable—for example:

Don’t waste time, money, talent, and tools by ignoring genetics. Instead, account for genetic information in social policy research and intervention in order to improve people’s lives.
Don’t use genetic info to categorically exclude people from health care, education, housing, lending, insurance, etc. For example, landlords should not deny someone housing just because they have a low polygenic index in some factor. Instead, use genetic info to improve planning, accommodation, and services for the least advantaged.
Don’t talk about “merit” as if it has nothing to do with unearned advantages, including genes. Instead, account for genetic luck to create fairer, apples-to-apples comparisons, such as when comparing how well schools are improving their teaching, given the students they have.

Much of the time, Harden describes her anti-eugenic position in standard left-liberal, Rawlsian terms: we need to balance utilitarianism and egalitarianism, ground ourselves in human rights, and focus on the people who are most vulnerable in society. This is all pretty agreeable. However, Harden can’t decide whether her anti-eugenic position is principally about doing the most for the least advantaged or about reducing inequalities. After all, even if you do the most that you can for the least advantaged, and you still have resources left over to contribute to people who are more advantaged, the gap still might widen between the least and most advantaged!

In her words, the anti-eugenic position “does not discourage genetic knowledge but deliberately aims to use genetic science in ways that reduce inequalities in the distribution of freedoms, resources, and welfare.” She also asks us to “[u]se genetic data to accelerate the search for effective interventions that improve people’s lives and reduce inequality of outcome.” So that’s a focus on equity! But elsewhere, she describes broadly shared progress as something that can produce or even depend on inequality:

We can also look to recent history to see enormous gains in life span, literacy, wealth, and well-being that ultimately worked to everyone’s advantage… The innovations in science, technology, and government that improved people’s lives were inequality-producing: some people’s lives were made better, quicker, than others and these innovations, in some cases, were inequality-dependent, in that they were made possible by a system that differentially rewarded different types of skills. But it’s to everyone’s advantage to live in a society where we don’t lose one-third of our children [as she described in a previous example]. … rewarding certain skills might be instrumentally useful for society as a whole, even as we recognized that people didn’t deserve the fact that they inherited genetic variants that were among the causes of those skills. [Emphasis added.]

Compare this to how Harden defines eugenics in another place:

What is eugenic is attaching notions of inherent inferiority and superiority, of a hierarchical ranking or natural order of humans, to human individual differences, and to the inheritance of genetic variants that shape these individual differences. What is eugenic is developing and implementing policies that create or entrench inequalities between people in their resources, freedoms, and welfare on the basis of a morally arbitrary distribution of genetic variants. [Emphasis added.]

Here, Harden’s first eugenics is a moral eugenics, a moral hierarchy of inherent value based on genetics. Harden’s second eugenics is a distributional eugenics, a set of policies that create or entrench inequalities based on genetics. But we saw Harden argue just above that such policies might be justified. Perhaps what makes unequal distribution “eugenic” to Harden is when the arrow of justification points from moral eugenics to distribution, using instrumental questions as mere cover? Perhaps Harden believes that anti-eugenic policy can still have unequal resource distribution as long as it’s justified in Rawlsian terms and balanced with other values such as traditional equity? It’s all very fuzzy! (If you want to read even more discussion about Harden’s fuzzy definition of eugenics and anti-eugenics, check the Addendum below.)

Even with all the muddle of what “eugenics” means and what eugenic policies look like, Harden does make her stance about the moral worth of people crystal clear: even if someone’s skills are socially valued, earning them respect and money from others, that doesn’t make them any more morally valuable: “we risk conflating these [socially valued] skills and behaviors with human character and worth. Connecting people’s biology to their virtue, righteousness, and moral deservingness is a eugenic idea...” Someone’s skills don’t give them any more dignity.

This might seem obvious, but this is yet another place where Harden pits herself against conservatives like Charles Murray, who “describe economic productivity (‘putting more into the world than [one] take[s] out’) as ‘basic to human dignity.’” Perhaps this conflict points to a deeper question: What does “dignity” mean, and what is its relationship with society? Is dignity intrinsic, unchanging, and inalienable? Can you earn it and lose it? Is it personal, something you simply feel yourself to have or not? Or is it social, something that you make real by expression to your larger society? And if so, is it something that society can strip from you? I’ll leave you with these philosophical questions to think about, since they are so fundamental to the concerns and conflicts in this book.

Conclusion

This review has been heavy on critique, but that’s only because I think Harden’s book is overdue, brave, and insightful. I agree with Harden’s central argument, but I wish she’d tackled more facets of behavioral genetics, and I want more clarity on key terms like “eugenics.” Ultimately, The Genetic Lottery is a critically important book, and there’s so much interesting stuff in the book that I didn’t get into here. The free will coefficient! Executive function perhaps being 100% heritable! The heritability of “non-cognitive” abilities! There’s really a lot to chew on, and it’s a treat. Go read it.

Addendum: More Rants About Definitions

So, you want to read more discussion about Harden’s hazy definitions? Got it!

Harden’s explanation of what “eugenics” and “anti-eugenics” mean becomes even fuzzier when you add in her discussion about refocusing policy discussions away from comparing outcomes between people and toward maximizing individuals’ potential:

What if your genotype were exactly the same, but the social context changed? In other words, we are now comparing each person to themself across alternative possible worlds, rather than comparing people to each other within a world. Considered this way, the salient question is not which world minimizes the inequalities in outcome among children, but which world maximizes the outcome for an individual person.

But whose outcomes do we prioritize? Consider, for instance, a school that is adopting a new math curriculum and has a choice between two proposals. Which would be more preferable—(a) or (b)?

(a) The new curriculum is particularly helpful for children who are most genetically “at risk,” thus reducing the gap in educational outcomes between children who did and who did not happen to inherit a particular combination of genetic variants.

(b) The new curriculum is particularly helpful for children who are most likely to succeed anyway, thus inculcating even higher levels of mathematics skill among a few students.

Option (b) sounds suspiciously like a “eugenic” policy as Harden defined it earlier, right? It certainly entrenches inequalities based on genetically inflected predictions, so you might assume Harden would discount this strategy. But she doesn’t:

Reasonable people could make a variety of empirical arguments for (a) versus (b). For instance, one might bring various cost-benefit analyses to bear: How many students are helped by (a) versus (b)? How much will a new curriculum cost per student? What are the downstream impacts (in terms of economic productivity, technological innovation, social cohesion, political participation, etc.) of having more people in a society who have a certain baseline level of mathematical skills versus having more people in a society who have a very high level of mathematical skills?

These factors are certainly important to address, but they still don’t help us much to understand what Harden believes is an acceptable policy strategy and what isn’t. These kinds of decisions are super complicated, so I don’t expect a fleshed-out, coherent set of the exact policy choices that would work best—I just want to know what kinds of policies she thinks are eugenic or not, and I’m not convinced she’s told me.

That said, Harden does leave us with an important point: that these policy choices also reflect value judgments, that we must be honest and transparent about those judgments, and that we can’t even be transparent about them yet because we’re not accounting for genetics:

In addition to these empirical questions, however, this choice also involves questions about people’s values, including whether one values equality of educational outcomes as an end, a good thing to be pursued for its own sake, or simply as a means to some other goal, such as equality of economic outcomes. Currently, however, policymakers and educators do not have to be transparent about those values, nor do they have to be confronted with evidence regarding whether the realized effects of policies or interventions are living up to those values. In educational and policy research, genetic differences between people are largely invisible, because researchers do not even try to measure anything about people’s genetics.

[1] This is not saying there are absolutely no correlations between socially defined race and genetic ancestry groups. Black Americans, for example, are more likely to have sickle-cell anemia than white Americans, but ultimately the concept of race in genetics does more harm than good. Instead, we need to take a more nuanced view about how various genetic lineages get mixed and matched among all the different socially defined racial groups. There’s a whole chapter on pulling apart genetic ancestry and race, but it’s too involved to cover in detail here.
[2] Well, technically the 99% of DNA we share can have some variation because of de novo mutations, but the GWAS assumes that there are no mutations. If you really need the whole genome, you can get it through “fine-mapping,” but this is less common.
[3] This means that a heritability value will change based on context! In a society that is more discriminatory against women, educational attainment will appear more heritable since it is more correlated to genes for female sex. However, heritability can increase in more ideal conditions, too. If everyone has the resources they need, and all the barriers are gone—that is, the environment plays very little role because everyone has what they need—then you would expect genetics to explain more of the effect.
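The last point, that heritability can rise when environments get more equal, falls out of simple arithmetic. Here is a minimal sketch, assuming the simplest textbook setup where the total variance in an outcome is just genetic variance plus environmental variance, with no interaction or correlation between the two. That assumption is mine, and it does not cover the discrimination example, which works through genes and environment being correlated. The variance numbers are invented.

```python
def heritability(genetic_variance, environmental_variance):
    """Share of total outcome variance attributable to genetic differences.

    Assumes total variance = genetic + environmental, with no interaction
    or correlation between the two (a deliberate simplification).
    """
    return genetic_variance / (genetic_variance + environmental_variance)


# Same genetic variance in both worlds; only the environment changes.
unequal_world = heritability(genetic_variance=1.0, environmental_variance=3.0)
equal_world = heritability(genetic_variance=1.0, environmental_variance=0.25)

print(f"very unequal environments: {unequal_world:.2f}")
print(f"nearly equal environments: {equal_world:.2f}")
```

Nothing about anyone's genes changed between the two worlds. Shrinking the environmental differences simply leaves genetics as a bigger share of whatever variation remains.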