Blood Is Thicker Than Water 🐬

Followup to: Where to Draw the Boundaries?

Without denying the obvious similarities that motivated the initial categorization {salmon, guppies, sharks, dolphins, trout, ...}, there is more structure in the world: to maximize the probability your world-model assigns to your observations of dolphins, you need to take into consideration the many aspects of reality in which the grouping {monkeys, squirrels, dolphins, horses ...} makes more sense.

The old category might have been “good enough” for the purposes of the sailors of yore, but as humanity has learned more, as our model of Thingspace has expanded with more dimensions and more details, we can see the ways in which the original map failed to carve reality at the joints …

So the one comes to you—a-gain—and says:

Hold on. In what sense did the original map fail to carve reality at the joints? You don’t deny the obvious similarities between dolphins and fish—between dolphins and other fish. That’s a cluster in configuration space! The observation that dolphins are evolutionarily related to mammals may be an interesting fact that specialized professional evolutionary biologists care about for some inscrutable specialist reason. But I’m not a professional biologist. Choosing to define categories around evolutionary relatedness rather than macroscopic human-relevant features seems like an arbitrary æsthetic whim. Why should I care about phylogenetics, at all?

This one is going to take a few paragraphs.

Focusing on evolutionary relatedness is not an arbitrary æsthetic whim because evolution actually happened. Evolution isn’t just a story that our Society’s specialists happen to have chosen because they liked it; they chose it because it predicts what we see in the world. You can’t choose a substantively different theory and make the same predictions about the real world. (At most, you’d end up with an isomorphic theory with additional epiphenomenal elements, asserting that an allele rose in frequency “because” the angels willed it, without an account of why the angels’ will happens to line up with what would have transpired if there were no angels.) Similarly, category definitions represent hidden probabilistic inferences; you can’t “redraw” the “boundaries” of the categories your mind actually uses and still make the same predictions about the real world. Accordingly, it shouldn’t be surprising that our knowledge of evolution turns out to have implications for how we should categorize organisms: not as an æsthetic choice, but for structural reasons that can be understood mechanistically.

One element of the evolutionary worldview is a “continuity” postulate: all else being equal, creatures that are more closely related are more similar in general. Creationists sometimes try to discredit evolution by ridiculing the absurdity of the idea that a monkey could give birth to a person. But actually, evolutionary biologists agree on the absurdity of that specific scenario. Monkeys don’t suddenly give birth to humans in a single generation; if they did, that would utterly falsify our understanding of evolution! Rather, monkeys and humans had a common ancestor forty million years ago, with the separate lines of descent leading to present-day monkeys and present-day humans each accumulating their own differences one mutation at a time.

The fact that evolution persists information in the genome creates a regularity in the world that can be exploited by cognitive algorithms that know about phylogeny. In terms of the formalization of causality with directed acyclic graphs pioneered by Judea Pearl and others, an organism’s genome is at the root of the causal graph underlying all other features of an organism:

[Diagram: a causal graph with a “dolphin DNA” node at the root and arrows from it to feature nodes such as “blowhole” and “flippers”.]

In the language of causal graphs, conditioning on the “dolphin DNA” node in the diagram d-separates the paths between the “blowhole” and “flippers” nodes that run through the “dolphin DNA” node. That means that (assuming there aren’t any other paths between “blowhole” and “flippers” that don’t go through “dolphin DNA”) “blowhole” and “flippers” become conditionally independent given “dolphin DNA”: when I see a creature with a blowhole, that makes me more likely to think it’s a dolphin, which makes me more likely to think it has flippers, but given that I already know something is a dolphin, learning more about its flippers doesn’t change my predictions about its blowhole.
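To make the conditional-independence claim concrete, here is a minimal sketch that enumerates a three-node graph (blowhole ← species → flippers) exactly. The probabilities are made-up toy numbers, and collapsing “dolphin DNA” into a two-valued `species` variable is my simplification for illustration, not a claim about real animals.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy numbers, chosen only for illustration.
P_SPECIES = {"dolphin": F(1, 4), "fish": F(3, 4)}
P_BLOWHOLE = {"dolphin": F(99, 100), "fish": F(1, 100)}  # P(blowhole | species)
P_FLIPPERS = {"dolphin": F(95, 100), "fish": F(5, 100)}  # P(flippers | species)


def joint(species, blowhole, flippers):
    """P(species, blowhole, flippers) for the graph blowhole <- species -> flippers."""
    pb = P_BLOWHOLE[species] if blowhole else 1 - P_BLOWHOLE[species]
    pf = P_FLIPPERS[species] if flippers else 1 - P_FLIPPERS[species]
    return P_SPECIES[species] * pb * pf


def prob(event, given=lambda s, b, f: True):
    """P(event | given), by brute-force enumeration of all eight possible worlds."""
    worlds = list(product(P_SPECIES, [True, False], [True, False]))
    numerator = sum(joint(*w) for w in worlds if given(*w) and event(*w))
    denominator = sum(joint(*w) for w in worlds if given(*w))
    return numerator / denominator


def has_flippers(s, b, f):
    return f


# Without knowing the species, a blowhole is evidence about flippers.
p_flippers = prob(has_flippers)
p_flippers_given_blowhole = prob(has_flippers, lambda s, b, f: b)

# Once the species is known, the blowhole carries no further information.
p_flippers_given_dolphin = prob(has_flippers, lambda s, b, f: s == "dolphin")
p_flippers_given_dolphin_and_blowhole = prob(
    has_flippers, lambda s, b, f: s == "dolphin" and b
)

print(p_flippers, p_flippers_given_blowhole)
print(p_flippers_given_dolphin, p_flippers_given_dolphin_and_blowhole)
```

The first pair of numbers differs (seeing a blowhole raises the probability of flippers), while the second pair is identical: conditioning on the root node screens off one feature from the other.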

But conditional independence assertions of this kind are exactly what makes “categorizing” a useful AI technique in the first place. It’s often helpful to visualize this by claiming that entities in the same category belong to a cluster in some configuration space, but this handy visual metaphor is lacking in rigor and well-definedness.

What space? What do the dimensions of this space represent? “Features”? But there are no pre-existing “features” in the world. Assuming the existence of a “space” up front is punting on most of the actual AI challenge. “There’s conditional independence structure in the causal graph” is a meaningfully deeper explanation than “There’s a cluster in configuration space”, because conditional independence is what makes it possible to construct a “space” such that there are clusters. (Though this isn’t a complete explanation: we still need to figure out where the “variables” in the causal graph come from.)

Going beyond the configuration space metaphor is important because it lets us understand how we can learn new things about dolphins that we don’t already know. Dolphins are complicated! Dolphins are complicated in a very specific way. Dolphins are fragile: the shortest computer program that simulates a dolphin requires many bits of initial information, and if you changed some of the bits, you wouldn’t have a dolphin anymore. Complex functional adaptations are universal within a species because each beneficial allele has to reach fixation before there can be selection pressure for the next incremental improvement. That’s why it’s possible to claim that there are 206 bones in “the” human skeleton, even if most humans haven’t had their bones counted. I haven’t been able to find a citation on how many bones dolphins have, but I’m confident that it’s the same number for all or nearly-all members of a particular dolphin species.

But “number of bones” wasn’t one of the dimensions of the “space” that we originally noticed the dolphin cluster in! That’s what the “carving reality at the joints” metaphor means: genetic relatedness is an underlying generator of similarities, that includes the “finned swimmy animals” properties that dolphins and fish have in common, but also includes many more high-dimensional details: how dolphins are warm-blooded, how dolphins have eyelids, the way female dolphins nurse their live-born young, the way male dolphins sometimes gang-rape female dolphins, the way dolphins sleep with only half their brain at a time, the specific bones in the (the!) dolphin skeleton (however many there turn out to be), the way dolphins swim in a circle to trick fish into jumping and being eaten, &c.

In contrast, “finned swimmy animals” is an intrinsically less cohesive subject matter: there are similarities between them due to convergent evolution to the aquatic habitat, and it probably makes sense to want a short word or phrase (perhaps, “sea creatures”) to describe those similarities in contexts where only those similarities are relevant.

But that category “falls apart” very quickly as you consider more and more aspects of the creatures: the finned-swimmy-animals-with-gills are systematically different from the finned-swimmy-animals-with-a-blowhole, in more ways than just the “respiratory organ” feature that I’m using in this sentence to point to the two groups.

A “definition” is just a description that helps someone else pick out “the same” natural abstraction in their own world-model: you can’t pack everything there is to know about dolphins into the definition of the word “dolphin”, in part because we don’t know everything there is to know about dolphins as an empirical regularity in the real world. The “finned swimmy animals” category is less useful to the extent that it fails to compress more information than is contained in its definition. Blood is thicker than water (that is, the similarities induced by shared blood cluster in a “thicker” (higher-dimensional) configuration space than the similarities induced by living in the water).
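One crude way to see what “compressing more information than is contained in its definition” could mean is to count how many features outside a category’s definition are nonetheless shared by every member. The sketch below does that over a tiny hand-written feature table. The table, the feature names, and the `extra_predictions` helper are all hypothetical simplifications of mine, and a serious treatment would use probabilities rather than all-or-nothing agreement.

```python
# Hypothetical toy feature table; a deliberate simplification for illustration.
FISH_LIKE = {"has_fins": True, "lives_in_water": True, "warm_blooded": False,
             "nurses_young": False, "breathes_with_lungs": False}
LAND_MAMMAL = {"has_fins": False, "lives_in_water": False, "warm_blooded": True,
               "nurses_young": True, "breathes_with_lungs": True}
DOLPHIN = {"has_fins": True, "lives_in_water": True, "warm_blooded": True,
           "nurses_young": True, "breathes_with_lungs": True}

ANIMALS = {
    "salmon": FISH_LIKE, "trout": FISH_LIKE, "guppy": FISH_LIKE,
    "dolphin": DOLPHIN,
    "horse": LAND_MAMMAL, "squirrel": LAND_MAMMAL, "monkey": LAND_MAMMAL,
}


def members(definition):
    """Names of all animals whose features match every clause of the definition."""
    return sorted(
        name for name, feats in ANIMALS.items()
        if all(feats[key] == value for key, value in definition.items())
    )


def extra_predictions(definition):
    """Features NOT in the definition that still take one value across all members."""
    group = members(definition)
    extras = {}
    for feature in DOLPHIN:  # every row has the same feature names
        if feature in definition:
            continue
        values = {ANIMALS[name][feature] for name in group}
        if len(values) == 1:
            extras[feature] = values.pop()
    return extras


finned_swimmy = {"has_fins": True, "lives_in_water": True}
mammal = {"nurses_young": True}

print(members(finned_swimmy), extra_predictions(finned_swimmy))
print(members(mammal), extra_predictions(mammal))
```

In this toy table, the two-clause “finned swimmy” definition predicts nothing beyond itself, while the one-clause “mammal” definition also tells you about body temperature and respiration.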

The one replies:

But what if I don’t need to compress any more information than “finned swimmy animals”? If I’m watching a nature documentary, I don’t think I’m being done any favors by having word-structures that group lungfish and lamprey while excluding sea turtles. In general, the concepts I find useful respond to my immediate needs. I care more about “would be at home atop a fruit pizza” rather than “everything anatomically analogous to an apple”. When a child points at a whale and says “look, a fish”, and you’re like “haha no, its tail flaps horizontally and its grandma had hair”, who’s in the wrong here?

In some sense, sure: ignorance isn’t better than knowledge if you don’t care about knowing things. If you live in human civilization and don’t need to carve up the world of aquatic life in much detail—if your use-case for thinking about aquatic animals is watching a nature documentary (for entertainment??) rather than living and working with them every day, then you might think the deeper causal structure isn’t buying you anything. And for you and your extremely limited use-case, maybe it isn’t. But you would likely change your mind if you were a veterinarian or a zoologist who actually had skin in the game in robustly describing this part of the world.

When people have skin in the game, they care about the underlying mechanisms and want short codewords for them, because the underlying mechanisms sometimes have decision-relevant implications. If you hurt your ankle while running, you would probably be interested to know whether it was a sprain or a stress fracture because that affects your decisions about how to recover. You wouldn’t say, “Well, all I know is that my ankle hurts—that’s all a child would know—so I’m going to call it a hurtankle; I don’t care about anatomy.”

You may not be intrinsically curious about anatomy, but even if the only thing you care about is relief from pain and recovering your mobility, you still benefit from living in a Society whose shared ontology distinguishes sprains and stress fractures as being different things in the territory, even if they compress to the same point in your map of how much your ankle hurts right now. And you probably also benefit from living in a Society that can stabilize a shared map of living things based on the facts of evolutionary history, which we can all agree on in the limit of good science, unlike the vagaries of what I personally think tastes good on pizza.

When you think about it, it makes sense that our shared language ends up being optimized for robustly describing reality, rather than catering to the ignorance of people who don’t have reasons to care about whether a particular distinction is actually robust. Personally, I confess I don’t know the difference between alligators and crocodiles, and I don’t particularly need to know: I’m not likely to encounter either outside of a zoo or a nature documentary. But precisely because I don’t need to know, you don’t see me demanding that the rest of the world redefine one of these words as a hypernym that includes both. The people who write encyclopedias seem to think there’s a difference, and since they probably know what they’re doing, it makes sense for their opinion to have more weight on English language common usage than mine—at least until I were to start regularly ending up in situations where I need to point to an alligator-or-crocodile in my environment and I still didn’t notice any differences.

Some animals that I do see in my local environment sometimes are cats and dogs, because people often keep them as pets. I benefit from having separate words (in my map) for cats and dogs, because I can see that cats and dogs are actually different (in the territory). If my pen pal from a faraway land that had no cats were to visit America and encounter a cat for the first time, he might remark, “What a strange dog!” If I were to reply, “Actually, that’s a cat; they’re not the same thing as dogs”, it would be pretty obnoxious if he were to snap back, “What kind of definitional gymnastics is this? It’s a four-legged furry animal with a tail! As far as I’m concerned, it’s a dog.”

It’s true that dogs and cats are both four-legged furry animals with tails. If you had never seen a cat before, or you didn’t spend much time around four-legged furry tailed animals at all, it might not be immediately obvious why someone might want to allocate two words for these subcategories, or why anyone might oppose just using dog to refer to the supercategory. And yet there’s some sense in which my countrymen who think cats and dogs are different things know what they’re doing. My “Actually, that’s a cat” claim represented an attempt to convey information about the statistical structure of creatures in the real world, and my foreign friend’s insistence that he can define a word any way he wants (to suit his ignorance, to avoid challenges to his current ontology) functions to shut down that transfer of information.

But if you don’t know what a better ontology can buy you—if you don’t know that there are mathematical laws governing the use of categories in a rational mind—you may not know what you’re missing. As part of a review of a book on post-traumatic stress disorder, psychiatrist Scott Alexander casually mentions the American Psychiatric Association’s “philosophical commitment to categorizing by symptoms rather than cause”: “[w]hen the APA decides not to [recognize developmental trauma disorder], they’re not necessarily rejecting the seriousness of child abuse, only saying it’s not the kind of thing they build their categories around.”

In a sane world, this would be utterly discrediting to the APA. The cognitive function of categories is to group relevantly similar things together in order to make similar predictions and decisions about them. But for the decisions involved in treating a condition, causes are of supreme relevance! Medical doctors understand this: we consider bacterial and viral infections to be different categories of disease even when they cause similar symptoms, because antibiotics can treat the former but not the latter. No matter what words are used to describe it, at some point your decision algorithm needs to categorize by cause in order to compute the correct treatment: for example, to give antibiotics to the patients with bacterial diseases and antivirals to the patients with viral diseases. If the authoritative body of professional psychiatrists has a “philosophical commitment” against this, that means we don’t have a science of psychiatry.
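The point about decision algorithms can be sketched in a few lines. Assume two hypothetical patients with identical symptoms but different causes, and assume (as in the paragraph above) that antibiotics are right for bacterial disease and antivirals for viral disease. Any policy that only looks at symptoms has to give both patients the same treatment, so it cannot get both right. The patient records and function names here are invented for illustration.

```python
TREATMENTS = ("antibiotics", "antivirals")
CORRECT_TREATMENT = {"bacterial": "antibiotics", "viral": "antivirals"}

# Two hypothetical patients: same symptoms, different underlying causes.
PATIENTS = [
    {"symptoms": ("fever", "cough"), "cause": "bacterial"},
    {"symptoms": ("fever", "cough"), "cause": "viral"},
]


def score(policy):
    """Number of patients for whom the policy picks the correct treatment."""
    return sum(policy(p) == CORRECT_TREATMENT[p["cause"]] for p in PATIENTS)


def by_cause(patient):
    """A policy that categorizes by cause."""
    return CORRECT_TREATMENT[patient["cause"]]


# Both patients share one symptom profile, so every symptom-only policy
# reduces to "always give treatment t" for some fixed t.
symptom_only_policies = [lambda patient, t=t: t for t in TREATMENTS]
best_symptom_only_score = max(score(policy) for policy in symptom_only_policies)

print(score(by_cause), best_symptom_only_score)
```

Whatever vocabulary the symptom-only policy uses, its best achievable score in this toy setup is one patient out of two; the information that separates the cases simply isn’t in its input.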

In short, if you care about making high-quality decisions, mechanisms matter and causality matters, and mechanisms and causality aren’t necessarily pinned down by whatever particular high-level surface analogy happens to seem most salient to a particular human.

The one replies:

Okay, you’ve convinced me that phylogenetics is—potentially—of more than just specialist interest. But “fish” are a paraphyletic category: descended from a common ancestor, but not including all the descendant groups—in this case, excluding the tetrapods (amphibians, reptiles, mammals, birds, &c.). If you’ve decided that you want to use phylogeny as the basis for your definitions, shouldn’t you have the courage of your convictions and only admit monophyletic clades that include all descendants of a common ancestor?

But it’s not that we’ve “decided” that we “want” to define animal words based on phylogeny. Definitions are uninteresting; you can’t change reality by choosing a different definition! When we find structure in the distribution of animals in the world, and we want to come up with a “definition” of a category in order to efficiently point to the structure to someone who doesn’t already know what the words in our language refer to, we’re likely to end up talking about phylogenetics as a convenience, because the creatures that are actually all-around similar are actually related to each other for non-accidental reasons. But there is no principle that it would be hypocritical to betray, that definitions need to be monophyletic clades.

It’s true that paraphyletic groups like fish are evolutionary non-events: there’s no inherited feature that all fish share, that isn’t also shared by the tetrapods. That doesn’t mean we somehow can’t or shouldn’t talk about fish! Paraphyletic categories—descendants of a common ancestor, but excluding one or more monophyletic groups—can make sense when the excluded groups have picked up some salient features not shared by the other “branches” of the family. Tetrapods picked up a lot of adaptations specific to living on land; it’s not crazy to want to talk about their cousins that didn’t do that, even if that means that some fish are more recently related to some tetrapods than they are to some other fish.

Noticing the relevance of evolutionary relatedness to optimal categorization doesn’t mean being slavishly committed to taking “years since last common ancestor” as our only criterion for which creatures are relevantly similar. “Years since last common ancestor” correlates with overall similarity, all other things being equal, but sometimes not all other things are equal, and people who aren’t committed to the fallacy that words need to have a simple definition can take the other things into account.

If someone handed you a phylogenetic tree diagram of the development of life on some alien planet, and the diagram was only labeled with years and species names, without any other information about these alien creatures, you wouldn’t have enough information to “carve it at the joints”. You wouldn’t spontaneously invent a paraphyletic grouping, but you also wouldn’t know which monophyletic groups are most significant.

In contrast, when classifying life on Earth, we’re not in the position of making arbitrary cuts on an unlabeled tree diagram; rather, it’s only after thousands of person-years of studying the natural world that people were able to infer things about evolutionary history and discover the correct diagram.

It shouldn’t be that surprising that the distinctions we notice in the natural world are tied to evolutionary history but don’t always correspond to monophyletic clades. The continuity postulate in the evolutionary worldview imposes the desideratum that good categories should at least be a connected set in “phylogenetic space”, not that we should never want to talk about “this clade, except for these few sub-clades that picked up a lot of important differences” as a category of interest, especially when talking about present-day creatures. (We talk about “last common ancestors”, but no one has seen such creatures that lived millions of years ago; everything but the very leaves of the phylogenetic tree is inferred, not observed.)

The claim that dolphins shouldn’t be considered “fish” because the alleged “courage of our convictions” should make us disdain paraphyletic categories only makes sense as an attempted reductio ad absurdum, not as a consistent argument on its own terms: putting dolphins and fish together would be polyphyletic! That’s even worse! But as has just been explained, the reductio fails because the alleged principle being allegedly violated was never actually a principle of category formulation.

You know what else are paraphyletic taxa? Monkeys (excludes apes, even though the common ancestor of monkeys and apes was a monkey). Reptiles (excludes birds, even though the common ancestor of birds was a reptile). Protists (excludes animals, plants, and fungi, even though their common ancestor would have been a protist). Prokaryotes (excludes eukaryotes, even though the common ancestor of eukaryotes would have been a prokaryote). These are pretty commonsensical categories that it makes sense to have words for! But because of the continuity of evolution, it’s not a coincidence that these commonsensical categories that people want words for ended up being connected sets in phylogenetic space.
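The “connected set in phylogenetic space” criterion is easy to state as code. Below is a minimal sketch on a hypothetical, heavily simplified vertebrate tree of my own construction, where a category is given as a set of tree nodes (inferred ancestors included). As a working simplification, I call a group monophyletic if it is connected and contains all descendants of its members, paraphyletic if it is connected but leaves some descendants out, and polyphyletic if it is not connected at all. Real systematics draws these lines with more care than this.

```python
# Hypothetical, heavily simplified tree: child -> parent.
# Nodes ending in "_ancestor" (and the root) stand for inferred ancestors.
PARENT = {
    "shark": "early_vertebrate",
    "bony_fish_ancestor": "early_vertebrate",
    "salmon": "bony_fish_ancestor",
    "lobe_fin_ancestor": "bony_fish_ancestor",
    "lungfish": "lobe_fin_ancestor",
    "tetrapod_ancestor": "lobe_fin_ancestor",
    "frog": "tetrapod_ancestor",
    "mammal_ancestor": "tetrapod_ancestor",
    "horse": "mammal_ancestor",
    "dolphin": "mammal_ancestor",
}


def classify(group):
    """Classify a set of tree nodes by connectedness and closure under descent."""
    group = set(group)
    if not group:
        raise ValueError("group must contain at least one node")
    # In a rooted tree, each connected piece of the group has exactly one
    # topmost node: a member whose parent is missing from the group.
    pieces = sum(1 for node in group if PARENT.get(node) not in group)
    if pieces != 1:
        return "polyphyletic"
    # Closed under descent: every child of a member is also a member.
    closed = all(
        parent not in group or child in group for child, parent in PARENT.items()
    )
    return "monophyletic" if closed else "paraphyletic"


FISH = {"early_vertebrate", "shark", "bony_fish_ancestor", "salmon",
        "lobe_fin_ancestor", "lungfish"}
MAMMALS = {"mammal_ancestor", "horse", "dolphin"}
FINNED_SWIMMY = FISH | {"dolphin"}

print(classify(FISH), classify(MAMMALS), classify(FINNED_SWIMMY))
```

On this toy tree, “fish” comes out connected but not closed (the tetrapod branch is cut off), “mammals” comes out as a full clade, and adding the dolphin to the fish breaks connectedness, since the ancestors linking the dolphin back to the lobe-finned branch are not in the group.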

The one replies:

Not all of them did, though! “Fish” used to just mean the swimmy animals: in the Bible, Jonah was swallowed by a “great fish”, thought to be a whale. It was only after we figured out genealogy that some pedants decided that whales didn’t count.

But the claim that the distinction between fish and cetaceans (dolphins and whales) was only recognized after their differing evolutionary histories were discovered is just false to historical fact. Aristotle, writing in the fourth century BCE, already distinguished cetaceans from fish (“Very extensive genera of animals, into which other subdivisions fall, are the following: one, of birds; one, of fishes; and another, of cetaceans”). Aristotle was not being a phylogenetics pedant, because Aristotle did not know about evolution! He actually noticed the differences!

The pattern generalizes. Some determined contrarians might be inclined to argue “bats are birds” (flappy flying animals) on the same grounds as “dolphins are fish” (finned swimmy animals). But did you know the German word for bat is Fledermaus (“flutter mouse”), which dates back to fledarmūs in Old High German? Apparently, people way back in the tenth century or so (also long before evolution was understood) already thought bats were like a mammal-that-happened-to-fly rather than a bird-that-happened-to-be-furry.

Similarly, we recognize ostriches and penguins as birds on the basis of overall similarity, even though they don’t fly (although we may sometimes qualify them as “flightless birds”, in recognition of the fact that most birds fly). It would seem that “flappy flying animals” is not the common usage meaning of bird.

To be sure, convergent evolution is a thing, such that sometimes we might want short codewords that point to the cluster-structure-produced-by-convergent-evolution rather than the conditional-independence-structure-produced-by-connectedness-in-phylogenetic-space—trees, and possibly crabs, are a case in point. But it’s important to notice the difference—to see through to the inferences your concepts are buying you—and what gets lost when you try to reason in a domain where your concept falls apart.

The power to define concepts is the power to delimit thought, to determine what kinds of inferences are easily representable. Finding the right concepts to explain and control the world we see is a fundamentally empirical challenge, a scientific challenge—to see the difference between things that seem similar and to see the similarities between things which seem different.

But although the quest is an empirical one—something that can only be achieved by studying what’s out there, not just by writing blog posts about philosophy—it turns out that a little bit of philosophy is necessary to ground the rules of the investigation. Not much. Just the basics. The map–territory distinction. Probability, clustering. Conditional independence.

Maybe someday it could be possible to have a real science of psychiatry that reflects the actual structure of the mind, instead of doing the equivalent of lumping sprains and stress fractures together as hurtankles. Maybe even greater achievements are possible. Personally, I’m not optimistic about humanity’s prospects.

I’m sure of one thing, though. If there is a better world out there, a way to unlock the secrets of the universe and wield them in the service of our values, it’s only possible if we stop playing nitwit games and admit that dolphins don’t belong on the fish list.

(Thanks to Tailcalled for the “root of the causal graph” observation and John S. Wentworth for explaining the importance of conditional independence.)