The Cluster Structure of Thingspace

The notion of a “configuration space” is a way of translating object descriptions into object positions. It may seem like blue is “closer” to blue-green than to red, but how much closer? It’s hard to answer that question by just staring at the colors. But it helps to know that the (proportional) color coordinates in RGB are 0:0:5, 0:3:2 and 5:0:0 respectively. It would be even clearer if plotted on a 3D graph.
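To put a number on “how much closer”, here is a minimal sketch that treats the three proportional RGB triples above as points and measures straight-line distance between them. The choice of Euclidean distance is my assumption for illustration; RGB is not perceptually uniform, so this is a rough geometric reading rather than a claim about how the eye sees color.

```python
import math

# Proportional RGB coordinates quoted in the text.
blue = (0, 0, 5)
blue_green = (0, 3, 2)
red = (5, 0, 0)

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.dist(p, q)

d_blue_bluegreen = euclidean(blue, blue_green)
d_blue_red = euclidean(blue, red)

print(f"blue to blue-green: {d_blue_bluegreen:.2f}")
print(f"blue to red:        {d_blue_red:.2f}")
```

Under this metric blue sits about 4.24 units from blue-green and about 7.07 units from red, so the intuition holds, and now it has a size.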

In the same way, you can see a robin as a robin—brown tail, red breast, standard robin shape, maximum flying speed when unladen, its species-typical DNA and individual alleles. Or you could see a robin as a single point in a configuration space whose dimensions described everything we knew, or could know, about the robin.

A robin is bigger than a virus, and smaller than an aircraft carrier—that might be the “volume” dimension. Likewise a robin weighs more than a hydrogen atom, and less than a galaxy; that might be the “mass” dimension. Different robins will have strong correlations between “volume” and “mass”, so the robin-points will be lined up in a fairly linear string, in those two dimensions—but the correlation won’t be exact, so we do need two separate dimensions.

This is the benefit of viewing robins as points in space: You couldn’t see the linear lineup as easily if you were just imagining the robins as cute little wing-flapping creatures.

A robin’s DNA is a highly multidimensional variable, but you can still think of it as part of a robin’s location in thingspace—millions of quaternary coordinates, one coordinate for each DNA base—or maybe a more sophisticated view than that. The shape of the robin, and its color (surface reflectance), you can likewise think of as part of the robin’s position in thingspace, even though they aren’t single dimensions.

Just as the coordinate point 0:0:5 contains the same information as the actual HTML color blue, we shouldn’t actually lose information when we see robins as points in space. We believe the same statement about the robin’s mass whether we visualize a robin balancing the scales opposite a 0.07-kilogram weight, or a robin-point with a mass-coordinate of +70 (in grams).

We can even imagine a configuration space with one or more dimensions for every distinct characteristic of an object, so that the position of an object’s point in this space corresponds to all the information in the real object itself. Rather redundantly represented, too—dimensions would include the mass, the volume, and the density.

If you think that’s extravagant, quantum physicists use an infinite-dimensional configuration space, and a single point in that space describes the location of every particle in the universe. So we’re actually being comparatively conservative in our visualization of thingspace—a point in thingspace describes just one object, not the entire universe.

If we’re not sure of the robin’s exact mass and volume, then we can think of a little cloud in thingspace, a volume of uncertainty, within which the robin might be. The density of the cloud is the density of our belief that the robin has that particular mass and volume. If you’re more sure of the robin’s density than of its mass and volume, your probability-cloud will be highly concentrated in the density dimension, and concentrated around a slanting line in the subspace of mass/volume. (Indeed, the cloud here is actually a surface, because of the relation VD = M.)
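The probability-cloud can be sketched by sampling. The snippet below assumes a hypothetical belief state in which density is known tightly and volume only loosely; every number in it is an illustrative placeholder, not a measurement of any robin. Mass is then fixed by VD = M, so each sample lies on that surface, and in the mass/volume plane the samples hug a line whose slope is roughly the density.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical belief state (placeholder numbers, not measurements):
# density is known fairly well, volume is not.
DENSITY_MEAN, DENSITY_SD = 1.0, 0.02   # grams per cubic centimetre
VOLUME_MEAN, VOLUME_SD = 70.0, 10.0    # cubic centimetres

samples = []
for _ in range(1000):
    density = random.gauss(DENSITY_MEAN, DENSITY_SD)
    volume = random.gauss(VOLUME_MEAN, VOLUME_SD)
    mass = volume * density            # the constraint VD = M
    samples.append((mass, volume, density))

def relative_spread(values):
    """Standard deviation as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)

mass_spread = relative_spread([m for m, _, _ in samples])
volume_spread = relative_spread([v for _, v, _ in samples])
density_spread = relative_spread([d for _, _, d in samples])

# Average mass-to-volume ratio: the slope of the slanting line.
slope = statistics.mean(m / v for m, v, _ in samples)

print(f"relative spread: mass {mass_spread:.3f}, "
      f"volume {volume_spread:.3f}, density {density_spread:.3f}")
print(f"slope of mass against volume: {slope:.3f}")
```

The cloud is thin along the density axis and wide along mass and volume, which is the shape the paragraph describes.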

“Radial categories” are how cognitive psychologists describe the non-Aristotelian boundaries of words. The central “mother” conceives her child, gives birth to it, and supports it. Is an egg donor who never sees her child a mother? She is the “genetic mother”. What about a woman who is implanted with a foreign embryo and bears it to term? She is a “surrogate mother”. And the woman who raises a child that isn’t hers genetically? Why, she’s an “adoptive mother”. The Aristotelian syllogism would run, “Humans have ten fingers, Fred has nine fingers, therefore Fred is not a human”, but the way we actually think is “Humans have ten fingers, Fred is a human, therefore Fred is a ‘nine-fingered human’.”

We can think about the radial-ness of categories in intensional terms, as described above—properties that are usually present, but optionally absent. If we thought about the intension of the word “mother”, it might be like a distributed glow in thingspace, a glow whose intensity matches the degree to which that volume of thingspace matches the category “mother”. The glow is concentrated in the center of genetics and birth and child-raising; the volume of egg donors would also glow, but less brightly.
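One crude way to make the intensional glow concrete is to score each case by how many of the central properties it has. This is only a sketch: the property names, the equal weighting, and the `glow` function are my assumptions for illustration, not a model taken from the cognitive psychology literature.

```python
# Properties of the central case of "mother", following the text.
PROTOTYPE = {"genetics", "birth", "child_raising"}

# Which of those properties each case from the text has.
cases = {
    "central mother": {"genetics", "birth", "child_raising"},
    "genetic mother": {"genetics"},
    "surrogate mother": {"birth"},
    "adoptive mother": {"child_raising"},
}

def glow(properties):
    """Fraction of prototype properties present (hypothetical score)."""
    return len(properties & PROTOTYPE) / len(PROTOTYPE)

scores = {name: glow(props) for name, props in cases.items()}

# Properties shared by every case: candidates for a necessary condition.
shared_by_all = set.intersection(*cases.values())

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("shared by all cases:", shared_by_all)
```

The central case glows at full intensity and the others more dimly, yet the set of properties shared by all four cases is empty: usually present, optionally absent.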

Or we can think about the radial-ness of categories extensionally. Suppose we mapped all the birds in the world into thingspace, using a distance metric that corresponds as well as possible to perceived similarity in humans: A robin is more similar to another robin, than either is similar to a pigeon, but robins and pigeons are all more similar to each other than either is to a penguin, etcetera.

Then the center of all birdness would be densely populated by many neighboring tight clusters, robins and sparrows and canaries and pigeons and many other species. Eagles and falcons and other large predatory birds would occupy a nearby cluster. Penguins would be in a more distant cluster, and likewise chickens and ostriches.

The result might look, indeed, something like an astronomical cluster: many galaxies orbiting the center, and a few outliers.

Or we could think simultaneously about both the intension of the cognitive category “bird”, and its extension in real-world birds: The central clusters of robins and sparrows glowing brightly with highly typical birdness; satellite clusters of ostriches and penguins glowing more dimly with atypical birdness, and Abraham Lincoln a few megaparsecs away and glowing not at all.

I prefer that last visualization—the glowing points—because as I see it, the structure of the cognitive intension followed from the extensional cluster structure. First came the structure-in-the-world, the empirical distribution of birds over thingspace; then, by observing it, we formed a category whose intensional glow roughly overlays this structure.
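If the glow is supposed to overlay the empirical distribution, one simple stand-in for it is a kernel density estimate: place a small Gaussian bump on every observed point and add them up. The sketch below does this for a toy two-dimensional thingspace. The coordinates, the bandwidth, and the use of a Gaussian kernel are all assumptions of mine, chosen only to show the shape of the idea.

```python
import math

# Hypothetical 2D coordinates in a toy thingspace; all values are made up.
central_cluster = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.3),
    (0.3, -0.2), (-0.2, -0.1), (0.1, -0.3),
]
satellite_cluster = [(3.0, 0.0), (3.2, 0.1)]
observed_birds = central_cluster + satellite_cluster

def glow(point, observations, bandwidth=1.0):
    """Sum of Gaussian kernels centred on each observation (unnormalised)."""
    total = 0.0
    for obs in observations:
        squared_distance = math.dist(point, obs) ** 2
        total += math.exp(-squared_distance / (2 * bandwidth ** 2))
    return total

typical = glow((0.0, 0.0), observed_birds)     # middle of the dense cluster
atypical = glow((3.0, 0.0), observed_birds)    # the small satellite cluster
far_away = glow((50.0, 50.0), observed_birds)  # nowhere near any bird

print(f"typical: {typical:.3f}")
print(f"atypical: {atypical:.3f}")
print(f"far away: {far_away:.3f}")
```

The glow is brightest where the observed points are densest, dimmer at the satellite cluster, and effectively zero far from every observation. The structure of the glow comes entirely from where the points already are.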

This gives us yet another view of why words are not Aristotelian classes: the empirical clustered structure of the real universe is not so crystalline. A natural cluster, a group of things highly similar to each other, may have no set of necessary and sufficient properties—no set of characteristics that all group members have, and no non-members have.

But even if a category is irrecoverably blurry and bumpy, there’s no need to panic. I would not object if someone said that birds are “feathered flying things”. But penguins don’t fly!—well, fine. The usual rule has an exception; it’s not the end of the world. Definitions can’t be expected to exactly match the empirical structure of thingspace in any event, because the map is smaller and much less complicated than the territory. The point of the definition “feathered flying things” is to lead the listener to the bird cluster, not to give a total description of every existing bird down to the molecular level.
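Here is a small sketch of that definition at work. It applies the rule “feathered flying thing” to a handful of hand-picked examples and lists where the rule disagrees with the category. The example list and the two yes/no properties are my own simplification for illustration.

```python
# Hand-picked examples with two yes/no properties (a deliberate simplification).
things = {
    "robin":   {"feathered": True,  "flies": True,  "bird": True},
    "sparrow": {"feathered": True,  "flies": True,  "bird": True},
    "canary":  {"feathered": True,  "flies": True,  "bird": True},
    "pigeon":  {"feathered": True,  "flies": True,  "bird": True},
    "eagle":   {"feathered": True,  "flies": True,  "bird": True},
    "penguin": {"feathered": True,  "flies": False, "bird": True},
    "ostrich": {"feathered": True,  "flies": False, "bird": True},
    "bat":     {"feathered": False, "flies": True,  "bird": False},
}

def feathered_flying_thing(props):
    """The simple intensional rule from the text."""
    return props["feathered"] and props["flies"]

# Cases where the rule and the actual category disagree.
exceptions = sorted(
    name for name, props in things.items()
    if feathered_flying_thing(props) != props["bird"]
)
agreements = len(things) - len(exceptions)

print("exceptions:", exceptions)
print(f"rule agrees on {agreements} of {len(things)} examples")
```

The rule misses the penguin and the ostrich and still lands on the central cluster, which is all the definition was ever for.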

When you draw a boundary around a group of extensional points empirically clustered in thingspace, you may find at least one exception to every simple intensional rule you can invent.

But if a definition works well enough in practice to point out the intended empirical cluster, objecting to it may justly be called “nitpicking”.