# Categories: models of models

Let me clar­ify what I mean when I say that math con­sists of nouns and verbs. Think about el­e­men­tary school math­e­mat­ics like ad­di­tion and sub­trac­tion. What you learn to do is take a bunch of nouns—1, 2, 3, etc.—and a bunch of verbs—ad­di­tion, sub­trac­tion—and make sen­tences. “1 + 2 = 3.”

When you make a sentence like that, what you’re doing is taking an object, 1, and observing how it changes when it interacts (specifically, adds) with another object, 2. You observe that it becomes a 3. Just like how you can observe a person (object) bump their head (interaction) into a wall (other object) and the corresponding change (equals sign): the bluish bump emerging on their forehead.

Well, it turns out that no mat­ter how far you go in math, you’re still tak­ing ob­jects and ob­serv­ing how they change when they in­ter­act with other ob­jects. You learn about other kinds of num­bers like 2.5 and 3.333 re­peat­ing, and other kinds of in­ter­ac­tions like mul­ti­pli­ca­tion and di­vi­sion.

Even­tu­ally things start get­ting more ab­stract. You learn about ma­tri­ces and the kinds of in­ter­ac­tions they can have. You learn about sets and func­tions in a deep way. Com­plex num­bers, topolo­gies, etc.

But you never get away from nouns and verbs.

What are we go­ing to do with all of these nouns and verbs? Well, at some point, when you have a bunch of things that feel similar in an im­por­tant way, you of­ten want to take a step back, lump all of the in­di­vi­d­ual things to­gether, and talk about them in gen­eral. Why do we have the con­cept of fruit? Be­cause we live in a world where ap­ples, or­anges, lemons, pa­payas, etc., ex­ist.

Why do we have the con­cept of num­bers? Be­cause we have a bunch of them, not just 1, 2, and 3, but also 2.5, 3.333 re­peat­ing, ir­ra­tional num­bers, and even com­plex num­bers. Why do we have the con­cept of “op­er­a­tions?” Be­cause we have ad­di­tion, sub­trac­tion, mul­ti­pli­ca­tion, and di­vi­sion. Even though all of these give you very differ­ent an­swers for the same in­put of num­bers, (1 + 2 does not equal 1 − 2 does not equal 1 times 2 does not equal 1 di­vided by 2), they still have some­thing very similar in com­mon, so it’s worth com­ing up with the con­cept of “op­er­a­tion” to study them in­de­pen­dently of their in­di­vi­d­ual char­ac­ter­is­tics. Just like how we can talk about “fruit” with­out hav­ing to refer­ence the shape of an ap­ple.

If you lived in a world with only ap­ples, you wouldn’t have the con­cept of fruit. If you lived in a world where the only kind of thing was a rock, you wouldn’t have the con­cept of nouns. If you lived in a world where the only kind of math-thing was num­bers, cat­e­gory the­o­rists wouldn’t have come up with the con­cept of ob­jects. (They wouldn’t have come up with cat­e­gory the­ory at all!)

And what is the use of gen­er­al­iza­tion? A bird’s-eye view of some­thing lets you see things from new per­spec­tives, po­ten­tially chang­ing your con­cept of the thing en­tirely. When you think of num­bers as var­i­ous ways you can count stuff on your fingers, you can inch your way past the nat­u­ral num­bers to nega­tives, frac­tions and ir­ra­tional num­bers. But com­plex num­bers will throw you for a to­tal loop—they just don’t seem to be re­lat­able to “amounts of things” in a way that you’re ever likely to see in the world.

In fact, com­plex num­bers made no sense to me un­til I learned what num­bers ac­tu­ally are, at which point ev­ery­thing clicked into place. The gen­er­al­iza­tion helped me un­der­stand the spe­cific case—com­plex num­bers are just an­other type of num­ber, like how ap­ples are just an­other type of fruit.

Gen­er­al­iza­tion is helpful when life stops throw­ing con­ve­nient ex­am­ples at you.

For ex­am­ple, say you want to grow new kinds of fruit that have never ex­isted. Hav­ing a con­cept of fruit is nec­es­sary to con­ceiv­ing of that idea. Life’s not go­ing to give you ex­am­ples of fruit that have never ex­isted! You have to ex­plore the con­cep­tual space of all fruit.

(In fact, this ex­pe­rience with com­plex num­bers, years ago, prob­a­bly in­spired these posts. I got the idea that if you just both­ered to take a re­ally long time to ex­plain the math­e­mat­ics, the av­er­age per­son could prob­a­bly grasp quan­tum physics. This se­ries is the cat­e­gory-the­ory ver­sion of that origi­nal idea.)

Cat­e­gory the­ory ex­ists for the same rea­son the con­cept of fruit does: there are lots of in­di­vi­d­ual things that have cer­tain com­mon­al­ities that make gen­er­al­iza­tion an ap­peal­ing idea. Ap­ples and or­anges are very differ­ent on the sur­face and very similar deep down. Nat­u­ral num­bers and com­plex num­bers are very differ­ent on the sur­face and very similar deep down.

Category theory goes one step wider. It emerges when people look at entire fields of math like “algebra” and “topology” and notice that, while they’re very different on the surface, they seem to be very similar deep down. Many other fields of mathematics seemed to also share these deep similarities, and so gradually they all became mere examples of the generalization, that generalization being a category. (A category being something we’ll define by the end of this post.) Just like how apples and oranges become mere examples of the generalization that is “fruit.”

And one of those com­mon­al­ities is that all of these su­perfi­cially dis­parate fields of math­e­mat­ics study things and in­ter­ac­tions be­tween those things. I.e., they study ob­jects and mor­phisms.

That might sound re­ally gen­eral. And it is! And yet, just like with the gen­eral defi­ni­tion of fields that lets us un­der­stand com­plex num­bers, we can learn re­ally in­ter­est­ing things from this su­per-gen­eral per­spec­tive. (Speci­fi­cally, the Yoneda lemma and ad­junc­tion.)

But right now you are hav­ing to take my word for the idea that many differ­ent fields of math can be thought of as study­ing “nouns and verbs.” So let’s look at things from a differ­ent per­spec­tive.

Even if you don’t know higher maths, you prob­a­bly know things like, “pour­ing milk in my ce­real will make my ce­real soggy.”

So “milk + ce­real = soggy ce­real.” Seems awfully...math­e­mat­i­cal.

Why does math ex­ist, any­way? Well, there’s lots of ways to an­swer that ques­tion, so let’s rephrase it: why do math­e­mat­i­ci­ans ex­ist? Or even bet­ter, why do math­e­mat­i­ci­ans get paid? It cer­tainly isn’t for the joy of do­ing math. In­stead, “math­e­mat­i­cian” is a job that you can ac­tu­ally get paid to do be­cause math is very use­ful for mod­el­ing our re­al­ity.

So why does math boil down to ob­jects and mor­phisms so of­ten? Prob­a­bly for the same rea­son English boils down to nouns and verbs: we use lan­guage to dis­cuss re­al­ity, and re­al­ity seems to boil down to nouns and verbs.

Take birds, for ex­am­ple. They are birds, so they’re nouns (ob­jects). And they do stuff like fly, tweet, lay eggs, eat, etc. I.e., verbs (mor­phisms).

What­ever you may or may not know of maths, you definitely know a thing or two about re­al­ity. You’ve been liv­ing in it your whole life.

The rest of this post will take com­mon sense ideas about cre­at­ing mod­els of our re­al­ity, break them down into their most ab­stract com­po­nents, and end up with some sim­ple rules that any model should fol­low. We’ll dis­cover that those com­po­nents and rules are ex­actly the com­po­nents and rules that define a cat­e­gory.

***

You know a lot of mod­els, even if most of them would never make it into a text­book. You use these mod­els to nav­i­gate re­al­ity.

For ex­am­ple, a sen­tence like “Alice pushes Bob” is a model of re­al­ity. By them­selves, those let­ters in that or­der are mean­ingless. But be­cause you can use them to make pre­dic­tions (speci­fi­cally, you’re in­fer­ring the speaker’s state of mind), the sen­tence is cor­re­spond­ingly a model of (that par­tic­u­lar part of) re­al­ity.

Sen­tences them­selves ex­ist, so we can model them too. You can think of a sen­tence struc­ture as a way of mod­el­ing a sen­tence. As for mod­els them­selves, a model is a way of ab­stract­ing away from some de­tails of what­ever you’re study­ing so that other fea­tures be­come more salient. For ex­am­ple, you can model a spe­cific cat, Mr. Peanuts, as just a cat, ne­glect­ing spe­cific fea­tures like his name, his color, etc., so that other, more gen­eral fea­tures of cats, like the ten­dency to meow, eat tuna fish, and ut­terly de­spise hu­mans be­come more salient.

That is to say, what will Mr. Peanuts do in [spe­cific situ­a­tion]? Hard to say un­less you know him per­son­ally. But if we ask, “What will a cat do in [spe­cific situ­a­tion]?” you might have a de­cent guess. The cat’s not go­ing to start bark­ing, for one.

You have a model of cats in­side your head, and this lets you make pre­dic­tions, at the ex­pense of speci­fic­ity.

Sen­tence struc­ture mod­els spe­cific sen­tences like “Alice pushes Bob” in terms of their ab­stract com­po­nents: “Noun verbs noun.”

You might be sur­prised to learn that you can make pre­dic­tions with sen­tence struc­ture. For ex­am­ple, one of the rules of gram­mar is that a prepo­si­tional phrase ends with a noun. So let’s say I ask you, “Who will the soc­cer player kick the ball to?” You don’t know the an­swer—I haven’t even given you a list of peo­ple to choose from.

Let’s also say that you don’t know what the word “who” means, so you don’t even know what I’m ask­ing. In fact, let’s say you don’t know what any of the words in the sen­tence means, only their parts of speech. You cer­tainly aren’t go­ing to give the spe­cific an­swer I’m look­ing for.

But you do know sen­tence struc­ture! You know that “to” is a prepo­si­tion, so the phrase it opens up must end with a noun.

So who is the soc­cer player go­ing to kick the ball to? An­swer: a per­son, place, or thing—a noun.

This is not a brilli­ant an­swer. But it is an an­swer—a cor­rect one, in fact! This su­per ab­stract model of sen­tences let you figure things out even when you didn’t know what any of the spe­cific words meant.

So sen­tences are mod­els, and sen­tence struc­ture is a model of a model, and mod­els are re­ally use­ful: they let you pre­dict things.

But sen­tence struc­ture is a model of just one kind of model—sen­tences them­selves. What if we wanted a model of all kinds of differ­ent mod­els—sen­tences, sen­tence struc­ture, sci­en­tific mod­els, mod­els from sen­sory data, math­e­mat­i­cal mod­els, ev­ery­thing? What if we wanted to model mod­el­ing it­self?

Why, we’d turn to cat­e­gory the­ory.

So what are the gen­eral qual­ities of all mod­els?

First of all, every model has things (objects) and actions that the objects do to other objects (morphisms). For example, you can model a person by seeing them. That is to say, you hit them with some photons and form a picture of them based on the result. Every model is ultimately found in how one object changes another object, hence the term “morphism.”

(Notice how sentences, which themselves boil down to nouns and verbs, can be thought of as models of anything. After all, it’s hard to imagine any kind of phenomenon that can’t be expressed in a sentence, even if you have to make up some new words to do it. Effectively, the reason we use category theory to form models of models of everything instead of just sticking with English is that we can define our category as obeying certain rules or axioms that correspond to how models ought to work, while English can say pretty much anything. The rest of this post discusses these rules.)

So cat­e­gories—mod­els of mod­els—con­sist of ob­jects and mor­phisms. But what are the rules we re­quire our ob­jects and mor­phisms to obey so that they match up with our ideas of how mod­els of re­al­ity ought to work? One is that ev­ery ob­ject ought to be, in some sense, a perfect model of it­self.

Think about what mod­els do: they trans­form how we view ob­jects. If I tell you some­one is a mur­derer, this changes how you view them be­cause you ap­ply a differ­ent model. Loosely, you could say that you ap­ply the “mur­derer” mor­phism to the per­son in ques­tion. Well, ev­ery ob­ject ought to have a model that you can ap­ply that doesn’t trans­form the ob­ject at all—not be­cause you aren’t trans­form­ing, but be­cause there’s no effect, like mul­ti­ply­ing by 1. For ex­am­ple, if you ap­ply the “Mr. Peanuts” mor­phism to Mr. Peanuts, this shouldn’t change any­thing: Mr. Peanuts is already Mr. Peanuts. This mor­phism is, in a cer­tain clear sense, the iden­tity of Mr. Peanuts.

So perhaps unsurprisingly, in category theory, we say that every object has an identity morphism that “does nothing.” For an arbitrary object $A$, its identity morphism is written $1_A$. That’s because it’s like multiplying by 1: you are multiplying, it’s just that nothing changes. (But just like multiplying both sides by 1 helps you solve equations, the identity morphism is extremely useful.)

What does it ac­tu­ally mean, math­e­mat­i­cally, for the iden­tity mor­phism to “do noth­ing?” To an­swer that, let’s look at an­other re­quire­ment we’d ex­pect of a gen­eral model of mod­els: com­po­si­tion­al­ity.

Say Alice pushes Bob, and Bob bumps into Charles. Let’s diagram this as $A \xrightarrow{f} B \xrightarrow{g} C$. Hopefully the interpretation of this diagram is obvious. We could say that Alice affects Bob, and Bob affects Charles; that is to say, Bob’s state is a model of Alice’s action on Bob, and Charles’s state is a model of Bob’s action on Charles. (I know we used a different symbol to stand for push last time, but $f$ and $g$ are the typical “generic morphism” symbols in category theory, and it’s better to get used to using generic symbols than trying to name them after things all the time. Here, $f$ is Alice pushing Bob, and $g$ is Bob bumping into Charles.)

But we also might have an in­tu­ition that Alice af­fects Charles. Sure, she does so in­di­rectly through Bob, but clearly Charles’s state is a model of Alice’s ac­tions—ac­tions which are in some sense ac­tions on Charles.

That is to say, Alice pushed Charles just as clearly as she pushed Bob.

Effect flows through. We’re all just models of the original state and laws of the universe, whatever they were. So whenever we have something like $A \xrightarrow{f} B \xrightarrow{g} C$, we should really expect to also have something like $A \xrightarrow{h} C$. Moreover, this morphism should be equal to what happened when Alice pushed Bob and then Bob bumped into Charles. After all, that is what Alice did to Charles.

Let’s use the symbol $\circ$ to mean “following.” So if we write $g \circ f$, that reads as “g following f.” (Yes, that means you read it right to left, in effect. Get used to it; every field has at least one counterintuitive piece of notation.) We would then say that $h = g \circ f$. More to the point, the idea is that whenever you have something that looks like $A \xrightarrow{f} B \xrightarrow{g} C$, regardless of what these objects and morphisms are meant to stand in for, then you necessarily have $h: A \to C$ such that $h = g \circ f$.

This is what com­po­si­tion looks like as a di­a­gram. (The back­wards E means “there ex­ists,” and the dot­ted line, re­dun­dantly, means the same thing.)

An ex­am­ple of that you are hope­fully fa­mil­iar with is func­tion com­po­si­tion. (We’ll cover it in an­other post any­way.) Say you have You should know how to eval­u­ate this: first, square the , then mul­ti­ply it by .

Let’s see how this is ex­actly like what we did be­fore. You can see it like this: say that stands for the ac­tion of squar­ing the (it’s a trans­for­ma­tion—a mor­phism). And the other mor­phism stands for mul­ti­ply­ing it by .

If you’re given the prob­lem of de­ter­min­ing what you have if you do fol­low­ing , you can write that like . Plug­ging in the ac­tual equa­tions, you are told to “mul­ti­ply by fol­low­ing squar­ing .” Which is ex­actly what you did.

If you were told to de­ter­mine in­stead, you should be able to see that this tells you to “square fol­low­ing mul­ti­ply­ing by .” I.e., .
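As a concrete sketch of the two orders of composition described above, here is a small Python example. The helper `compose` and the function names `square` and `double` are hypothetical, and the factor of 2 is an assumption chosen purely for illustration.

```python
def compose(g, f):
    """Return "g following f": apply f first, then g."""
    return lambda x: g(f(x))

def square(x):
    # the "squaring" morphism
    return x * x

def double(x):
    # the "multiply by a constant" morphism; the constant 2 is assumed
    return 2 * x

double_after_square = compose(double, square)  # x -> 2 * x**2
square_after_double = compose(square, double)  # x -> (2 * x)**2

print(double_after_square(3))  # 18
print(square_after_double(3))  # 36
```

The two composites disagree on the same input, which is why the right-to-left reading order of “following” matters.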

The symbol $\circ$ in fact tells you that you are composing one morphism with another (exactly how composition works is defined by the category), and the requirement that $A \xrightarrow{f} B \xrightarrow{g} C$ implies $h: A \to C$ such that $h = g \circ f$ is called compositionality. Every category has a rule of composition.

Com­po­si­tion is the real ac­tion of any cat­e­gory. If you know the rule of com­po­si­tion, you know how the cat­e­gory works in­ter­nally. (Think about the con­cept of the laws of physics, the view of the world as a chain of events such that we can pre­dict our cur­rent state from the Big Bang. Un­der­stand­ing that rule of com­po­si­tion is the the­ory of ev­ery­thing, the holy grail of physics.)

Now that we know what composition is, we can use it to rigorously define how the identity morphism “does nothing.” Suppose you have an object $A$ and the identity morphism $1_A$. Say you also have a morphism $f: A \to B$. Since $B$ is an object, it has an identity morphism as well, $1_B$.

The identity morphism $1_A$ is a morphism “on $A$.” What does that mean? It means that $1_A$ goes from $A$ to $A$. It goes nowhere, in effect, which is just what we’d expect of an identity morphism.

So we have a morphism $1_A: A \to A$. We also have a morphism $f: A \to B$, as stated. Well, look at this! According to the rule of composition, we must have a morphism $h: A \to B$ such that $h = f \circ 1_A$.

But we already had a morphism going from $A$ to $B$: $f$ itself. The identity morphism does nothing, so it shouldn’t have any effect on what we already had. This is precisely defined by the following condition:

$$f \circ 1_A = f$$

That is to say, doing $f$ after doing $1_A$ is the same as just doing $f$. Composing by $1_A$ is just like multiplying by $1$.

And notice we also have $1_B: B \to B$. And since we still have $f: A \to B$, we again have a rule of composition saying that we have a morphism going from $A$ to $B$ that is equal to $1_B \circ f$.

What is this morphism? Again, $1_B$ does nothing, meaning that $1_B \circ f = f$.
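To make the “does nothing” conditions tangible, here is a minimal Python sketch that spot-checks them for ordinary functions: composing a morphism with an identity on either side should give back the same morphism. The names `compose`, `identity`, and `f` are illustrative assumptions, and checking a handful of sample inputs is evidence, not a proof. Because Python functions are untyped, one `identity` function plays the role of both identities here.

```python
def compose(g, f):
    """Return "g following f": apply f first, then g."""
    return lambda x: g(f(x))

def identity(x):
    # stands in for the identity morphism on any object
    return x

def f(n):
    # a hypothetical morphism from integers to strings
    return str(n)

samples = [0, 1, 42, -7]

# f following the identity on the domain behaves like f
right_unit = all(compose(f, identity)(n) == f(n) for n in samples)
# the identity on the codomain following f behaves like f
left_unit = all(compose(identity, f)(n) == f(n) for n in samples)

print(right_unit, left_unit)  # True True
```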

Ba­si­cally, if you give a nut to a bird, and the nut is the nut, that’s the same as just giv­ing a nut to a bird. Similarly, if you give a nut to a bird, and the bird is the bird, that’s just the same as giv­ing a nut to a bird. Brilli­ant, I know, but some­times math is about piling up triv­ial­ities un­til some­thing profound emerges.

There’s one last rule we’d ex­pect any model to obey. Let’s take again the se­quence of events where Alice pushes Bob, and Bob stum­bles into Charles. Tem­po­rally, Alice pushed Bob be­fore Bob ran into Charles. But log­i­cally, we should be able to study the events out of or­der: we should be able to study Bob’s mo­tion into Charles, and then look back­wards in time to study Alice push­ing Bob. (Just like how we can study Alice push­ing Bob be­fore we look back­wards in time to see how the Big Bang pushed Alice into push­ing Bob.)

This ability to study the events out of order, as long as we ultimately put things together in the right order, is called associativity. You might be familiar with this in terms of the associativity of multiplication, for example. Because $(a \times b) \times c = a \times (b \times c)$, we say that multiplication is associative.

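Let’s add David into the equation, so that we have a path like $A \xrightarrow{f} B \xrightarrow{g} C \xrightarrow{k} D$. We have a new composition $k \circ g \circ f$. Associativity would tell us that $(k \circ g) \circ f = k \circ (g \circ f)$.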

Indeed, in any category, the rule of composition must be associative. This should be interpreted as saying that we can study any individual interaction out of order, as long as we don’t forget what the order actually was! (Why “study?” Because we’re modeling models, and models are for studying reality.)

You could think of as­so­ci­a­tivity this way: Say you’re plan­ning a trip from North Amer­ica to Europe to Asia. As­so­ci­a­tivity says that you can think about the trip from Europe to Asia be­fore you think about the trip from North Amer­ica to Europe so long as you re­mem­ber to leave from North Amer­ica when it’s time to take the trip!
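The trip-planning picture can be sketched in Python with three chained functions. Everything named here (`compose`, `f`, `g`, `k`) is a hypothetical stand-in for the three legs of a path; the point is only that grouping the compositions differently gives the same end-to-end result.

```python
def compose(g, f):
    """Return "g following f": apply f first, then g."""
    return lambda x: g(f(x))

def f(n):
    # first leg of the path
    return n + 1

def g(n):
    # second leg of the path
    return n * 10

def k(n):
    # third leg of the path
    return f"<{n}>"

left = compose(compose(k, g), f)   # (k following g) following f
right = compose(k, compose(g, f))  # k following (g following f)

print(left(2), right(2))  # <30> <30>
```

The order of the legs never changes; only the grouping of the parentheses does.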

And...we’re done. We have the rules we’d ex­pect any model to obey. And, not co­in­ci­den­tally, we now have the rules we’d ex­pect any model of mod­els to obey—the rules that define a cat­e­gory.

Defi­ni­tion: a cat­e­gory is a math­e­mat­i­cal struc­ture con­sist­ing of

i) a bunch of objects written $A, B, C,$ etc.

ii) a bunch of morphisms written $f, g, h,$ etc., which obey some rules.

Rule 1: Every morphism in the category $\mathcal{C}$ goes “from” an object in $\mathcal{C}$ “to” an object in $\mathcal{C}$. The “from” object is called the domain and the “to” object is called the codomain. (This should sound familiar from learning about functions. Functions are just a type of morphism; functions have a domain and codomain because all morphisms do.) For example, if you have a morphism $f: A \to B$, then $A$ is the domain of $f$ and $B$ is the codomain of $f$.

Rule 2: The category must have a rule of composition such that, if the domain of one morphism in $\mathcal{C}$ is the codomain of another morphism in $\mathcal{C}$, then there is a way of composing the two morphisms to get a single morphism extending from the domain of the latter morphism to the codomain of the former morphism such that this new “composite” morphism is equal to the composition of the former morphism following the latter morphism. E.g., if you have morphisms $f: A \to B$ and $g: B \to C$, then you necessarily have $h: A \to C$ such that $h = g \circ f$.

Rule 3: Every object in $\mathcal{C}$ has an identity morphism going from itself to itself that does nothing. For an arbitrary object $A$ in $\mathcal{C}$, the identity morphism is written $1_A$. The identity morphism “does nothing” in the sense that composing it with other morphisms is the same as just doing the other morphism.

Rule 4: Composition is associative. Associativity means that if you have a chain of compositions $k \circ g \circ f$, then it is always true that $(k \circ g) \circ f = k \circ (g \circ f)$.
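The definition can be sketched in code. The following Python example is one possible encoding, not a canonical one: the `Morphism` class, the string labels used for objects, and the helper names are all assumptions made for illustration. It tags each function with a domain and codomain (Rule 1), refuses to compose morphisms that do not meet (Rule 2), and builds identities on demand (Rule 3). Associativity (Rule 4) comes for free from how function application nests.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Morphism:
    dom: str       # Rule 1: the "from" object
    cod: str       # Rule 1: the "to" object
    fn: Callable   # the underlying Python function

    def __call__(self, x):
        return self.fn(x)

def identity(obj):
    """Rule 3: the morphism on obj that does nothing."""
    return Morphism(obj, obj, lambda x: x)

def compose(g, f):
    """Rule 2: g following f, defined only when cod(f) == dom(g)."""
    if f.cod != g.dom:
        raise ValueError(f"cannot compose: {f.cod} != {g.dom}")
    return Morphism(f.dom, g.cod, lambda x: g(f(x)))

f = Morphism("int", "str", str)   # write a number in decimal
g = Morphism("str", "int", len)   # count characters
h = compose(g, f)                 # int -> int: number of characters written

print(h.dom, h.cod, h(12345))     # int int 5
```

Trying `compose(f, f)` raises a `ValueError`, because the codomain of `f` (`"str"`) is not its domain (`"int"`); composition is only defined when the morphisms line up.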

Why are there so many rules for mor­phisms, and none re­ally for ob­jects? We’ll ex­plore in a few posts from now the idea that cat­e­gories are all about the mor­phisms. Ba­si­cally, we’ll ex­plore the idea that an ob­ject is to­tally defined by what it does and what is done to it—i.e., all the mor­phisms it is a part of.

Com­ing up next are ex­am­ples of cat­e­gories. This next post should hope­fully both clar­ify the claim that cat­e­gory the­ory gen­er­al­izes differ­ent fields of math­e­mat­ics and also help make the con­cept of a cat­e­gory much more con­crete. After­ward, we’ll talk about a very im­por­tant type of cat­e­gory, the cat­e­gory of sets and func­tions, and then a se­ries of posts on both con­cep­tual mat­ters and en­hanc­ing our math­e­mat­i­cal un­der­stand­ing of cat­e­gories.

• I’m re­ally not con­vinced by this fram­ing in terms of “ob­jects do­ing things to other ob­jects”.

Let’s take a typical example of a morphism: let’s say $f: \mathbb{Z}^+ \to \mathbb{R}$ (note for non-mathematicians: that is, $f$ is a function that takes a positive integer and gives you a real number) given by $f(n) = \sqrt{n}$. How is it helpful to think about this as $\mathbb{Z}^+$ doing something to $\mathbb{R}$? How is it even slightly like “Alice pushes Bob”? You say “Every model is ultimately found in how one object changes another object.” Are you saying here that the integers change the real numbers? Or vice versa? (After that’s done, what have the integers or the real numbers become?)

The only thing here that looks to me like something changing something else is that $f$ (the morphism, not either of the objects) kinda-sorta “changes” an individual positive integer to which it’s applied (an element of one of the objects, again not either of the objects) by replacing it with its square root.

But even that much isn’t true for many morphisms, because they aren’t all functions and the objects of a category don’t always have elements to “change”. For instance, there’s a category whose objects are the positive integers and which has a single morphism from $m$ to $n$ if and only if $m \le n$; when we observe that $5 \le 9$, is 5 changing 9? Or 9 changing 5? No, nothing is changing anything else here.
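For readers who want to see such a category concretely, here is a rough Python sketch. It assumes the ordering in question is “less than or equal to,” and it represents the unique morphism between two integers as a plain pair; the function names are hypothetical. Note that no function is ever applied to anything, so nothing here “changes” a number.

```python
def arrow(m, n):
    """The unique morphism m -> n, which exists only when m <= n."""
    if m > n:
        raise ValueError(f"no morphism from {m} to {n}")
    return (m, n)

def identity(n):
    # n <= n always holds, so every object has an identity
    return arrow(n, n)

def compose(g, f):
    """g following f: (m -> n) and (n -> p) give (m -> p)."""
    if f[1] != g[0]:
        raise ValueError("codomain of f must equal domain of g")
    return arrow(f[0], g[1])

print(compose(arrow(9, 12), arrow(5, 9)))  # (5, 12)
```

Composition here is just transitivity of the ordering, and the identity is reflexivity.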

So far as I can see, the only ac­tual anal­ogy here is with the bare syn­tac­tic struc­ture: you can take “A pushes B” and “A has a mor­phism f to B” and match the pieces up. But the match isn’t very good—the sec­ond of those is a re­ally un­nat­u­ral way of writ­ing it, and re­ally you’d say “f is a mor­phism from A to B”, and the things you can do with mor­phisms and the things you can do with sen­tences don’t have much to do with one an­other. (You can say “A pushes B with a stick”, and “A will push B”, and so forth, and there are no ob­vi­ous cat­e­gory-the­o­retic analogues of these; there’s noth­ing gram­mat­i­cal that re­ally cor­re­sponds to com­po­si­tion of mor­phisms; if A pushes B and B eats C, there re­ally isn’t any way other than that to de­scribe the re­la­tion­ship be­tween A and C, and in­deed most of us wouldn’t con­sider there to be any re­la­tion­ship worth men­tion­ing be­tween A and C in this situ­a­tion.)

• I think it should be pos­si­ble to em­bed images in the post in­stead of just link­ing to them.

• It’s to­tally pos­si­ble. In the post-ed­i­tor, just se­lect some empty space and press the “image” but­ton in the toolbar. Or use mark­down syn­tax.

Happy to edit the above post to have images in the post, as op­posed to just links to images.

• This continues to be a slyly gentle series that has you into something before you know it. Well done!

As a side note, maybe you or the ad­mins can set these posts up as a se­quence so they are linked to­gether.

• Thank you for the pos­i­tive feed­back. (A very un­der­rated thing in terms of en­courag­ing free con­tent pro­duc­tion.) I can go back to each post and add a link to the next one. I am con­cerned that I may want to add, re­ar­range, or even delete in­di­vi­d­ual posts at some point, but I sup­pose that’s no rea­son not to add in the links right now for con­ve­nience’s sake.

• Thanks for this se­quence.

• Con­cep­tual ques­tion:

In real life, i.e., when deal­ing with the phys­i­cal world, there are usu­ally many ways to gen­er­al­ize any given thing or phe­nomenon.

For ex­am­ple, a tomato is a fruit, but it’s also a veg­etable; that is, it be­longs to a botan­i­cal group­ing, but also to a culi­nary group­ing. Nei­ther clas­sifi­ca­tion is more ‘real’ or ‘true’ than the other[1]; and in­deed there are many other pos­si­ble cat­e­gories within which we can put toma­toes (red things, throw­able things, round things, soft things, etc.).

Is this also the case in category theory? That is: for anything which we might be tempted to generalize with the aid of category theory, are there multiple ways to generalize it, dictated only by convenience and preference? Or is there necessarily some single canonical generalization for any given mathematical… thing? If the former: how and by what criteria are generalizations selected? If the latter: what pitfalls does this create when using real-world-based analogies to understand category theory?

1. Re­call that tax­o­nomic clas­sifi­ca­tions aren’t writ­ten in the heav­ens some­where, but are merely a use­ful way for hu­mans to clas­sify or­ganisms (namely, by putting them into groups ar­ranged by com­mon de­scent). This is use­ful for var­i­ous rea­sons, but by no means un­am­bigu­ous or nec­es­sary, nor dic­tated by re­al­ity—as “in truth there are only atoms and the void.” ↩︎

• Math cer­tainly has am­bigu­ous gen­er­al­iza­tions. As the image hints, these are also stud­ied in cat­e­gory the­ory. Usu­ally, when you must se­lect one, the one of in­ter­est is the least gen­eral one that holds for each of your ob­jects of study. In the image, this is always unique. I’m guess­ing that’s why bi­cen­tric has a name. I’ll pass on the ques­tion of how of­ten this turns out unique in gen­eral.

• One of the rea­sons for my own in­ter­est in cat­e­gory the­ory is my in­ter­est in the ques­tion you raise. I’m hop­ing that we’ll ex­plore the idea that uni­ver­sal prop­er­ties offer an “ob­jec­tive” way of defin­ing “sub­jec­tive” cat­e­gories.

Maybe a more di­rect an­swer is that in the very next post in the se­ries, we’ll see that sets can be con­sid­ered the ob­jects of the cat­e­gory of sets and func­tions, and also the ob­jects of the cat­e­gory of sets and bi­nary re­la­tions. Func­tions are bi­nary re­la­tions, so that’s not a perfect an­swer, but yes, you can think of an in­di­vi­d­ual cat­e­gory as a con­text of sorts through which you view the ob­jects, like how you can view a tomato as a fruit or veg­etable de­pend­ing on the con­text.

• What you learn to do is take a bunch of nouns—1, 2, 3, etc.—and a bunch of verbs—ad­di­tion, sub­trac­tion—and make sen­tences. “1 + 2 = 3.”

I still have no idea how to ex­press this in a pic­ture of ob­jects and ar­rows. I sup­pose that 1, 2, and 3 are ob­jects. Is the ad­di­tion an ar­row? But an ar­row has only one start and one end...

More meta: You have already pro­vided the read­ers “mo­ti­va­tion” in the two in­tro­duc­tory ar­ti­cles. It is not nec­es­sary to add more hype in each ar­ti­cle. Yes, I already heard that you can do ev­ery­thing in cat­e­gory the­ory, and I am will­ing to sus­pend dis­be­lief. Now I am cu­ri­ous how speci­fi­cally it can be done.

• It’s pos­si­ble to con­struct a cat­e­gory where num­bers are ob­jects and where the ar­rows are “plus zero” (iden­tity), “plus one”, “plus two” and so on. (“Num­bers” here might look like it stands in for “nat­u­ral num­bers”. But ac­tu­ally, as de­scribed, it would work just as well with “real num­bers”, “com­plex num­bers”, “in­te­gers greater than three”, “num­bers whose frac­tional part is the same as the frac­tional part of e to five dec­i­mal places”… for­mally, any set which is “closed un­der ad­di­tion of nat­u­ral num­bers”. Un­less you pick a differ­ent way to op­er­a­tional­ize “and so on”.)

Then the objects in “1 + 2 = 3” are one and three, and the arrow is “plus two”.

(If you picked “num­bers” above to be “nat­u­ral num­bers”, then there’s a one-to-one cor­re­spon­dence be­tween ob­jects and “ar­rows from this ob­ject”, for any ob­ject. But I’m not sure if that’s im­por­tant.)

More nor­mally, “the set of num­bers” would be an ob­ject all by it­self, and the ar­rows would be the same as above, but all point­ing from this one ob­ject to it­self.
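A minimal Python sketch of this one-object version, under the assumption that each arrow “plus n” is modeled as a function on numbers (the names `plus` and `compose` are illustrative): composing arrows corresponds to adding their labels, and “plus zero” acts as the identity.

```python
def plus(n):
    """The arrow "plus n" from the set of numbers to itself."""
    return lambda x: x + n

def compose(g, f):
    """Return "g following f": apply f first, then g."""
    return lambda x: g(f(x))

plus_zero = plus(0)                     # the identity arrow
plus_three = compose(plus(2), plus(1))  # behaves like plus(3)

print(plus(2)(1))      # 3, i.e. "1 + 2 = 3"
print(plus_three(1))   # 4
```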

Nei­ther of these sounds like what OP was try­ing to de­scribe, but I don’t have an an­swer that does.

• But then there would be no obvious connection between the number “two” and the arrow “plus two”. Also, no obvious connection between the “plus two” arrow going from 1 to 3, and the “plus two” arrow going from 6 to 8. That feels like we can make a diagram that somehow represents the addition of integers, but we can’t derive new insights about addition from looking at the diagram, because most information is lost in the translation.

I guess what I meant was: I have no idea how to ex­press 1+2=3 in a use­ful pic­ture of ob­jects and ar­rows.

• Knowing Haskell, I think the pattern to turn multi-party relations into two-place relations is R(a,b,c,d,e,f,g) → R(S(b,c,d,e,f,g)) → R(S(T(d,e,f,g))) … R(S(T(U(V(X(Z(g)))))))

The connection between “+2” and 2 would then be a function of +(2)=“+2”. You might also need =(3)=“=3” and then you can have =3(+2(2)) = “2+2=3” and maybe a T?(“2+2=3”)=False. In another style you would set it up so that only true equations could be derived. Then one of the findings would be that any instance of +2(2) could be replaced with 4 and the mappings would still hold (at least on the T? level). Mind you, “2+2” could be a different object from “4”.
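One way to read this suggestion is as currying: a two-argument operation is turned into a function that takes one argument and returns another one-argument function. A small Python sketch under that reading, with hypothetical names `add` and `equals`:

```python
def add(a):
    """Curried addition: add(2) is the one-argument arrow "+2"."""
    return lambda b: a + b

def equals(c):
    """Curried equality test: equals(3) is the arrow "=3"."""
    return lambda x: x == c

plus_two = add(2)

print(equals(3)(plus_two(2)))  # False, since 2 + 2 is 4
print(equals(4)(plus_two(2)))  # True
```

In this encoding the connection between the number 2 and the arrow “+2” is the function `add` itself.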

• For ex­am­ple, say you want to grow new kinds of fruit that have never ex­isted. Hav­ing a con­cept of fruit is nec­es­sary to con­ceiv­ing of that idea. Life’s not go­ing to give you ex­am­ples of fruit that have never ex­isted! You have to ex­plore the con­cep­tual space of all fruit.

It’s not, ac­tu­ally. See this old com­ment of mine:

Note that un­der this in­ter­pre­ta­tion, no “gen­eral” or “ex­tended” ver­sion of the con­cept is ever cre­ated (the tem­plate is anony­mous, and is dis­carded as soon as it “goes out of scope”—which is to say, as soon as it has been used to cre­ate the new con­cept). There is thus no need to ask the ques­tions of what this new, “gen­eral”/​“ex­tended” con­cept means, to what else it may or may not ap­ply, how to differ­en­ti­ate be­tween uses of it and any spe­cific ver­sion, etc.

• Not ev­ery way to model re­al­ity defines iden­tity and com­po­si­tion. You can start with a cat­e­gory-with­out-those G (a quiver) and end up at a cat­e­gory C by defin­ing C-ar­rows as chains of G-ar­rows (the quiver’s free cat­e­gory), but it doesn’t seem nec­es­sary or a pri­ori likely to give new in­sights. Can you jus­tify this rules choice?
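To make the quiver-to-category step concrete, here is a minimal sketch of the free category, assuming a toy quiver with nodes A, B, C and two edges. `Edge`, `Path`, `identity`, `from_edge`, and `compose` are illustrative names, not anything from a library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Edge:
    """One arrow of the quiver G: no identities, no composition yet."""
    name: str
    src: str
    dst: str


@dataclass(frozen=True)
class Path:
    """A C-arrow: a chain of G-arrows, stored in the order they are followed."""
    src: str
    dst: str
    edges: Tuple[Edge, ...] = ()


def identity(node: str) -> Path:
    # The identity on a node is the empty chain at that node.
    return Path(node, node, ())


def from_edge(e: Edge) -> Path:
    # Every quiver edge is a chain of length one.
    return Path(e.src, e.dst, (e,))


def compose(g: Path, f: Path) -> Path:
    """'g after f': concatenate chains, defined only when f ends where g starts."""
    if f.dst != g.src:
        raise ValueError("paths are not composable")
    return Path(f.src, g.dst, f.edges + g.edges)


f = from_edge(Edge("f", "A", "B"))
g = from_edge(Edge("g", "B", "C"))
gf = compose(g, f)
```

Identity and associativity come for free from tuple concatenation, which is the sense in which the construction adds nothing beyond the quiver itself.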

• Hon­estly my real jus­tifi­ca­tion would be “ad­joint func­tors awe­some, and you need cat­e­gories to do ad­joint func­tors, so use cat­e­gories.” More broadly...as long as it’s free to cre­ate a cat­e­gory out of what­ever you’re study­ing, there’s clearly no harm. The ques­tion is whether any­thing’s lost by treat­ing the sub­ject as a cat­e­gory, and while I fully ex­pect that there are en­tire uni­verses of math­e­mat­ics and re­al­ity out there where cat­e­gories are harm­ful, I don’t think we live in one like that. Cat­e­gories may not cap­ture ev­ery­thing you can think of, but they can cap­ture so much that I’d be stunned if they didn’t yield amaz­ing fruit even­tu­ally. I’d ac­knowl­edge that novel, ground­break­ing the­o­rems are still forth­com­ing.

• Let’s take a some­what-con­crete ex­am­ple. Your post men­tions birds. OK, so let’s con­sider e.g. a model of birds fly­ing in a flock, how they po­si­tion them­selves rel­a­tive to one an­other, and so on. You sug­gest that we con­sider the birds as ob­jects: so far, so good. And then you say “they do stuff like fly, tweet, lay eggs, eat, etc. I.e., verbs (mor­phisms).” For the pur­pose of a flock­ing model, the most rele­vant one of those is fly­ing. How are you go­ing to con­sider fly­ing as a mor­phism in a cat­e­gory of birds? If A and B are birds, what is this mor­phism from A to B that rep­re­sents fly­ing? I’m not see­ing how that could work.

In the con­text of a flock­ing model, there are some things in­volv­ing two birds. E.g., one bird might be fol­low­ing an­other, tend­ing to fly to­ward it. Or it might be stay­ing away from an­other, not get­ting too close. Ob­vi­ously you can com­pose these re­la­tions if you want. (You can com­pose any re­la­tions whose types are com­pat­i­ble.) But it’s not ob­vi­ous to me that e.g. “fol­low­ing a bird that stays away from an­other bird” is ac­tu­ally a use­ful no­tion in mod­el­ling flocks of birds. It might turn out to be, but I would ex­pect a num­ber of other no­tions to be more use­ful: you might be in­ter­ested in some sort of cen­tre of mass of a whole flock, or the den­sity of birds in the flock; you might want to con­sider some­thing like a ve­loc­ity field of which the in­di­vi­d­ual birds’ ve­loc­i­ties are sam­ples; etc. None of these things feel very cat­e­gor­i­cal to me (though of course e.g. ve­loc­i­ties live in a vec­tor space and there is a cat­e­gory of vec­tor spaces).
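For concreteness, composing two such relations looks something like the sketch below. The bird names and the two relations are invented for illustration, and `compose` is ordinary relational composition, nothing specific to flocking.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever a r b and b s c for some b."""
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}


# Hypothetical flock: who follows whom, and who stays away from whom.
follows = {("ann", "bob"), ("cat", "bob")}
avoids = {("bob", "dee")}

# "x follows a bird that stays away from z"
follows_an_avoider_of = compose(follows, avoids)
```

The composite is perfectly well defined; the open question is whether it tells you anything about the flock.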

Maybe flock­ing was a bad choice of ex­am­ple. Let’s try an­other: let the birds be hens on a farm, kept for breed­ing and/​or egg-lay­ing. We might want to un­der­stand how much space to give them, what to feed them, when to col­lect their eggs, whether and when to kill them, and so on. Maybe we’re in­ter­ested in op­ti­miz­ing taste or profit or chicken-hap­piness or some com­bi­na­tion of those. So, ac­cord­ing to your origi­nal com­ment, the birds are again ob­jects in a cat­e­gory, and now when they “lay eggs, etc., etc.” these are mor­phisms. What mor­phisms? When a bird lays an egg, what are the two ob­jects the mor­phism goes be­tween? When are we go­ing to com­pose these mor­phisms and what good will it do us?

How does it ac­tu­ally help any­thing to con­sider birds as ob­jects of a cat­e­gory?

Here’s the best I can do. We take the birds, and their eggs, and what­ever else, as ob­jects in a cat­e­gory, and we some­how cook up some mor­phisms re­lat­ing them. The cat­e­gory will be bizarre and jury-rigged be­cause none of the things we care about are re­ally very cat­e­gor­i­cal, but its struc­ture will some­how cor­re­spond to some of the things about the birds that we care about. And then we make what­ever sort of math­e­mat­i­cal or com­pu­ta­tional model of the birds we would have made with­out cat­e­gory the­ory. So now in­stead of birds and eggs we have tu­ples (po­si­tion, ve­loc­ity, num­ber of eggs sat on) or ob­jects of C++ classes or some­thing. Now since we’ve de­signed our math­e­mat­i­cal model to match up, kinda, to what the birds ac­tu­ally do, maybe we can find a mor­phism be­tween these two jury-rigged cat­e­gories cor­re­spond­ing to “mak­ing a math­e­mat­i­cal model of”. And then maybe there’s some cat­e­gory-the­o­retic thing we can do with this model and other math­e­mat­i­cal mod­els of birds, or some­thing. But I gravely doubt that any of this will ac­tu­ally de­liver any in­sight that we didn’t our­selves put into it. I’d be in­trigued to be proved wrong.

• That a con­struc­tion is free doesn’t mean that you lose noth­ing. It means that if you’re go­ing to do some con­struc­tion any­way, you might as well use the free one, be­cause the free one can get to any other. (At­tain­able util­ity any­one?)

Show­ing that your con­struc­tion is free means that all you need to show as worth­while is con­struct­ing any cat­e­gory from our quiver. Ad­junc­tions are a fine rea­son, though I wish we could in­tro­duce ad­junc­tions first and then show that we need cat­e­gories to get them.