Categories: models of models

Let me clar­ify what I mean when I say that math con­sists of nouns and verbs. Think about el­e­men­tary school math­e­mat­ics like ad­di­tion and sub­trac­tion. What you learn to do is take a bunch of nouns—1, 2, 3, etc.—and a bunch of verbs—ad­di­tion, sub­trac­tion—and make sen­tences. “1 + 2 = 3.”

When you make a sentence like that, what you’re doing is taking an object, 1, and observing how it changes when it interacts (specifically, adds) with another object, 2. You observe that it becomes a 3. Just like how you can observe a person (object) bump their head (interaction) into a wall (other object) and the corresponding change (equals sign): the bluish bump emerging on their forehead.

Well, it turns out that no mat­ter how far you go in math, you’re still tak­ing ob­jects and ob­serv­ing how they change when they in­ter­act with other ob­jects. You learn about other kinds of num­bers like 2.5 and 3.333 re­peat­ing, and other kinds of in­ter­ac­tions like mul­ti­pli­ca­tion and di­vi­sion.

Even­tu­ally things start get­ting more ab­stract. You learn about ma­tri­ces and the kinds of in­ter­ac­tions they can have. You learn about sets and func­tions in a deep way. Com­plex num­bers, topolo­gies, etc.

But you never get away from nouns and verbs.

What are we go­ing to do with all of these nouns and verbs? Well, at some point, when you have a bunch of things that feel similar in an im­por­tant way, you of­ten want to take a step back, lump all of the in­di­vi­d­ual things to­gether, and talk about them in gen­eral. Why do we have the con­cept of fruit? Be­cause we live in a world where ap­ples, or­anges, lemons, pa­payas, etc., ex­ist.

Why do we have the con­cept of num­bers? Be­cause we have a bunch of them, not just 1, 2, and 3, but also 2.5, 3.333 re­peat­ing, ir­ra­tional num­bers, and even com­plex num­bers. Why do we have the con­cept of “op­er­a­tions?” Be­cause we have ad­di­tion, sub­trac­tion, mul­ti­pli­ca­tion, and di­vi­sion. Even though all of these give you very differ­ent an­swers for the same in­put of num­bers, (1 + 2 does not equal 1 − 2 does not equal 1 times 2 does not equal 1 di­vided by 2), they still have some­thing very similar in com­mon, so it’s worth com­ing up with the con­cept of “op­er­a­tion” to study them in­de­pen­dently of their in­di­vi­d­ual char­ac­ter­is­tics. Just like how we can talk about “fruit” with­out hav­ing to refer­ence the shape of an ap­ple.

If you lived in a world with only ap­ples, you wouldn’t have the con­cept of fruit. If you lived in a world where the only kind of thing was a rock, you wouldn’t have the con­cept of nouns. If you lived in a world where the only kind of math-thing was num­bers, cat­e­gory the­o­rists wouldn’t have come up with the con­cept of ob­jects. (They wouldn’t have come up with cat­e­gory the­ory at all!)

And what is the use of gen­er­al­iza­tion? A bird’s-eye view of some­thing lets you see things from new per­spec­tives, po­ten­tially chang­ing your con­cept of the thing en­tirely. When you think of num­bers as var­i­ous ways you can count stuff on your fingers, you can inch your way past the nat­u­ral num­bers to nega­tives, frac­tions and ir­ra­tional num­bers. But com­plex num­bers will throw you for a to­tal loop—they just don’t seem to be re­lat­able to “amounts of things” in a way that you’re ever likely to see in the world.

In fact, com­plex num­bers made no sense to me un­til I learned what num­bers ac­tu­ally are, at which point ev­ery­thing clicked into place. The gen­er­al­iza­tion helped me un­der­stand the spe­cific case—com­plex num­bers are just an­other type of num­ber, like how ap­ples are just an­other type of fruit.

Gen­er­al­iza­tion is helpful when life stops throw­ing con­ve­nient ex­am­ples at you.

For ex­am­ple, say you want to grow new kinds of fruit that have never ex­isted. Hav­ing a con­cept of fruit is nec­es­sary to con­ceiv­ing of that idea. Life’s not go­ing to give you ex­am­ples of fruit that have never ex­isted! You have to ex­plore the con­cep­tual space of all fruit.

(In fact, this ex­pe­rience with com­plex num­bers, years ago, prob­a­bly in­spired these posts. I got the idea that if you just both­ered to take a re­ally long time to ex­plain the math­e­mat­ics, the av­er­age per­son could prob­a­bly grasp quan­tum physics. This se­ries is the cat­e­gory-the­ory ver­sion of that origi­nal idea.)

Cat­e­gory the­ory ex­ists for the same rea­son the con­cept of fruit does: there are lots of in­di­vi­d­ual things that have cer­tain com­mon­al­ities that make gen­er­al­iza­tion an ap­peal­ing idea. Ap­ples and or­anges are very differ­ent on the sur­face and very similar deep down. Nat­u­ral num­bers and com­plex num­bers are very differ­ent on the sur­face and very similar deep down.

Category theory goes one step wider. It emerges when people look at entire fields of math like “algebra” and “topology” and notice that, while they’re very different on the surface, they seem to be very similar deep down. Many other fields of mathematics seemed to also share these deep similarities, and so gradually they all became mere examples of the generalization, that generalization being a category. (A category being something we’ll define by the end of this post.) Just like how apples and oranges become mere examples of the generalization that is “fruit.”

And one of those com­mon­al­ities is that all of these su­perfi­cially dis­parate fields of math­e­mat­ics study things and in­ter­ac­tions be­tween those things. I.e., they study ob­jects and mor­phisms.

That might sound re­ally gen­eral. And it is! And yet, just like with the gen­eral defi­ni­tion of fields that lets us un­der­stand com­plex num­bers, we can learn re­ally in­ter­est­ing things from this su­per-gen­eral per­spec­tive. (Speci­fi­cally, the Yoneda lemma and ad­junc­tion.)

But right now you are hav­ing to take my word for the idea that many differ­ent fields of math can be thought of as study­ing “nouns and verbs.” So let’s look at things from a differ­ent per­spec­tive.

Even if you don’t know higher maths, you prob­a­bly know things like, “pour­ing milk in my ce­real will make my ce­real soggy.”

So “milk + ce­real = soggy ce­real.” Seems awfully...math­e­mat­i­cal.

Why does math ex­ist, any­way? Well, there’s lots of ways to an­swer that ques­tion, so let’s rephrase it: why do math­e­mat­i­ci­ans ex­ist? Or even bet­ter, why do math­e­mat­i­ci­ans get paid? It cer­tainly isn’t for the joy of do­ing math. In­stead, “math­e­mat­i­cian” is a job that you can ac­tu­ally get paid to do be­cause math is very use­ful for mod­el­ing our re­al­ity.

So why does math boil down to ob­jects and mor­phisms so of­ten? Prob­a­bly for the same rea­son English boils down to nouns and verbs: we use lan­guage to dis­cuss re­al­ity, and re­al­ity seems to boil down to nouns and verbs.

Take birds, for ex­am­ple. They are birds, so they’re nouns (ob­jects). And they do stuff like fly, tweet, lay eggs, eat, etc. I.e., verbs (mor­phisms).

What­ever you may or may not know of maths, you definitely know a thing or two about re­al­ity. You’ve been liv­ing in it your whole life.

The rest of this post will take com­mon sense ideas about cre­at­ing mod­els of our re­al­ity, break them down into their most ab­stract com­po­nents, and end up with some sim­ple rules that any model should fol­low. We’ll dis­cover that those com­po­nents and rules are ex­actly the com­po­nents and rules that define a cat­e­gory.


You know a lot of mod­els, even if most of them would never make it into a text­book. You use these mod­els to nav­i­gate re­al­ity.

For ex­am­ple, a sen­tence like “Alice pushes Bob” is a model of re­al­ity. By them­selves, those let­ters in that or­der are mean­ingless. But be­cause you can use them to make pre­dic­tions (speci­fi­cally, you’re in­fer­ring the speaker’s state of mind), the sen­tence is cor­re­spond­ingly a model of (that par­tic­u­lar part of) re­al­ity.

Sen­tences them­selves ex­ist, so we can model them too. You can think of a sen­tence struc­ture as a way of mod­el­ing a sen­tence. As for mod­els them­selves, a model is a way of ab­stract­ing away from some de­tails of what­ever you’re study­ing so that other fea­tures be­come more salient. For ex­am­ple, you can model a spe­cific cat, Mr. Peanuts, as just a cat, ne­glect­ing spe­cific fea­tures like his name, his color, etc., so that other, more gen­eral fea­tures of cats, like the ten­dency to meow, eat tuna fish, and ut­terly de­spise hu­mans be­come more salient.

That is to say, what will Mr. Peanuts do in [spe­cific situ­a­tion]? Hard to say un­less you know him per­son­ally. But if we ask, “What will a cat do in [spe­cific situ­a­tion]?” you might have a de­cent guess. The cat’s not go­ing to start bark­ing, for one.

You have a model of cats in­side your head, and this lets you make pre­dic­tions, at the ex­pense of speci­fic­ity.

Sen­tence struc­ture mod­els spe­cific sen­tences like “Alice pushes Bob” in terms of their ab­stract com­po­nents: “Noun verbs noun.”

You might be sur­prised to learn that you can make pre­dic­tions with sen­tence struc­ture. For ex­am­ple, one of the rules of gram­mar is that a prepo­si­tional phrase ends with a noun. So let’s say I ask you, “Who will the soc­cer player kick the ball to?” You don’t know the an­swer—I haven’t even given you a list of peo­ple to choose from.

Let’s also say that you don’t know what the word “who” means, so you don’t even know what I’m ask­ing. In fact, let’s say you don’t know what any of the words in the sen­tence means, only their parts of speech. You cer­tainly aren’t go­ing to give the spe­cific an­swer I’m look­ing for.

But you do know sen­tence struc­ture! You know that “to” is a prepo­si­tion, so the phrase it opens up must end with a noun.

So who is the soc­cer player go­ing to kick the ball to? An­swer: a per­son, place, or thing—a noun.

This is not a brilli­ant an­swer. But it is an an­swer—a cor­rect one, in fact! This su­per ab­stract model of sen­tences let you figure things out even when you didn’t know what any of the spe­cific words meant.

So sen­tences are mod­els, and sen­tence struc­ture is a model of a model, and mod­els are re­ally use­ful: they let you pre­dict things.

But sen­tence struc­ture is a model of just one kind of model—sen­tences them­selves. What if we wanted a model of all kinds of differ­ent mod­els—sen­tences, sen­tence struc­ture, sci­en­tific mod­els, mod­els from sen­sory data, math­e­mat­i­cal mod­els, ev­ery­thing? What if we wanted to model mod­el­ing it­self?

Why, we’d turn to cat­e­gory the­ory.

So what are the gen­eral qual­ities of all mod­els?

First of all, every model has things (objects) and actions that the objects do to other objects (morphisms). For example, you can model a person by seeing them. That is to say, you hit them with some photons and form a picture of them based on the result. Every model is ultimately found in how one object changes another object, hence the term “morphism.”

(Notice how sentences, which themselves boil down to nouns and verbs, can be thought of as models of anything. After all, it’s hard to imagine any kind of phenomenon that can’t be expressed in a sentence, even if you have to make up some new words to do it. Effectively, the reason we use category theory to form models of models of everything instead of just sticking with English is that we can define our category as obeying certain rules or axioms that correspond to how models ought to work, while English can say pretty much anything. The rest of this post discusses these rules.)

So cat­e­gories—mod­els of mod­els—con­sist of ob­jects and mor­phisms. But what are the rules we re­quire our ob­jects and mor­phisms to obey so that they match up with our ideas of how mod­els of re­al­ity ought to work? One is that ev­ery ob­ject ought to be, in some sense, a perfect model of it­self.

Think about what mod­els do: they trans­form how we view ob­jects. If I tell you some­one is a mur­derer, this changes how you view them be­cause you ap­ply a differ­ent model. Loosely, you could say that you ap­ply the “mur­derer” mor­phism to the per­son in ques­tion. Well, ev­ery ob­ject ought to have a model that you can ap­ply that doesn’t trans­form the ob­ject at all—not be­cause you aren’t trans­form­ing, but be­cause there’s no effect, like mul­ti­ply­ing by 1. For ex­am­ple, if you ap­ply the “Mr. Peanuts” mor­phism to Mr. Peanuts, this shouldn’t change any­thing: Mr. Peanuts is already Mr. Peanuts. This mor­phism is, in a cer­tain clear sense, the iden­tity of Mr. Peanuts.

So perhaps unsurprisingly, in category theory, we say that every object has an identity morphism that “does nothing.” For an arbitrary object $A$, its identity morphism is written $1_A$. That’s because it’s like multiplying by 1: you are multiplying, it’s just that nothing changes. (But just like multiplying both sides by 1 helps you solve equations, the identity morphism is extremely useful.)

What does it ac­tu­ally mean, math­e­mat­i­cally, for the iden­tity mor­phism to “do noth­ing?” To an­swer that, let’s look at an­other re­quire­ment we’d ex­pect of a gen­eral model of mod­els: com­po­si­tion­al­ity.

Say Alice pushes Bob, and Bob bumps into Charles. Let’s diagram this as $A \xrightarrow{f} B \xrightarrow{g} C$. Hopefully the interpretation of this diagram is obvious. We could say that Alice affects Bob, and Bob affects Charles. That is to say, Bob’s state is a model of Alice’s action on Bob, and Charles’s state is a model of Bob’s action on Charles. (I know we used a different symbol to stand for push last time, but $f$ and $g$ are the typical “generic morphism” symbols in category theory, and it’s better to get used to using generic symbols than trying to name them after things all the time. Here, $f$ is Alice pushing Bob, and $g$ is Bob bumping into Charles.)

But we also might have an in­tu­ition that Alice af­fects Charles. Sure, she does so in­di­rectly through Bob, but clearly Charles’s state is a model of Alice’s ac­tions—ac­tions which are in some sense ac­tions on Charles.

That is to say, Alice pushed Charles just as clearly as she pushed Bob.

Effect flows through. We’re all just models of the original state and laws of the universe, whatever they were. So whenever we have something like $A \xrightarrow{f} B \xrightarrow{g} C$, we should really expect to also have something like $A \xrightarrow{h} C$. Moreover, this morphism $h$ should be equal to what happened when Alice pushed Bob and then Bob bumped into Charles. After all, that is what Alice did to Charles.

Let’s use the symbol $\circ$ to mean “following.” So if we write $g \circ f$, that reads as “g following f.” (Yes, that means you read it right to left, in effect. Get used to it, every field has at least one counterintuitive piece of notation.) We would then say that $h = g \circ f$. More to the point, the idea is that whenever you have something that looks like $A \xrightarrow{f} B \xrightarrow{g} C$, regardless of what these objects and morphisms are meant to stand in for, then you necessarily have $h: A \to C$ such that $h = g \circ f$.

This is what com­po­si­tion looks like as a di­a­gram. (The back­wards E means “there ex­ists,” and the dot­ted line, re­dun­dantly, means the same thing.)

An example of that you are hopefully familiar with is function composition. (We’ll cover it in another post anyway.) Say you have $c x^2$ for some fixed number $c$. You should know how to evaluate this: first, square the $x$, then multiply it by $c$.

Let’s see how this is exactly like what we did before. You can see it like this: say that $f$ stands for the action of squaring the $x$ (it’s a transformation, a morphism). And the other morphism $g$ stands for multiplying it by $c$.

If you’re given the problem of determining what you have if you do $g$ following $f$, you can write that like $g \circ f$. Plugging in the actual equations, you are told to “multiply by $c$ following squaring $x$.” Which is exactly what you did.

If you were told to determine $f \circ g$ instead, you should be able to see that this tells you to “square following multiplying by $c$.” I.e., $(cx)^2$.
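The difference between the two orders can be checked numerically. Here is a minimal Python sketch; the helper `compose`, the function names, and the multiplier 3 are illustrative assumptions, not something taken from the example above.

```python
def compose(g, f):
    """Return 'g following f': apply f first, then g."""
    return lambda x: g(f(x))


def square(x):
    return x * x


def triple(x):
    # Hypothetical multiplier, chosen only for illustration.
    return 3 * x


triple_after_square = compose(triple, square)  # x -> 3 * x**2
square_after_triple = compose(square, triple)  # x -> (3 * x)**2

print(triple_after_square(2))  # 12
print(square_after_triple(2))  # 36
```

Same two morphisms, different order, different result: the order in which you compose matters.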

The symbol $\circ$ in fact tells you that you are composing one morphism with another (exactly how composition works is defined by the category), and the requirement that $A \xrightarrow{f} B \xrightarrow{g} C$ implies $h: A \to C$ such that $h = g \circ f$ is called compositionality. Every category has a rule of composition.

Com­po­si­tion is the real ac­tion of any cat­e­gory. If you know the rule of com­po­si­tion, you know how the cat­e­gory works in­ter­nally. (Think about the con­cept of the laws of physics, the view of the world as a chain of events such that we can pre­dict our cur­rent state from the Big Bang. Un­der­stand­ing that rule of com­po­si­tion is the the­ory of ev­ery­thing, the holy grail of physics.)

Now that we know what composition is, we can use it to rigorously define how the identity morphism “does nothing.” Suppose you have an object $A$ and the identity morphism $1_A$. Say you also have a morphism $f: A \to B$. Since $B$ is an object, it has an identity morphism as well, $1_B$.

The identity morphism $1_A$ is a morphism “on $A$.” What does that mean? It means that $1_A$ goes from $A$ to $A$. It goes nowhere, in effect, which is just what we’d expect of an identity morphism.

So we have a morphism $1_A: A \to A$. We also have a morphism $f: A \to B$, as stated. Well, look at this! According to the rule of composition, we must have a morphism $h: A \to B$ such that $h = f \circ 1_A$.

But we already had a morphism going from $A$ to $B$: $f$ itself. The identity morphism does nothing, so it shouldn’t have any effect on what we already had. This is precisely defined by the following condition:

$$f \circ 1_A = f$$

That is to say, doing $f$ after doing $1_A$ is the same as just doing $f$. Composing by $1_A$ is just like multiplying by $1$.

And notice we also have $1_B: B \to B$. And since we still have $f: A \to B$, we again have a rule of composition saying that we have a morphism going from $A$ to $B$ that is equal to $1_B \circ f$.

What is this morphism? Again, $1_B$ does nothing, meaning that $1_B \circ f = f$.
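For ordinary functions, both identity conditions can be spot-checked in a few lines. This is only a sketch: `compose`, `identity`, and the example morphism `add_one` are hypothetical names, and because the example goes from integers to integers, a single Python `identity` function plays the role of the identity on both the source and the target.

```python
def compose(g, f):
    """Return 'g following f': apply f first, then g."""
    return lambda x: g(f(x))


def identity(x):
    # The morphism that "does nothing".
    return x


def add_one(x):
    # Hypothetical example morphism from integers to integers.
    return x + 1


samples = range(-3, 4)

# Doing add_one after the identity is the same as just doing add_one.
identity_first = all(compose(add_one, identity)(x) == add_one(x) for x in samples)
# Doing the identity after add_one is also the same as just doing add_one.
identity_last = all(compose(identity, add_one)(x) == add_one(x) for x in samples)

print(identity_first, identity_last)  # True True
```

Checking a handful of inputs is not a proof, but it shows concretely what “does nothing” means.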

Ba­si­cally, if you give a nut to a bird, and the nut is the nut, that’s the same as just giv­ing a nut to a bird. Similarly, if you give a nut to a bird, and the bird is the bird, that’s just the same as giv­ing a nut to a bird. Brilli­ant, I know, but some­times math is about piling up triv­ial­ities un­til some­thing profound emerges.

There’s one last rule we’d ex­pect any model to obey. Let’s take again the se­quence of events where Alice pushes Bob, and Bob stum­bles into Charles. Tem­po­rally, Alice pushed Bob be­fore Bob ran into Charles. But log­i­cally, we should be able to study the events out of or­der: we should be able to study Bob’s mo­tion into Charles, and then look back­wards in time to study Alice push­ing Bob. (Just like how we can study Alice push­ing Bob be­fore we look back­wards in time to see how the Big Bang pushed Alice into push­ing Bob.)

This ability to study the events out of order, as long as we ultimately put things together in the right order, is called associativity. You might be familiar with this in terms of the associativity of multiplication, for example. Because $(a \times b) \times c = a \times (b \times c)$, we say that multiplication is associative.

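Let’s add David into the equation, so that we have a path like $A \xrightarrow{f} B \xrightarrow{g} C \xrightarrow{k} D$. We have a new composition $k \circ g \circ f$. Associativity would tell us that

$$k \circ (g \circ f) = (k \circ g) \circ f$$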

Indeed, in any category, the rule of composition must be associative. This should be interpreted as saying that we can study any individual interaction out of order, as long as we don’t forget what the order actually was! (Why “study?” Because we’re modeling models, and models are for studying reality.)

You could think of as­so­ci­a­tivity this way: Say you’re plan­ning a trip from North Amer­ica to Europe to Asia. As­so­ci­a­tivity says that you can think about the trip from Europe to Asia be­fore you think about the trip from North Amer­ica to Europe so long as you re­mem­ber to leave from North Amer­ica when it’s time to take the trip!
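Associativity of function composition can be spot-checked the same way. In this sketch the three legs are hypothetical functions on integers (they are not meant to model the actual trip); the point is only that grouping the compositions differently gives the same overall result.

```python
def compose(g, f):
    """Return 'g following f': apply f first, then g."""
    return lambda x: g(f(x))


# Three hypothetical morphisms forming a chain: first, then second, then third.
def first(x):
    return x + 1


def second(x):
    return 2 * x


def third(x):
    return x * x


# Group the first two steps together, then add the last one...
grouped_early = compose(third, compose(second, first))
# ...or group the last two steps together, then put the first one in front.
grouped_late = compose(compose(third, second), first)

agree = all(grouped_early(x) == grouped_late(x) for x in range(-5, 6))
print(agree)             # True
print(grouped_early(3))  # 64
```

Note that the order of the steps never changes in either grouping. Only the parentheses move.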

And...we’re done. We have the rules we’d ex­pect any model to obey. And, not co­in­ci­den­tally, we now have the rules we’d ex­pect any model of mod­els to obey—the rules that define a cat­e­gory.

Defi­ni­tion: a cat­e­gory is a math­e­mat­i­cal struc­ture con­sist­ing of

i) a bunch of objects written $A, B, C,$ etc.

ii) a bunch of morphisms written $f, g, h,$ etc., which obey some rules.

Rule 1: Every morphism goes “from” an object in the category “to” an object in the category $\mathcal{C}$. The “from” object is called the domain and the “to” object is called the codomain. (This should sound familiar from learning about functions. Functions are just a type of morphism; functions have a domain and codomain because all morphisms do.) For example, if you have a morphism $f: A \to B$, then $A$ is the domain of $f$ and $B$ is the codomain of $f$.

Rule 2: The category must have a rule of composition such that, if the domain of one morphism in $\mathcal{C}$ is the codomain of another morphism in $\mathcal{C}$, then there is a way of composing the two morphisms to get a single morphism extending from the domain of the latter morphism to the codomain of the former morphism such that this new “composite” morphism is equal to the composition of the former morphism following the latter morphism. E.g., if you have morphisms $f: A \to B$ and $g: B \to C$, then you necessarily have $h: A \to C$ such that $h = g \circ f$.

Rule 3: Every object in $\mathcal{C}$ has an identity morphism going from itself to itself that does nothing. For an arbitrary object $A$ in $\mathcal{C}$, the identity morphism is written $1_A$. The identity morphism “does nothing” in the sense that composing it with other morphisms is the same as just doing the other morphism.

Rule 4: Composition is associative. Associativity means that if you have a chain of compositions $k \circ g \circ f$, then it is always true that

$$k \circ (g \circ f) = (k \circ g) \circ f$$
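To make the four rules concrete, here is a rough Python sketch of one very small category, checked by brute force. Everything in it is an assumption made for illustration: the three objects, the morphism names, and the lookup-based `compose` function are hypothetical and not a standard library API.

```python
from itertools import product

# A tiny hypothetical category: three objects and six morphisms.
objects = ["A", "B", "C"]

# Morphism name -> (domain, codomain).
morphisms = {
    "1_A": ("A", "A"), "1_B": ("B", "B"), "1_C": ("C", "C"),
    "f": ("A", "B"), "g": ("B", "C"), "h": ("A", "C"),
}


def composable(g, f):
    """True when 'g following f' makes sense: f ends where g starts."""
    return morphisms[f][1] == morphisms[g][0]


def compose(g, f):
    """Return the name of the morphism 'g following f'."""
    if not composable(g, f):
        raise ValueError(f"cannot do {g} following {f}")
    if f.startswith("1_"):  # composing with an identity changes nothing
        return g
    if g.startswith("1_"):
        return f
    # The only composable pair of non-identity morphisms here is g following f.
    return {("g", "f"): "h"}[(g, f)]


# Rule 1: every morphism goes from an object to an object of the category.
rule1 = all(d in objects and c in objects for d, c in morphisms.values())

# Rule 2: every composable pair has a composite with the right endpoints.
rule2 = all(
    morphisms[compose(g, f)] == (morphisms[f][0], morphisms[g][1])
    for g, f in product(morphisms, repeat=2) if composable(g, f)
)

# Rule 3: identities do nothing on either side.
rule3 = all(
    compose(m, f"1_{d}") == m and compose(f"1_{c}", m) == m
    for m, (d, c) in morphisms.items()
)

# Rule 4: composition is associative for every composable chain of three.
rule4 = all(
    compose(k, compose(g, f)) == compose(compose(k, g), f)
    for k, g, f in product(morphisms, repeat=3)
    if composable(g, f) and composable(k, g)
)

print(rule1, rule2, rule3, rule4)  # True True True True
```

Notice that the objects are just labels. All the checking is about how the morphisms fit together.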

Why are there so many rules for mor­phisms, and none re­ally for ob­jects? We’ll ex­plore in a few posts from now the idea that cat­e­gories are all about the mor­phisms. Ba­si­cally, we’ll ex­plore the idea that an ob­ject is to­tally defined by what it does and what is done to it—i.e., all the mor­phisms it is a part of.

Com­ing up next are ex­am­ples of cat­e­gories. This next post should hope­fully both clar­ify the claim that cat­e­gory the­ory gen­er­al­izes differ­ent fields of math­e­mat­ics and also help make the con­cept of a cat­e­gory much more con­crete. After­ward, we’ll talk about a very im­por­tant type of cat­e­gory, the cat­e­gory of sets and func­tions, and then a se­ries of posts on both con­cep­tual mat­ters and en­hanc­ing our math­e­mat­i­cal un­der­stand­ing of cat­e­gories.