Sets and Functions

Sets and func­tions are two of the most ba­sic ideas in math­e­mat­ics, and we’ll need to know what they are to dis­cuss some things about cat­e­gories rigor­ously. Nor­mally you’d learn about sets and func­tions way be­fore en­coun­ter­ing cat­e­gory the­ory, but in the spirit of as­sum­ing as lit­tle math as pos­si­ble, we should write this post. It’s also worth ad­dress­ing a few mat­ters for their con­cep­tual rele­vance.

Sets are imag­i­nary bags we put things into. For ex­am­ple, you can take a dog, a cat, and a shoe, put them in an imag­i­nary bag, and now you have a set con­sist­ing of . The mem­bers of the set—dog, cat, and shoe—are called the el­e­ments of the set.

A sub­tle but im­por­tant as­pect of a set is that the imag­i­nary bag has to be defined by a rule. This rule can be pretty much any­thing, like “put into a bag ev­ery­thing I’m point­ing at,” but it does have to be a rule. Typ­i­cally, sets can fit pretty much any­thing in, and so you can of­ten just say “here is my set” rather than hav­ing to be ex­plicit about the rule. We’ll get back to why the rule mat­ters at the end. For now, sets are imag­i­nary bags that you can put pretty much any­thing into.

What are we putting into these bags, ex­actly? Pretty much any­thing, yes—but clearly we aren’t ac­tu­ally putting dogs, cats, and shoes into bags. Math­e­mat­i­cally, what are these things?

That is to say, what’s the differ­ence be­tween the set and the set ?

Well, what’s the differ­ence be­tween the equa­tions and ? Noth­ing but the name of the vari­able—which does not mat­ter at all. We could call any­thing. We could rep­re­sent it with a thirty-foot tall wa­ter­color of a fire truck.

So what’s the differ­ence be­tween the set {dog} and the set {cat}? Only the name of the el­e­ment—which does not mat­ter at all. Just like we can take posets like and and rep­re­sent their com­mon struc­ture ab­stractly as , we can do the same for and with this set: .

The set is what a one-el­e­ment set like {dog} or {cat} re­ally is. There’s no math­e­mat­i­cal mean­ing to that ac­tu­ally makes the el­e­ment of this set a four-legged bark­ing crea­ture. It’s just an el­e­ment that hap­pens to be la­beled “dog.”

So why do we care about sets? Set the­ory is re­ally im­por­tant to math­e­mat­ics, but from a cat­e­gory-the­o­retic per­spec­tive, they ac­tu­ally aren’t very im­por­tant at all. In­stead, sets serve one ma­jor pur­pose: sets let us define func­tions, and func­tions are re­ally, re­ally im­por­tant!

Func­tions are maps be­tween sets that meet a few rules. First of all, let’s just talk about the “maps” part of that. Think of sets as coun­tries, and the el­e­ments of the sets as cities in that coun­try. A map be­tween the two sets is a map tel­ling you how to go from the cities in one coun­try to the cities in the other coun­try.

But of course, math is a lit­tle more spe­cific than that. So let’s say you have one set and an­other set . What does it mean to define a map—a mor­phism, in fact—from to ?

Well, it’s pretty sim­ple in the end. You have to start from a city in , so one of , or . And you have to end up in a city in , so one of or . Let’s say you go from to . Then the map is just...the set of where you started from and where you ended up. That is, and , re­spec­tively.

That’s it! It’s a short trip. There’s not much to sets in the first place, so there’s not much to maps be­tween them. (Tech­ni­cally, sets have no struc­ture—they’re imag­i­nary bags filled with black dots—and so the maps be­tween them are as sim­ple as a map across a coun­try with no ge­og­ra­phy.) But let’s point out one thing: we do need this map to tell us where we started and where we ended. In a set, the or­der of the el­e­ments doesn’t mean any­thing. For ex­am­ple, the set and the set are liter­ally iden­ti­cal. The only rea­son the el­e­ments are even writ­ten in an or­der at all is be­cause there’s no other way to dis­play text.

To get our el­e­ments in or­der, so that we can show where we started from and where we ended up, we use some­thing called an or­dered pair, which you’ve prob­a­bly seen from do­ing Carte­sian co­or­di­nates. When we have a map tel­ling us to go from to , we rep­re­sent that as an or­dered pair . The or­dered pair means “we started at and ended up at .”

Although sets don’t have or­ders, we can have sets of or­dered pairs (since we can put pretty much any­thing into sets), and in that way we can rep­re­sent or­der in a set. For ex­am­ple, you can have the set con­sist­ing of just the or­dered pair . That would be writ­ten .

So what does it mean to define a map from to ? It means defin­ing a set of or­dered pairs, the first part of the or­dered pair com­ing from the set and the sec­ond part of the or­dered pair com­ing from the set .

That is to say, a map is defined as a set whose el­e­ments are or­dered pairs , where is an el­e­ment in and is an el­e­ment in .

So for ex­am­ple, we could have a map with the fol­low­ing di­rec­tions: This maps says, “If you’re in , go to . If you’re in , go to . If you’re in , go to .” All such maps are called bi­nary re­la­tions, be­cause they define re­la­tion­ships be­tween pairs of things. For ex­am­ple, the map just given defines re­la­tion­ships be­tween and , and , and and .

We could define all sorts of maps based on this defi­ni­tion. You could make “par­tial” maps that tell you where to go if you’re in , but not if you’re in or . You could make “choose your own ad­ven­ture” maps that have fork­ing di­rec­tions, e.g., a map hav­ing both and in them.

What’s the best map? That is un­ques­tion­ably a spe­cial kind of bi­nary re­la­tion known as a func­tion. “Best” might be a strong word, but func­tions have two spe­cial prop­er­ties that have made it the most im­por­tant type of map, and in­deed mor­phism, in all of math­e­mat­ics.

The first prop­erty of func­tions is that they provide you in­struc­tions for go­ing from to for ev­ery “city” in . Let’s move away from the coun­tries-and-cities metaphor and con­sider the study of sci­ence. Think of the el­e­ments of as pos­si­ble states of the world. As sci­en­tists, we’d like a sci­en­tific rule that gives us pre­dic­tions for ev­ery pos­si­ble state of the world, i.e., some­thing that pro­vides a map for ev­ery el­e­ment of . That’s some­thing a func­tion does—this prop­erty is called to­tal.

The sec­ond prop­erty of func­tions is some­thing called uni­valence, which means that you only get one map­ping for ev­ery city you could be start­ing from. That is to say, if your func­tion tells you to do and , it must be the case that and are com­pletely iden­ti­cal, i.e., . Hav­ing a map to two differ­ent cities start­ing from the same city is strictly dis­al­lowed by uni­valence.

Let’s re­late uni­valence to sci­ence as well. Ba­si­cally, it cap­tures the idea of de­ter­minism. If a state of the world yields a par­tic­u­lar out­put, and then you re­set things to that ex­act state of the world again, it had bet­ter yield the same out­put again. I.e., if you have a state of the world , and you ob­serve that yields out­put and also yields out­put , the out­puts and had bet­ter be the same out­put that ac­ci­den­tally got two differ­ent la­bels ap­plied to it.

So be­tween to­tal­ity and uni­valence, a func­tion ba­si­cally cap­tures the idea of “mak­ing de­ter­minis­tic pre­dic­tions for ev­ery­thing.” Which is ex­actly what sci­ence is all about.

We can com­bine to­tal­ity and uni­valence into this sin­gle rule: A func­tion is a set of or­dered pairs where each el­e­ment of shows up in one and only one or­dered pair. That is to say, is definitely in an or­dered pair, as guaran­teed by to­tal­ity. But by uni­valence, it will only show up once: if you have , then you won’t also have .

You should know that while a func­tion can’t give you and un­less , it can to­tally give you and . That would just be say­ing that two states of the world end up at the same point, which cer­tain seems like a sci­en­tific pos­si­bil­ity.

Now we’re go­ing to ad­dress sets and func­tions from a cat­e­gory the­o­retic per­spec­tive. In fact, we’re go­ing to make a cat­e­gory.


Let’s build a cat­e­gory whose ob­jects are sets and whose mor­phisms are func­tions.

The first step is to make sets into ob­jects. We do this by say­ing that sets are ob­jects. Se­ri­ously, it’s that sim­ple—ob­jects don’t re­ally have any prop­er­ties at all aside from their mor­phisms, so there’s noth­ing more to this step.

The next step is to make our func­tions into mor­phisms. We do this by say­ing that func­tions are mor­phisms, and then we check that func­tions obey the rules of mor­phisms in a cat­e­gory.

First, do­main and codomain. Sets serve as the do­main and codomain of func­tions, and since sets are our ob­jects, the func­tions will clearly have the ob­jects of this cat­e­gory as their do­main and codomain.

So the first real thing to figure out is com­po­si­tion. Let’s say we have and as sets, and func­tions and . Com­po­si­tion would re­quire that we have an­other func­tion such that .

Let’s break this re­quire­ment down. The func­tion starts with the el­e­ments in and re­turns some el­e­ments in . That is to say, we have a set of or­dered pairs of the sort , where comes from and comes from . Say that con­sists of and B is . The func­tion al­lows us to speci­fi­cally as­sign an el­e­ment in to each el­e­ment of . That is to say, we can ask, what is the spe­cific (i.e., only one) el­e­ment of cor­re­spond­ing to ? It could be , for ex­am­ple. If so, then . And we can ask the ques­tion next for and . We might reuse el­e­ments of or not. Let’s say the full func­tion gives .

Hav­ing as­signed el­e­ments of to el­e­ments of in this way, we could think of el­e­ments of as “hid­ing in” el­e­ments of . For ex­am­ple, is hid­ing in . (That is to say, the func­tion f is hid­ing in —it doesn’t get to hide there au­to­mat­i­cally.)

Next we have , which speci­fi­cally as­signs an el­e­ment in to each el­e­ment of . Say that con­sists of . Let’s say gives .

Now let’s re­veal our hid­den sol­diers. The el­e­ments and were hid­ing in , and was hid­ing in . Nat­u­rally, they am­bush the sol­diers of like so: .

(Why is this am­bush al­lowed? Be­cause means , and means . Sub­sti­tut­ing for , we have .)

Is that “am­bush” set a func­tion? Yes, it has each el­e­ment of in an or­dered pair, and each el­e­ment is in only one pair. Is it a func­tion ? Yes, the “first part” of each or­dered pair comes from and the “sec­ond part” of each comes from . Can we call this func­tion ? Yes, we just la­bel it that for free. Is this func­tion the same as do­ing first, and then ? Yes, that’s ex­actly what we just saw.

So func­tions com­pose. Check mark.

Now let’s prove that these func­tions com­pose as­so­ci­a­tively. Say you have func­tions . We want to show that .

Let’s plug in an ar­bi­trary el­e­ment of . Our func­tions carry that el­e­ment through to some el­e­ment of , and we want to know if it’s the same re­gard­less of how you group the func­tions. So let’s see if

We know that com­po­si­tion is al­lowed (we’re prov­ing that com­po­si­tion is as­so­ci­a­tive, af­ter all). So let’s com­pose. Say that and . Now we can sim­plify to ask­ing if .

Well, what is ? It’s a map­ping of an el­e­ment from through to , and from through to . And is equal to the path that go­ing through from through to , and from there through to . So over­all, maps an el­e­ment from through to , through to , and through to .

And what is ? It’s a map­ping of an el­e­ment from through to , and from through to . And since is equal to the path that goes from through to , and from there through to . So over­all, maps an el­e­ment from through to E, through to , and through to .

Which is ex­actly what we just said about . Be­cause they’re both car­ry­ing the same el­e­ment through the same func­tions, they have to end up at the same el­e­ment in on pain of vi­o­lat­ing uni­valence. So they’re equal for the case of , and since is an ar­bi­trary el­e­ment in an ar­bi­trary set, it works in gen­eral. Thus, com­po­si­tion is as­so­ci­a­tive.

Fi­nally we need each set to have an iden­tity mor­phism. Be­cause our mor­phisms are func­tions, this will be the iden­tity func­tion. This is as sim­ple as ask­ing whether, for any set , we can define a func­tion that does noth­ing.

Here’s an ex­am­ple of such a func­tion. Say . Then a func­tion would do noth­ing if it gave . That is to say, each el­e­ment just gets mapped back to it­self.

Let’s show that does noth­ing. I.e., if you have a func­tion , the com­po­si­tion is just equal to . Well, duh! The func­tion is a map­ping of el­e­ments from to . So if you keep all the el­e­ments where they were in , then is just go­ing to be what was go­ing to be if you didn’t do any­thing in the first place.

Ob­vi­ously, you can just pair up each el­e­ment with it­self for any set, and that’s go­ing to keep not do­ing any­thing no mat­ter how big the set gets. So ev­ery set has an iden­tity func­tion. (And only one, at that—there’s only one way to pair up each el­e­ment with it­self.)

And now we’re done. We’ve just shown how sets and func­tions can be con­sid­ered a cat­e­gory: sets are ob­jects by dec­la­ra­tion, and we can show that func­tions be­tween sets com­pose as­so­ci­a­tively, and an iden­tity func­tion can be defined for ev­ery set.


You might be won­der­ing if we could have defined a cat­e­gory with sets as ob­jects and bi­nary re­la­tions as mor­phisms. In­deed we can, in pretty much the same way as we did with sets and func­tions. In fact, since func­tions are just par­tic­u­lar types of bi­nary re­la­tions, prov­ing that bi­nary re­la­tions meet the rules of a cat­e­gory in gen­eral would have proved it for the spe­cific case as well. In fact, the cat­e­gory of sets and func­tions is a sub­cat­e­gory of the cat­e­gory of sets and bi­nary re­la­tions.

Yet it is the case that the cat­e­gory of sets and func­tions gets the sig­nifi­cant name Set, whereas the cat­e­gory of sets and bi­nary re­la­tions gets the much less in­ter­est­ing name Rel. That’s be­cause it is the cat­e­gory-the­o­retic per­spec­tive that ev­ery­thing that’s in­ter­est­ing about a cat­e­gory is con­tained in its mor­phisms. Func­tions are ba­si­cally the most im­por­tant mor­phism in math­e­mat­ics, so we give the cat­e­gory for which func­tions are mor­phisms the name Set—we care about sets be­cause, more than any­thing, they let us define func­tions.


One last note on defin­ing the cat­e­gory of sets, Set. You may have heard of Rus­sell’s para­dox, which says you can’t have the set of all sets. That’s be­cause sets have to be defined ac­cord­ing to a rule, as we said in the be­gin­ning. What if you try to define a set of all sets that are not el­e­ments of them­selves? Then one pos­si­bil­ity is that this set is ei­ther an el­e­ment of it­self, in which case it is in­cluded, in which case we have vi­o­lated the rule just laid down. But the only other pos­si­bil­ity is that it is not an el­e­ment of it­self, in which case it should be an el­e­ment of it­self ac­cord­ing to its rule. But we just saw why we can’t let it be an el­e­ment of it­self. So we bounce back and forth in eter­nal para­dox.

So we can’t have a set of all sets. Then how can we have a cat­e­gory of all sets? We’ll dis­cuss that in the next post, which will ad­dress this is­sue and use our new un­der­stand­ing of sets to more prop­erly for­mal­ize the defi­ni­tion of a cat­e­gory given a cou­ple of posts ago.

Ad­di­tion­ally, we’ll look at two other in­ter­est­ing things about sets. We’ll see how, al­though sets are bags of black dots, we can use func­tions to give those black dots mean­ing. It’s the cat­e­gory-the­o­retic view that ev­ery­thing you want to know about an ob­ject is found in the ob­ject’s mor­phisms. Fur­ther­more, we’ll also see that there’s two spe­cial kinds of sets which can be though of as the “min­i­mal” and “max­i­mal” set, re­spec­tively. In do­ing so, we’ll get our first tastes of the two main goals of this se­ries, the Yoneda lemma and ad­joint func­tors.


In­ci­den­tally, writ­ing these posts has been shock­ingly ex­haust­ing and time-ab­sorb­ing. So af­ter this next post, I don’t ex­pect any­thing fur­ther on the week­end, al­though I may try to an­swer com­ments then. Five posts a week is not sus­tain­able. 2-3 a week is prob­a­bly more rea­son­able. This ex­pe­rience has given me a lot of re­spect for peo­ple who keep up daily blogs or write books. Thanks very much to peo­ple read­ing and com­ment­ing on these so far. It’s very use­ful to have feed­back and grat­ify­ing to know that peo­ple are in­ter­ested in cat­e­gory the­ory.

I am go­ing to work as well on low­er­ing the cost to my­self of cre­at­ing draw­ings to aid ex­pla­na­tions. I have always ex­plained this ma­te­rial in the past with easy ac­cess to pen and pa­per, and some­how it never quite oc­curred to me that even sketch­ing out a few dots and ar­rows is much more of a pain on a com­puter. Tak­ing recom­men­da­tions, if you have any.