Book Review: Naive Set Theory (MIRI research guide)

I’m David. I’m read­ing through the books in the MIRI re­search guide and will write a re­view for each as I finish them. By way of in­spira­tion from how Nate did it.

Naive Set Theory

Hal­mos Naive Set The­ory is a clas­sic and dense lit­tle book on ax­io­matic set the­ory, from a “naive” per­spec­tive.

Which is to say, the book won’t dig to the depths of for­mal­ity or philos­o­phy, it fo­cuses on get­ting you pro­duc­tive with set the­ory. The point is to give some­one who wants to dig into ad­vanced math­e­mat­ics a foun­da­tion in set the­ory, as set the­ory is a fun­da­men­tal tool used in a lot of math­e­mat­ics.

Summary

Is it a good book? Yes.

Would I recom­mend it as a start­ing point, if you would like to learn set the­ory? No. The book has a terse pre­sen­ta­tion which makes it tough to di­gest if you aren’t already fa­mil­iar with propo­si­tional logic, per­haps set the­ory to some ex­tent already and a bit of ad­vanced math­e­mat­ics in gen­eral. There are plenty of other books that can get you started there.

If you do have a some­what fit­ting back­ground, I think this should be a very com­pe­tent pick to deepen your un­der­stand­ing of set the­ory. The au­thor shows you the nuts and bolts of set the­ory and doesn’t waste any time do­ing it.

Per­spec­tive of this review

I will first re­fer you to Nate’s re­view, which I found to be a lu­cid take on it. I don’t want to be re­dun­dant and re­peat the good points made there, so I want to fo­cus this re­view on the per­spec­tive of some­one with a bit weaker back­ground in math, and try to give some help to prospec­tive read­ers with parts I found tricky in the book.

What is my per­spec­tive? While I’ve always had a knack for math, I only read about 2 months of math­e­mat­ics at in­tro­duc­tory uni­ver­sity level, and not in­clud­ing dis­crete math­e­mat­ics. I do have a thor­ough back­ground in soft­ware de­vel­op­ment.

Set the­ory has eluded me. I’ve only picked up frag­ments. It’s seemed very fun­da­men­tal but school never gave me a good op­por­tu­nity to learn it. I’ve wanted to un­der­stand it, which made it a joy to add Naive Set The­ory to the top of my read­ing list.

How I read Naive Set Theory

Start­ing on Naive Set The­ory, I quickly re­al­ized I wanted more meat to the ex­pla­na­tions. What is this con­cept used for? How does it fit in to the larger sub­ject of math­e­mat­ics? What the heck is the au­thor ex­press­ing here?

I sup­ple­mented heav­ily with wikipe­dia, math.stack­ex­change and other web­sites. Some­times, I read other sources even be­fore read­ing the chap­ter in the book. At two points, I laid down the book in or­der to finish two other books. The first was Gödel’s Proof, which handed me some friendly ex­am­ples of propo­si­tional logic. I had started read­ing it on the side when I re­al­ized it was con­tex­tu­ally use­ful. The sec­ond was Con­cepts of Modern Math­e­mat­ics, which gave me much of the larger math­e­mat­i­cal con­text that Naive Set The­ory didn’t.

Con­se­quently, while read­ing Naive Set The­ory, I spent at least as much time read­ing other sources!

A bit into the book, I started strug­gling with the ex­er­cises. It sim­ply felt like I hadn’t been given all the tools to at­tempt the task. So, I con­cluded I needed a bet­ter in­tro­duc­tion to math­e­mat­i­cal proofs, or­dered some books on the sub­ject, and post­poned in­vest­ing into the ex­er­cises in Naive Set The­ory un­til I had got­ten that in­tro­duc­tion.

Chapters

In gen­eral, if the book doesn’t offer you enough ex­pla­na­tion on a sub­ject, search the In­ter­net. Wikipe­dia has nu­mer­ous com­pe­tent ar­ti­cles, math.stack­ex­change is overflow­ing with con­tent and there’s plenty ad­di­tional sources available on the net. If you get stuck, do try play­ing around with ex­am­ples of sets on pa­per or in a text file. That’s uni­ver­sal ad­vice for math.

I’ll fol­low with some key points and some high­lights of things that tripped me up while read­ing the book.

Ax­iom of extension

The ax­iom of ex­ten­sion tells us how to dis­t­in­guish be­tween sets: Sets are the same if they con­tain the same el­e­ments. Differ­ent if they do not.

Ax­iom of specification

The ax­iom of speci­fi­ca­tion al­lows you to cre­ate sub­sets by us­ing con­di­tions. This is pretty much what is done ev­ery time set builder no­ta­tion is em­ployed.

Puz­zled by the bit about Rus­sell’s para­dox at the end of the chap­ter? http://​​math.stack­ex­change.com/​​ques­tions/​​651637/​​rus­sells-para­dox-in-naive-set-the­ory-by-paul-halmos

Unordered pairs

The ax­iom of pairs al­lows one to cre­ate a new set that con­tains the two origi­nal sets.

Unions and intersections

The ax­iom of unions al­lows one to cre­ate a new set that con­tains all the mem­bers of the origi­nal sets.

Com­ple­ments and powers

The ax­iom of pow­ers al­lows one to, out of one set, cre­ate a set con­tain­ing all the differ­ent pos­si­ble sub­sets of the origi­nal set.

Get­ting tripped up about the “for some” and “for ev­ery” no­ta­tion used by Hal­mos? Wel­come to the club:
http://​​math.stack­ex­change.com/​​ques­tions/​​887363/​​ax­iom-of-unions-and-its-use-of-the-ex­is­ten­tial-quan­tifier­
http://​​math.stack­ex­change.com/​​ques­tions/​​1368073/​​or­der-of-eval­u­a­tion-in-con­di­tions-in-set-the­o­ry

Us­ing nat­u­ral lan­guage rather than log­i­cal no­ta­tion is com­m­mon prac­tice in math­e­mat­i­cal text­books. You’d bet­ter get used to it:
http://​​math.stack­ex­change.com/​​ques­tions/​​1368531/​​why-there-is-no-sign-of-logic-sym­bols-in-math­e­mat­i­cal-texts

The ex­is­ten­tial quan­tifiers tripped me up a bit be­fore I ab­sorbed it. In math, you can freely ex­press some­thing like “Out of all pos­si­ble x ever, give me the set of x that fulfill this con­di­tion”. In pro­gram­ming lan­guages, you tend to have to be much more… spe­cific, in your state­ments.

Ordered pairs

Carte­sian prod­ucts are used to rep­re­sent plenty of math­e­mat­i­cal con­cepts, no­tably co­or­di­nate sys­tems.

Relations

Equiv­alence re­la­tions and equiv­alence classes are im­por­tant con­cepts in math­e­mat­ics.

Functions

Hal­mos is us­ing some dated ter­minol­ogy and is in my eyes a bit in­con­sis­tent here. In mod­ern us­age, we have: in­jec­tive, sur­jec­tive, bi­jec­tive and func­tions that are none of these. Bi­jec­tive is the com­bi­na­tion of be­ing both in­jec­tive and sur­jec­tive. Re­place Hal­mos’ “onto” with sur­jec­tive, “one-to-one” with in­jec­tive, and “one-to-one cor­re­spon­dence” with bi­jec­tive.

He also con­fused me with his ex­pla­na­tion of “char­ac­ter­is­tic func­tion”—you might want to check an­other source there.

Families

This chap­ter tripped me up heav­ily be­cause Hal­mos mixed in three things at the same time on page 36: 1. A con­fus­ing way of talk­ing about sets. 2. Con­voluted proof. 3. n-ary carte­sian product.

Fam­i­lies are an al­ter­na­tive way of talk­ing about sets. An in­dexed fam­ily is a set, with an in­dex and a func­tion in the back­ground. A fam­ily of sets means a col­lec­tion of sets, with an in­dex and a func­tion in the back­ground. For Hal­mos build-up to n-ary carte­sian prod­ucts, the deal seems to be that he teases out or­der with­out ex­plic­itly us­ing or­dered pairs. Golf clap. Try this one for the math.se treat­ment: http://​​math.stack­ex­change.com/​​ques­tions/​​312098/​​carte­sian-prod­ucts-and-families

In­verses and composites

The in­verses Hal­mos defines here are more gen­eral than the in­verse func­tions de­scribed on wikipe­dia. Hal­mos’ in­verses work even when the func­tions are not bi­jec­tive.

Numbers

The ax­iom of in­finity states that there is a set of the nat­u­ral num­bers.

The Peano axioms

The peano ax­ioms can be mod­eled on the the set-the­o­retic ax­ioms. The re­cur­sion the­o­rem guaran­tees that re­cur­sive func­tions ex­ist.

Arithmetic

The prin­ci­ple of math­e­mat­i­cal in­duc­tion is put to heavy use in or­der to define ar­ith­metic.

Order

Par­tial or­ders, to­tal or­ders, well or­ders—are pow­er­ful math­e­mat­i­cal con­cepts and are used ex­ten­sively.

Some help on the way:
http://​​math.stack­ex­change.com/​​ques­tions/​​1047409/​​sole-min­i­mal-el­e­ment-why-not-also-the-min­i­mum
http://​​math.stack­ex­change.com/​​ques­tions/​​367583/​​ex­am­ple-of-par­tial-or­der-thats-not-a-to­tal-or­der-and-why­
http://​​math.stack­ex­change.com/​​ques­tions/​​225808/​​is-my-un­der­stand­ing-of-an­ti­sym­met­ric-and-sym­met­ric-re­la­tions-cor­rec­t
http://​​math.stack­ex­change.com/​​ques­tions/​​160451/​​differ­ence-be­tween-supre­mum-and-max­i­mum

Also, keep in mind that in­finite sets like sub­sets of w can muck up ex­pec­ta­tions about or­der. For ex­am­ple, a to­tally or­dered set can have mul­ti­ple el­e­ments with­out a pre­de­ces­sor.

Ax­iom of choice

The ax­iom of choice lets you, from any col­lec­tion of non-empty sets, se­lect an el­e­ment from ev­ery set in the col­lec­tion. The ax­iom is nec­es­sary to do these kind of “choices” with in­finite sets. In finite cases, one can con­struct func­tions for the job us­ing the other ax­ioms. Though, the ax­iom of choice of­ten makes the job eas­ier in finite cases so it is used where it isn’t nec­es­sary.

Zorn’s lemma

Zorn’s lemma is used in similar ways to the ax­iom of choice—mak­ing in­finite many choices at once—which per­haps is not very strange con­sid­er­ing ZL and AC have been proven to be equiv­a­lent.

robot-dreams offers some help in fol­low­ing the mas­sive proof in the book.

Well ordering

A well-or­dered set is a to­tally or­dered set with the ex­tra con­di­tion that ev­ery non-empty sub­set of it has a small­est el­e­ment. This ex­tra con­di­tion is use­ful when work­ing with in­finite sets.

The prin­ci­ple of trans­finite in­duc­tion means that if the pres­ence of all strict pre­de­ces­sors of an el­e­ment always im­plies the pres­ence of the el­e­ment it­self, then the set must con­tain ev­ery­thing. Why does this mat­ter? It means you can make con­clu­sions about in­finite sets be­yond w, where math­e­mat­i­cal in­duc­tion isn’t suffi­cient.

Trans­finite recursion

Trans­finite re­cur­sion is an analogue to the or­di­nary re­cur­sion the­o­rem, in a similar way that trans­finite in­duc­tion is an analogue to math­e­mat­i­cal in­duc­tion—re­cur­sive func­tions for in­finite sets be­yond w.

In mod­ern lingo, what Hal­mos calls a “similar­ity” is an “or­der iso­mor­phism”.

Or­di­nal numbers

The ax­iom of sub­sti­tu­tion is called the ax­iom (schema) of re­place­ment in mod­ern use. It’s used for ex­tend­ing count­ing be­yond w.

Sets of or­di­nal numbers

The count­ing the­o­rem states that each well or­dered set is or­der iso­mor­phic to a unique or­di­nal num­ber.

Or­di­nal arithmetic

The mis­be­hav­ior of com­mu­ta­tivity in ar­ith­metic with or­di­nals tells us a nat­u­ral fact about or­di­nals: if you tack on an el­e­ment in the be­gin­ning, the re­sult will be or­der iso­mor­phic to what it is with­out that el­e­ment. If you tack on an el­e­ment at the end, the set now has a last el­e­ment and is thus not or­der iso­mor­phic to what you started with.

The Schröder-Bern­stein theorem

The Schröder-Bern­stein the­o­rem states that if X dom­i­nates Y, and Y dom­i­nates X, then X ~ Y (X and Y are equiv­a­lent).

Countable sets

Can­tor’s the­o­rem states that ev­ery set always has a smaller car­di­nal num­ber than the car­di­nal num­ber of its power set.

Car­di­nal arithmetic

Read this chap­ter af­ter Car­di­nal num­bers.

Car­di­nal ar­ith­metic is an ar­ith­metic where just about all the stan­dard op­er­a­tors do noth­ing (be­yond the finite cases).

Car­di­nal numbers

Read this chap­ter be­fore Car­di­nal ar­ith­metic.

The con­tinuum hy­poth­e­sis as­serts that there is no car­di­nal num­ber be­tween that of the nat­u­ral num­bers and that of the re­als. The gen­er­al­ized con­tinuum hy­poth­e­sis as­serts that, for all car­di­nal num­bers in­clud­ing aleph-0 and be­yond aleph-0, the next car­di­nal num­ber in the se­quence is the power set of the pre­vi­ous one.

Con­clud­ing reflections

I am at the same time hum­bled by the sub­ject and em­pow­ered by what I’ve learned in this epi­sode. Math­e­mat­ics is a truly vast and deep field. To build a solid foun­da­tion in proofs, I will now go through one or two books about math­e­mat­i­cal proofs. I may re­turn to Naive Set The­ory af­ter that. If any­one is in­ter­ested, I could post my im­pres­sions of other math­e­mat­i­cal books I read.

I think Naive Set The­ory wasn’t the op­ti­mal book for me at the stage I was. And I think Naive Set The­ory prob­a­bly should be re­placed by an­other in­tro­duc­tory book on set the­ory in the MIRI re­search guide. But that’s a small com­plaint on an ex­cel­lent doc­u­ment.

If you seek to get into a new field, know the pre­req­ui­sites. Build your knowl­edge in solid steps. Which I guess, some­times re­quires that you do test your limits to find out where you re­ally are.

The next book I start on from the re­search guide is bound to be Com­putabil­ity and Logic.

• When I first worked through this book, it didn’t re­sult in long-term re­ten­tion of the ma­te­rial (I’m sure some peo­ple will be able to man­age, just not me, not with­out med­i­tat­ing on it much longer than it takes to work through or set­ting up a spaced rep­e­ti­tion sys­tem). In that re­spect, En­der­ton’s Ele­ments of Set The­ory worked much bet­ter. En­der­ton’s book goes into more de­tail, giv­ing enough time to ex­er­cise in­tu­ition about stan­dard proofs. At the same time, it’s an eas­ier read, which might be helpful if Hal­mos’s text seems difficult.

• In gen­eral, read­ing about the same sub­ject from a differ­ent au­thor is a great way to learn and re­tain the ma­te­rial bet­ter. This is true even if nei­ther au­thor is ob­jec­tively “bet­ter” than the other. Some­thing about rec­og­niz­ing the same un­der­ly­ing con­cept ex­pressed in differ­ent words helps to fix that con­cept in the mind.

It’s pos­si­ble to ex­ploit this phe­nomenon even when you have only one text to work with. One trick I use when work­ing through a math text is to willfully use differ­ent no­ta­tion in my notes next to the text. Us­ing a differ­ent no­ta­tion forces me to make sure that I’m re­ally fol­low­ing the de­tails of the ar­gu­ment. Ex­press­ing the same logic in differ­ent sym­bols makes it eas­ier to see through those sym­bols to the un­der­ly­ing logic.

• The au­thor of the Teach Your­self Logic study guide agrees with you about read­ing mul­ti­ple sources:

I very strongly recom­mend tack­ling an area of logic (or in­deed any new area of math­e­mat­ics) by read­ing a se­ries of books which over­lap in level (with the next one cov­er­ing some of the same ground and then push­ing on from the pre­vi­ous one), rather than try­ing to pro­ceed by big leaps.

In fact, I prob­a­bly can’t stress this ad­vice too much, which is why I am high­light­ing it here. For this ap­proach will re­ally help to re­in­force and deepen un­der­stand­ing as you re-en­counter the same ma­te­rial from differ­ent an­gles, with differ­ent em­phases.

• Thanks for the tip. Two other books on the sub­ject that seem to be ap­pre­ci­ated are In­tro­duc­tion to Set The­ory by Karel Hr­bacek and Clas­sic Set The­ory: For Guided In­de­pen­dent Study by Derek Goldrei.

Edit: math.se weighs in: http://​​math.stack­ex­change.com/​​a/​​264277/​​255573

• I finished this book about four months ago, and time is mak­ing me in­creas­ingly glad that I read it. In par­tic­u­lar, its treat­ment of countable in­fini­ties, func­tions, proof by in­duc­tion, and the Peano ax­ioms have been worth their weight in gold. When I en­counter similar sub­jects ‘out in the wild’, I can ap­proach them with rel­a­tive skill and trust my in­tu­itions in a way that I couldn’t be­fore. It’s re­ally grow­ing on me.

That said, as a near-in­tro­duc­tion to set the­ory, it was a very difficult read at times. It was a treat­ment of math­e­mat­ics far deeper than I had come to ex­pect from my uni­ver­sity courses (which were largely in con­tin­u­ous math­e­mat­ics, ac­cord­ing to an­cient en­g­ineers’ tra­di­tion). If school has trained you to ap­proach math­e­mat­i­cal sub­jects as a tool the same way it did me, you’ll need to ad­just your ex­pec­ta­tions. This book is about vir­tu­os­ity, not just sur­vey­ing the tools.

• The in­verses Hal­mos defines here are more gen­eral than the in­verse func­tions de­scribed on wikipe­dia. Hal­mos’ in­verses work even when the func­tions are not bi­jec­tive.

I be­lieve that what you are speak­ing of here is Hal­mos’s dis­course on what are called these days “images and preimages” or “in­verse images”. I found the sub­tle differ­ence be­tween these and in­verse func­tions proper an­noy­ing when I was learn­ing proof writ­ing, so let me illus­trate the con­cept, so that we have a caveat emp­tor for the bud­ding math­e­mat­i­cian.

Take the sets A = {0, 1} and B = {2}, and define a func­tion f: A → B as f(x) = 2 for what­ever x in A you throw in there.

Then,

• f(0) = 2, of course.

• f(1) = 2, as well.

• f(A) = {2}, which is the image of the whole set A “un­der” the func­tion f.

• f^{-1}(B) = {0, 1}, which is the pre-image of the whole set of B un­der f. Mean­ing, “any­thing I can throw into f, from A, to get some­thing in B”.

• f^{-1}(2) , how­ever, is mean­ingless, at least as far as func­tions go. Func­tions can only re­turn one thing, so how would you de­cide whether f^{-1}(2) should give you back 0 or 1?

If you say f^{-1}(2) should give back both, well, now you’re not deal­ing with an in­verse func­tion any more, you’re deal­ing with the in­verse re­la­tion. You can in fact deal with that, with some other tools in the book

Th­ese are more gen­eral, which is nice, but I’ve found that in a rigor­ous en­vi­ron­ment it won’t do to de­scribe them with the same lan­guage you use with func­tions. You re­ally want to toy with these gen­tly, if you can.

• Are you sure that by “one-to-one” Hal­mos means “bi­jec­tive”? A more com­mon us­age is for it to mean “in­jec­tive”. (But I don’t have NST and maybe he has an un­usual idiom.)

• There is a con­ven­tion ac­cord­ing to which a one-to-one func­tion is in­jec­tive, while a one-to-one cor­re­spon­dence is an in­jec­tive func­tion that is also sur­jec­tive, ie, a bi­jec­tion. (I don’t know whether Hal­mos uses this con­ven­tion.)

• Oh yes, for sure, but the con­text here was a state­ment that “onto” means sur­jec­tive while “one-to-one” means bi­jec­tive. Definitely talk­ing func­tions. And I would be re­ally sur­prised if Hal­mos were us­ing “one-to-one” fol­lowed by any­thing other than “cor­re­spon­dence” to mean bi­jec­tive.

• You guys must be right. And wikipe­dia cor­rob­o­rates. I’ll edit the post. Thanks.

• Looks to me like Hal­mos does in­tend “one-to-one” to mean “in­jec­tive”. What he writes is “A func­tion that always maps dis­tinct el­e­ments onto dis­tinct el­e­ments is called one-to-one (usu­ally a one-to-one cor­re­spon­dence).” Then he men­tions in­clu­sion maps as ex­am­ples of one-to-one func­tions.

• My two main sources of con­fu­sion in that sen­tence are:

1. He says “dis­tinct el­e­ments onto dis­tinct el­e­ments”, which sug­gests both in­jec­tion and sur­jec­tion.

2. He says “is called one-to-one (usu­ally a one-to-one cor­re­spon­dence)”, which might sug­gest that “one-to-one” and “one-to-one cor­re­spon­dence” are syn­onyms—since that is what he usu­ally uses the paran­the­ses for when nam­ing con­cepts.

I find Hal­mos some­what con­tra­dic­tory here.

But I’m con­vinced you’re right. I’ve ed­ited the post. Thanks.

• It is some­what con­fus­ing, but re­mem­ber that sru­jec­tivity is defined with re­spect to a par­tic­u­lar codomain; a func­tion is sur­jec­tive if its range is equal to its codomain, and thus whether it’s sur­jec­tive de­pends on what its codomain is con­sid­ered to be; ev­ery func­tion maps its do­main onto its range. “f maps X onto Y” means that f is sur­jec­tive with re­spect to Y”. So, for in­stance, the ex­po­nen­tial func­tion maps the real num­bers onto the pos­i­tive real num­bers. It’s sur­jec­tive with re­spect to pos­i­tive real num­bers*. Say­ing “the ex­po­nen­tial func­tion maps real num­bers onto real num­bers” would not be cor­rect, be­cause it’s not sur­jec­tive with re­spect to the en­tire set of real num­bers. So say­ing that a one-to-one func­tion maps dis­tinct el­e­ments onto a set of dis­tinct el­e­ments can be con­sid­ered to be cor­rect, albeit not as clear as say­ing “to” rather than “onto”. It also suffer from a lack of clar­ity in that it’s not clear what the “always” is sup­posed to range over; are there func­tions that some­times do map dis­tinct el­e­ments to dis­tinct el­e­ments, but some­times don’t?