# Book Review: Naïve Set Theory (MIRI course list)

I’m re­view­ing the books on the MIRI course list. I fol­lowed Cat­e­gory The­ory with Naïve Set The­ory, by Paul R. Hal­mos.

# Naïve Set Theory

This book is tiny, con­tain­ing about 100 pages. It’s quite dense, but it’s not a difficult read. I’ll re­view the con­tent be­fore giv­ing my im­pres­sions.

## Chap­ter List

1. The Ax­iom of Extension

2. The Ax­iom of Specification

3. Unordered Pairs

4. Unions and Intersections

5. Com­ple­ments and Powers

6. Ordered Pairs

7. Relations

8. Functions

9. Families

10. In­verses and Composites

11. Numbers

12. The Peano Axioms

13. Arithmetic

14. Order

15. The Ax­iom of Choice

16. Zorn’s Lemma

17. Well Ordering

18. Trans­finite Recursion

19. Or­di­nal Numbers

20. Sets of Or­di­nal Numbers

21. Or­di­nal Arithmetic

22. The Schröder—Bern­stein Theorem

23. Countable Sets

24. Car­di­nal Arithmetic

25. Car­di­nal Numbers

Nor­mally I’d sum­ma­rize each chap­ter, but chap­ters were about four tiny pages each and the con­tent is mostly de­scribed by the chap­ter name. Zorn’s Lemma states that if all chains in a set have an up­per bound, then the set has a max­i­mal el­e­ment. (This fol­lows from the ax­iom of choice.) The Schröder-Bern­stein The­o­rem states that if X is equiv­a­lent to a sub­set of Y, and Y is equiv­a­lent to a sub­set of X, then X and Y are equiv­a­lent. The other chap­ter ti­tles are self-ev­i­dent.

Each chap­ter pre­sented the con­cepts in a con­cise man­ner, then worked through a few of the im­pli­ca­tions (with proofs), then pro­vided a few short ex­er­cises.

None of the con­cepts within were par­tic­u­larly sur­pris­ing, but it was good to play with them first-hand. Most use­ful was in­ter­act­ing with or­di­nal and car­di­nal num­bers. It was nice to ex­am­ine the ac­tual struc­ture of each type of num­ber (in set the­ory) and deepen my pre­vi­ously-su­perfi­cial knowl­edge of the dis­tinc­tion.

## Discussion

Be­fore div­ing in to the re­view it’s im­por­tant to re­mem­ber that the use­ful­ness of a math text­book is heav­ily de­pen­dent upon your math back­ground. I have a mod­er­ately strong back­ground. Some spe­cific sub­jects (anal­y­sis, type the­ory, group the­ory, etc.) have given me a solid, if in­di­rect, foun­da­tion in set the­ory. This was the first time I stud­ied set the­ory di­rectly, but the con­cepts were hardly new.

### Overview

I was pleased with this book. It is terse. It has ex­er­cises. It gives you in­for­ma­tion and gets out of your way, which is what I like in a text­book: It doesn’t waste your time. I’m about to harp on the book for a spell, but please keep in mind that my over­all feel­ing was pos­i­tive.

Please take these re­views with a grain of salt, as sam­ple size is 1 and I have not read any similar text­books.

### Complaints

• The book was writ­ten in 1960, and it shows. Set the­ory is more ma­ture now than it was then. The au­thors of­ten re­mark on syn­tax that was not yet stan­dard (which is now com­mon­place). The con­tinuum hy­poth­e­sis had not yet been proven un­prov­able in ZFC. The ax­iom of choice is em­braced whole­heart­edly with no dis­cus­sion of weaker var­i­ants. The style of proof differs from the mod­ern style. None of this is bad, per se. In fact, it’s quite a fas­ci­nat­ing time cap­sule: I en­joyed see­ing a slice of math­e­mat­ics from half a cen­tury past. How­ever, I be­lieve a more mod­ern in­tro­duc­tion to set the­ory could have taught me more per­ti­nent math­e­mat­ics in the same amount of time.

• The no­ta­tion is in­con­sis­tent. I’ve long be­lieved that math is a poor and in­con­sis­tent lan­guage. This is ev­i­dent through­out set the­ory. To the au­thor’s credit, they point out many of the in­con­sis­ten­cies: f(A) can re­fer to both a func­tion or a re­stric­tion of a func­tion to the sub­set A of its do­main, 2^w can re­fer to ei­ther func­tions map­ping w onto 2 or a spe­cific or­di­nal num­ber, etc. I am per­son­ally of the opinion that in­tro­duc­tory text­books should en­force a pure & con­sis­tent syn­tax (which may be re­laxed in prac­tice). I was mildly an­noyed with how the au­thors ac­knowl­edged the in­con­sis­ten­cies and then em­braced them, thereby per­pet­u­at­ing a memetic tragedy of the com­mons. (I know that I shouldn’t ex­pect bet­ter, but one can dream.)

• The proofs given were pri­mar­ily in en­glish. Not once did the au­thors write ∃ or ∀. They would re­sort to “for some” or “for any” in largely en­glish-lan­guage proofs. The proofs were rigor­ous (the au­thors tightly re­stricted their en­glish phrases), but I was some­what sur­prised to find the ax­ioms of set the­ory de­scribed in lin­gual (rather than sym­bolic) form.

• Set the­ory is ax­iom soup. I do not view set the­ory as foun­da­tional. Is the ax­iom of choice true? The ques­tion is poorly formed. Ax­ioms are tools to con­strain what you’re talk­ing about. Bet­ter ques­tions are shaped like “does the ax­iom of choice ap­ply to this thing I’m work­ing with?”, or “how does the struc­ture change if we take this state­ment as an ax­iom?”. This sen­ti­ment seems fairly com­mon in mod­ern math­e­mat­ics, but it was lack­ing in Naïve Set The­ory. Ax­ioms were pre­sented as facts, not tools. There was lit­tle ex­plo­ra­tion of each ax­iom, what it cost and what it bought, and what al­ter­nate forms are available.

Most of these gripes are small com­pared to the amount of good data in the book. Re­mem­ber that the book is ti­tled Naïve Set The­ory: a lit­tle naïvety is to be ex­pected. The take­away is that the book was good, but likely could have been bet­ter in light of mod­ern math­e­mat­ics. All in all, the book cov­ers lot of ground at a fast clip, and was quite use­ful.

## Should I learn set the­ory?

As always, it de­pends upon your goals. Set the­ory is ev­ery­where in math­e­mat­ics, and I per­son­ally ap­pre­ci­ated shoring up my foun­da­tions. If you have similar goals, you can eas­ily go through this book in a week if you think that learn­ing set the­ory is worth your time.

I don’t par­tic­u­larly recom­mend set the­ory to arm­chair math­e­mat­i­ci­ans. In my ex­pe­rience, other ar­eas of math­e­mat­ics are much more fun from a ca­sual stand­point. (Group the­ory and in­for­ma­tion the­ory come to mind, if you’re look­ing for a good time.)

## Should I read this book?

Maybe. I have no point of com­par­i­son here. My ten­ta­tive sug­ges­tion is that you should find a more mod­ern (but similarly terse) in­tro­duc­tory text­book and read that in­stead. (If you have a good sug­ges­tion, you should leave it in the com­ments.)

I found this book to be rather ba­sic. If you have a back­ground similar to mine, I recom­mend some­thing a lit­tle more ad­vanced. (Un­for­tu­nately, I can make no recom­men­da­tions. Again, com­ments are wel­come.)

This book seems well-suited for a layper­son in­ter­ested in learn­ing set the­ory. The 1960s feel is definitely fun. I would guess that the book is well-paced for some­one who has done the stan­dard col­lege calcu­lus courses but is un­fa­mil­iar with Set The­ory sub­ject mat­ter.

If you’re go­ing to read the book then I sug­gest read­ing the whole thing. It builds from first prin­ci­ples up to car­di­nal­ity, and noth­ing along the way is unim­por­tant. My only sug­ges­tion is that you swap chap­ter 25 and 24: they ap­pear to have been or­dered in­cor­rectly for poli­ti­cal rea­sons. (The deriva­tion of car­di­nal num­bers used in chap­ter 25 was, at the time, con­tro­ver­sial, so the book pre­sents car­di­nal ar­ith­metic be­fore car­di­nal num­bers.) Other than that, the book was well struc­tured.

## Fi­nal Notes

If a com­pa­rably short-and-sweet text­book writ­ten in the last twenty years can be found, I recom­mend up­dat­ing the sug­ges­tion on the MIRI course list. It’s not clear to me how much raw set the­ory is use­ful in mod­ern AI re­search; my wild guess is that math­e­mat­i­cal logic, model the­ory, and prov­abil­ity the­ory are more im­por­tant. If that is the case, then I think the tech­ni­cal level of this book is ap­pro­pri­ate for the course list: it’s suffi­cient to brush up on the ba­sics, but it doesn’t send you deep into rab­bit holes when there are more in­ter­est­ing top­ics on the hori­zon.

My next re­view will take more time than did the pre­vi­ous four. I have a num­ber of loose ends to tie up be­fore jump­ing in to Model The­ory, and I have much less fa­mil­iar­ity with the sub­ject mat­ter.

• Th­ese book re­views are badass.

That is all.

• Thanks for the great re­view! Your tip to swap 24 and 25 was helpful, as was your warn­ing about in­con­sis­tent no­ta­tion. How­ever, one benefit of “in­con­sis­tent no­ta­tion” is that it re­ally forces you to de­velop a clear un­der­stand­ing.

Over­all, I got a lot out of this. Naive Set The­ory clar­ified a lot of foun­da­tional con­cepts I had pre­vi­ously taken for granted. It also made me crack up at times; for ex­am­ple:

• The slight feel­ing of dis­com­fort that the reader may ex­pe­rience in con­nec­tion with the defi­ni­tion of nat­u­ral num­bers is quite com­mon and in most cases tem­po­rary.

• We want to be told that the suc­ces­sor of 7 is 8, but to be told that 7 is a sub­set of 8 or that 7 is an el­e­ment of 8 is dis­turb­ing.

I per­son­ally found the trick­iest part to be the proof of Zorn’s lemma. So for pos­ter­ity, here’s a sketch of the proof that might be helpful for fol­low­ing the full proof given in the text:

Zorn’s lemma. Let X be a par­tially-or­dered set such that ev­ery chain in X has an up­per bound (in X); then X has a max­i­mal el­e­ment.

Proof sketch.

• Let S be col­lec­tion of weak ini­tial seg­ments of el­e­ments of X, or­dered by set in­clu­sion; show that if S has a max­i­mal set, then X has a max­i­mal element

• Let C be the col­lec­tion of all chains in X, or­dered by set in­clu­sion; show that if C has a max­i­mal set, then S has a max­i­mal set (the text uses a script X in place of C)

• Use the ax­iom of choice to con­struct an “ex­ten­sion” func­tion g on C that ex­tends a non-max­i­mal set by one el­e­ment, and leaves a max­i­mal set unchanged

• Define a spe­cial kind of sub­set of C called a tower

• The defi­ni­tion of a tower is in­cred­ibly clever, and it rigor­ously de­scribes the in­tu­itive idea of “keep adding el­e­ments un­til you get a max­i­mal set”

• Let t be the “small­est pos­si­ble tower” (i.e. the in­ter­sec­tion of all tow­ers), and let A be the union of ev­ery set in t; show that g leaves A unchanged

• Con­clude that C has a max­i­mal set (A); thus S has a max­i­mal set; thus X has a max­i­mal element

Fi­nally, here’s the full list of in­gre­di­ents in the ax­iom soup (note that the Peano “ax­ioms” are ac­tu­ally proved, not taken as ax­ioms):

• Ax­iom of ex­ten­sion (page 2): Two sets are equal if and only if they have the same el­e­ments.

• Ax­iom of speci­fi­ca­tion (page 6): To ev­ery set A and to ev­ery con­di­tion S(x) there cor­re­sponds a set B whose el­e­ments are ex­actly those el­e­ments x of A for which S(x) holds.

• Ax­iom of pairing (page 9): For any two sets there ex­ists a set that they both be­long to.

• Ax­iom of unions (page 12): For ev­ery col­lec­tion of sets there ex­ists a set that con­tains all the el­e­ments that be­long to at least one set of the given col­lec­tion.

• Ax­iom of pow­ers (page 20): For each set there ex­ists a col­lec­tion of sets that con­tains among its el­e­ments all the sub­sets of the given set.

• Ax­iom of in­finity (page 44): There ex­ists a set con­tain­ing 0 and con­tain­ing the suc­ces­sor of each of its el­e­ments.

• Ax­iom of choice (page 59): The Carte­sian product of a non-empty fam­ily of non-empty sets is non-empty.

• Ax­iom of sub­sti­tu­tion (page 75): If S(a, b) is a sen­tence such that for each el­e­ment a in the set A the set {b : S(a, b)} can be formed, then there ex­ists a func­tion F with do­main A such that F(a) = {b : S(a, b)} for each a in A.

• I used your ax­iom list and Zorn’s lemma proof sketch to make Mnemosyne cards. Thanks a bunch!

• Thanks for the re­view.

A more re­cent book on Set The­ory: Ba­sic Set The­ory—A. Shen, In­de­pen­dent Univer­sity of Moscow, and N. K. Vereshcha­gin, Moscow State Lomonosov Univer­sity—AMS, 2002, 116 pp., Soft­cover, ISBN-10: 0-8218-2731-6, ISBN-13: 978-0-8218-2731-4, List: US\$24, All AMS Mem­bers: US\$19.20, STML/​17

I found it in the Amer­i­can Math­e­mat­i­cal So­ciety for Stu­dent’s se­ries, which is highly recom­mended on math­overflow.com: http://​​www.ams.org/​​book­store/​​stmlseries

• Thanks for the recom­men­da­tion! We’ll check it out.

• I no­ticed that the course list doesn’t cover sev­eral top­ics that are pop­u­lar on LW. Some sug­ges­tions:

Game the­ory—Fu­den­berg and Tirole

K-com­plex­ity—Li and Vitanyi

Causal­ity—Pearl

And maybe some­thing on cryp­tog­ra­phy, but I don’t know enough about it to recom­mend a good book.

• For cryp­tog­ra­phy I would recom­mend Fer­gu­son, Sch­neier, & Kohno’s Cryp­tog­ra­phy Eng­ineer­ing. It’s aimed at en­g­ineers so it’s not so much the math-ori­ented text that you might ex­pect from a MIRI course list, but that’s very much on pur­pose by the au­thors and in my recom­men­da­tion. The prin­ci­ple ap­pli­ca­tion of cryp­tog­ra­phy to friendly AI the­ory is the prag­matic dis­ci­pline of de­sign­ing and im­ple­ment­ing se­cure pro­to­cols. Most of the les­sons to be learned here is not in the math, but rather the right ad­ver­sar­ial mind­set for think­ing about se­cu­rity prob­lems—what Sch­neier calls “pro­fes­sional para­noia.” Im­part­ing this mind­set on new learn­ers of the field was a driv­ing fac­tor for the au­thors in writ­ing this text­book.

Be­sides, un­less you are a pro­fes­sional cryp­tog­ra­pher, you should not be de­sign­ing your own crypto pro­to­cols. And un­less you have sig­nifi­cant peer re­view, you should not be us­ing them. The key is to un­der­stand the ba­sic fun­da­men­tals of the field, in­ter­nal­ize the ad­ver­sar­ial mind­set, and then learn enough math (mostly group the­ory) to read the aca­demic pa­pers di­rectly.

• Do you think Causal­ity is a su­pe­rior recom­men­da­tion to Prob­a­bil­is­tic Graph­i­cal Models?

• The ma­te­rial cov­ered in Causal­ity is more like a sub­set of that in PGM. PGM is like an en­cy­clo­pe­dia, and Causal­ity is a com­pre­hen­sive in­tro­duc­tion to one ap­pli­ca­tion of PGMs.

• Thanks. That was what I thought, but I haven’t read Causal­ity yet.

• I haven’t read PGM. Maybe you could ask Ilya Sh­pitser, he knows this stuff much bet­ter than I do.

• Not once did the au­thors write ∃ or ∀.

Use of these sym­bols is weakly dis­cour­aged in pub­lished math­e­mat­i­cal writ­ing, as is the use of log­i­cal con­nec­tives such as ∧,∨, and ⇒. The sen­ti­ment seems to be that you gen­er­ally shouldn’t use sym­bols from for­mal logic un­less you are ac­tu­ally writ­ing out for­mu­las within an ex­plic­itly es­tab­lished for­mal log­i­cal the­ory, with ex­plic­itly es­tab­lish rules of syn­tax and in­fer­ence.

• In those terms, what sur­prised me was that the au­thors did not ex­plic­itly es­tab­lish a for­mal log­i­cal the­ory of sets. (I also ex­pected ex­plicit syn­tax and in­fer­ence in the proofs.) Is for­mal-log­i­cal set the­ory frowned upon as well?

• As I un­der­stand the phrase, it wouldn’t be “naive set the­ory” if they did that.

• Ah, I didn’t know that that “naive” car­ried the con­no­ta­tion of “non-for­mal” in this con­text. This is good to know, thanks.

• In my ex­pe­rience, “We’re do­ing naive set the­ory” means some­thing like, “We’ll as­sume, with­out fur­ther jus­tifi­ca­tion, that no Rus­sell-style para­dox ap­plies to any pred­i­cate P where we will ac­tu­ally want to write {x : P(x)}. We’ll just as­sume the ex­is­tence of a set an­swer­ing to this de­scrip­tion for any P that we need. We know that there are pred­i­cates for which this is not al­lowed, but we’ll just hope that ev­ery­thing works out okay in the cases where we do it.”

The phrase “naive set the­ory” also con­notes a cer­tain cav­a­lier­ness about whether the el­e­ments in one’s sets are them­selves con­structed out of sets (as in ZF) or whether in­stead one is work­ing with ure­le­ments (ob­jects in sets that are not them­selves sets).

• This sen­ti­ment seems fairly com­mon in mod­ern math­e­mat­ics, but it was lack­ing in Naïve Set The­ory. Ax­ioms were pre­sented as facts, not tools. There was lit­tle ex­plo­ra­tion of each ax­iom, what it cost and what it bought, and what al­ter­nate forms are available

Ac­tu­ally that’s a quite wide­spread at­ti­tude in the set the­ory com­mu­nity, not only in that book. Just con­sider that Hamk­ins’ pro­posal of a mul­ti­verse (that is, a plu­ral­ity of mod­els for the ax­ioms of set the­ory), is con­sid­ered con­tro­ver­sial. Maybe in­fluen­tial to this state of af­fairs was a more pla­ton­ist ap­proach of the founders, who re­garded ZF(C) as a way to de­scribe the in­tu­ititve con­cept of set in­stead of just an­other for­mal tool.

• One of the rea­sons I think Car­nap is un­der­rated (though there’s a wel­come re­vival of late); already way back in the 30s he was preach­ing “in logic there are no morals!”

• Your re­views are fun to read, and as soon as I can get time be­tween as­sign­ments and tests to get into a few of the MIRI books I will try to see if I can get through these as well. Thanks for mak­ing the effort to write about these.

• Nice re­view! I am ac­tu­ally read­ing through this one now. I’ve always felt like set the­ory is one of those one-point won­ders of sci­ence—dig­ging in deeply doesn’t give you much benefit, but the ba­sic stuff is the stuff you are go­ing to run into pretty much ev­ery­where. Guess I’ll have to see what I think af­ter I read all the way through.