Naive Set Theory, Halmos
Epistemic status: Unorganized thoughts of mine generated after reading this book—not a concerted attempt to convey any single particular mathematical insight.
Set theory is the mathematical study of collections. These collections—sets—can be either the collection of nothing (the empty set), or collections… of other collections. Countries, for example, are sets of people, making the United Nations a set of sets of people.
Somewhat surprisingly, just about all the rest of mathematics can be framed as the study of sets (of sets of sets of sets… ). Knowing some set theory lets you translate the myriad objects and claims that you encounter elsewhere in the various areas of math into more set theory, putting math on one common intellectual foundation.
Contents and Notes
1. The Axiom of Extension
2. The Axiom of Specification
All the basic principles of set theory, except only the axiom of extension, are designed to make new sets out of old ones (p. 4).
Reading this book, I picked up the idea that sets with arbitrary properties aren’t simply assumed to exist in math. As Halmos puts it,
It is impossible, especially in mathematics, to get something for nothing. To specify a set, it is not enough to pronounce some magic words (which may form a sentence such as “$x \notin x$”); it is necessary also to have at hand a set to whose elements the magic words apply (p. 7).
Another related “correctly orienting towards mathematics” maxim along these lines is: “assume nothing, prove nothing.” You don’t do math without any premises, because nothing interesting follows from no premises at all. Nor do you freely assume arbitrary premises and see what follows deductively from them. Sets in set theory are not generally assumed to exist—with the sole exception of the axiom of infinity, our existential starting point. Sets in set theory are instead constructed in steps, from the axiom of infinity, using the existential conditionals of the remaining axioms of set theory.
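As a rough illustration of the axiom of specification, here is a small Python sketch. The names `A` and `is_even` are made up for this example and do not come from the book. The point is only that the comprehension needs an existing collection to filter, much as the "magic words" need a set to apply to.

```python
# Hypothetical example: A plays the role of the set we already have at hand.
A = frozenset(range(10))

def is_even(x):
    """The 'magic words': a condition S(x) on elements of A."""
    return x % 2 == 0

# Specification: B = {x in A : S(x)}. Without A there is nothing to filter.
B = frozenset(x for x in A if is_even(x))

print(sorted(B))
```

The result is always a subset of the set you started from, which is the sense in which specification only carves sets out of sets that already exist.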
3. Unordered Pairs
4. Unions and Intersections
5. Complements and Powers
6. Ordered Pairs
The ordered pair of $a$ and $b$, with first coordinate $a$ and second coordinate $b$, is the set $(a, b)$ defined by $(a, b) = \{\{a\}, \{a, b\}\}$.
Sets do not natively track the ordering of their elements: $\{a, b\} = \{b, a\}$. We can introduce the notion of ordering of elements, though, in the above fashion.
You can begin to see from the above treatment how something like a shape in the Cartesian plane can be formalized as an unwieldy set.
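The ordered-pair construction can be tried out directly with Python's `frozenset`. This is only a sketch; the function name `ordered_pair` is my own choice.

```python
def ordered_pair(a, b):
    """Kuratowski-style pair (a, b) = {{a}, {a, b}}, built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Plain sets ignore the order in which elements are listed...
assert frozenset({1, 2}) == frozenset({2, 1})

# ...but the pair construction distinguishes first from second coordinate.
assert ordered_pair(1, 2) != ordered_pair(2, 1)

# When both coordinates agree, the pair collapses to {{a}}.
assert ordered_pair(1, 1) == frozenset({frozenset({1})})
```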
Functions can be understood as sets too. Think of a function as a database (potentially infinitely large) of ordered pairs $(x, y)$, with $x$ representing elements in its domain and $y$ elements in its range. $x$ and $y$ can be anything, so as long as $x$ and $y$ themselves can be formalized as sets, so can the functions defined on them.
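Here is a minimal sketch of the "function as a database of ordered pairs" picture, using Python tuples as a stand-in for set-theoretic ordered pairs. The helper names `is_function` and `apply` are hypothetical, chosen for this example.

```python
def is_function(pairs, domain):
    """True if `pairs` relates every element of `domain` to exactly one value."""
    firsts = [x for x, _ in pairs]
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))

def apply(pairs, x):
    """Look up the unique y such that (x, y) is in the set of pairs."""
    (y,) = [v for u, v in pairs if u == x]
    return y

domain = {0, 1, 2, 3}
square = {(x, x * x) for x in domain}   # one pair per input: a function
not_a_function = {(0, 1), (0, 2)}       # 0 is paired with two values

assert is_function(square, domain)
assert not is_function(not_a_function, {0})
assert apply(square, 3) == 9
```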
Since this is not the appropriate occasion for lengthy heuristic preliminaries, we shall proceed directly to the definition, even at risk of temporarily shocking or worrying some readers. Here it is: we define $0$, $1$, and $2$ by writing $0 = \emptyset$, $1 = \{\emptyset\}$, $2 = \{\emptyset, \{\emptyset\}\}$.
In other words, $0$ is empty, $1$ is the singleton $\{0\}$, and $2$ is the pair $\{0, 1\}$. Observe that there is some method in this apparent madness; the number of elements in the sets $0$, $1$, or $2$ … is, respectively, zero, one, or two (pp. 32-3).
Thus, natural numbers are constructable as sets, and so are straightforward functions from the naturals to the naturals.
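The first few numbers can be built as nested `frozenset`s to check the element counts. This is a sketch; the `successor` helper uses the rule "adjoin the set to its own elements", and its name is mine.

```python
def successor(x):
    """Successor of a set x: x together with x itself as a new element."""
    return x | frozenset({x})

zero = frozenset()       # the empty set
one = successor(zero)    # {0}
two = successor(one)     # {0, 1}

assert one == frozenset({zero})
assert two == frozenset({zero, one})

# The set standing for n has exactly n elements.
assert [len(n) for n in (zero, one, two)] == [0, 1, 2]
```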
I wish the terms injection, surjection, and bijection made an appearance in the book, in place of one-to-one, onto, and one-to-one correspondence, respectively. I tripped over the latter terms a bit (“Wait a second… was that supposed to be an injection or a bijection?”).
Suppose, for instance, that $x$ is a function from a set $I$ to a set $X$. (The very choice of letters indicates that something strange is afoot.) An element of the domain $I$ is called an index, $I$ is called the index set, the range of the function is called an indexed set, the function itself is called a family, and the value of the function $x$ at an index $i$, called a term of the family, is denoted by $x_i$. (This terminology is not absolutely established, but it is one of the standard choices among related slight variants; in the sequel it and it alone will be used.) An unacceptable but generally accepted way of communicating the notation and indicating the emphasis is to speak of a family $\{x_i\}$ in $X$, or of a family $\{x_i\}$ of whatever the elements of $X$ may be; when necessary, the index set is indicated by some such parenthetical expression as $(i \in I)$ (p. 34).
The above “unacceptable” notation for a family did indeed trip me up a few times. It’s easily the worst piece of notation in the book. (You’d guess that a symbol like $\{x_i\}$ would equal the set containing only $x_i$, but this is not at all the case!)
10. Inverses and Composites
This chapter got me to actually think somewhat carefully about associativity:
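Let $f$ be a function from $X$ to $Y$, and $f^{-1}$ its inverse.
The algebraic behavior of $f^{-1}$ is unexceptionable. If $\{B_i\}$ is a family of subsets of $Y$, then $f^{-1}(\bigcup_i B_i) = \bigcup_i f^{-1}(B_i)$ and $f^{-1}(\bigcap_i B_i) = \bigcap_i f^{-1}(B_i)$.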
Associativity seemed unobjectionable and uninteresting in high-school algebra. What was missing then, I think, was an appreciation of how so many functions irreversibly throw away information about the object in question. Associativity means something like “no information about the object we’re discussing is discarded by these functions, so you may evaluate the functions in any order and always come to the same terminal conclusion.”
If you want the set of all people on the planet, it doesn’t matter whether you pool all the countries first into the UN, then count the UN’s population, or you first count every country’s individual population, then sum those population counts. Neither approach “changes the subject,” so you get to the same figure in the end.
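The chapter's claim that inverse images commute with unions and intersections can be spot-checked on a small finite example. The sketch below uses a function I picked for illustration, squaring on the integers from -3 to 3, and the helper names `preimage` and `image` are mine. It also shows that the forward image of a non-injective function does not behave as well with intersections, which is one concrete way that information gets thrown away.

```python
def f(x):
    """A non-injective function: it forgets the sign of x."""
    return x * x

def preimage(f, X, B):
    """All x in X whose value f(x) lands in B."""
    return frozenset(x for x in X if f(x) in B)

def image(f, A):
    """All values f(x) for x in A."""
    return frozenset(f(x) for x in A)

X = range(-3, 4)
family = [frozenset({0, 1}), frozenset({1, 4}), frozenset({1, 9})]

union = frozenset().union(*family)
intersection = frozenset.intersection(*family)
preimages = [preimage(f, X, B) for B in family]

# Inverse images commute with both unions and intersections.
assert preimage(f, X, union) == frozenset().union(*preimages)
assert preimage(f, X, intersection) == frozenset.intersection(*preimages)

# Forward images do not always commute with intersections.
A, B = frozenset({-1}), frozenset({1})
assert image(f, A & B) != image(f, A) & image(f, B)
```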
The successor $x^+$ of a set $x$ is the set of all the elements of $x$ together with $x$ itself: $x^+ = x \cup \{x\}$.
From what has been said so far it does not follow that the construction of successors can be carried out ad infinitum within one and the same set. What we need is a new set-theoretic principle.
Axiom of infinity. There exists a set containing $0$ and containing the successor of each of its elements.
We shall say, temporarily, that a set $A$ is a successor set if $0 \in A$ and if $x^+ \in A$ whenever $x \in A$. In this language the axiom of infinity simply says that there exists a successor set $A$. Since the intersection of every (non-empty) family of successor sets is a successor set itself (proof?), the intersection of all the successor sets included in $A$ is a successor set $\omega$. The set $\omega$ is a subset of every successor set. If, indeed, $B$ is an arbitrary successor set, then so is $A \cap B$. Since $A \cap B \subset A$, the set $A \cap B$ is one of the sets that entered into the definition of $\omega$; it follows that $\omega \subset A \cap B$, and, consequently, that $\omega \subset B$. The minimality property so established uniquely characterizes $\omega$; the axiom of extension guarantees that there can be only one successor set that is included in every other successor set. A natural number is, by definition, an element of the minimal successor set $\omega$. This definition of natural numbers is the rigorous counterpart of the intuitive description according to which they consist of $0$, $1$, $2$, $3$, “and so on” (p. 44).
Incidentally, there’s a lot of that pattern of construction in this book. First, establish that there exists some set with some property. Then, use the axiom of specification (or some other means) to pinpoint just its subset with the desired property (excluding potential misc. elements).
12. The Peano Axioms
Remember, a partial order on a set $X$ is a reflexive, transitive, and antisymmetric relation in $X$, while an equivalence relation on $X$ is a reflexive, transitive, and symmetric relation in $X$.
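For finite sets, these defining properties can be checked mechanically. The sketch below represents a relation as a set of Python tuples. The two example relations, divisibility and congruence mod 3 on the numbers 1 to 6, are my own choices for illustration.

```python
from itertools import product

def is_reflexive(R, X):
    return all((x, x) in R for x in X)

def is_symmetric(R):
    return all((y, x) in R for x, y in R)

def is_antisymmetric(R):
    return all(x == y for x, y in R if (y, x) in R)

def is_transitive(R):
    return all((x, w) in R for x, y in R for z, w in R if y == z)

X = range(1, 7)
divides = {(a, b) for a, b in product(X, X) if b % a == 0}
same_mod_3 = {(a, b) for a, b in product(X, X) if a % 3 == b % 3}

# Divisibility is a partial order on X but not an equivalence relation.
assert is_reflexive(divides, X) and is_transitive(divides)
assert is_antisymmetric(divides) and not is_symmetric(divides)

# Congruence mod 3 is an equivalence relation on X but not a partial order.
assert is_reflexive(same_mod_3, X) and is_transitive(same_mod_3)
assert is_symmetric(same_mod_3) and not is_antisymmetric(same_mod_3)
```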
15. The Axiom of Choice
Starting here, we turn our studies to infinite set theory.
We have seen that if a collection of sets we are choosing from is finite, then the possibility of simultaneous choice is an easy consequence of what we knew before the axiom of choice was even stated; the role of the axiom is to guarantee that possibility in infinite cases...
Every infinite set has a subset equivalent to $\omega$ … A set is infinite if and only if it is equivalent to a proper subset of itself (pp. 60-1).
16. Zorn’s Lemma
17. Well Ordering
18. Transfinite Recursion
19. Ordinal Numbers
The chief application of the axiom of substitution is in extending the process of counting far beyond the natural numbers. From the present point of view, the crucial property of a natural number is that it is a well ordered set such that the initial segment determined by each element is equal to that element… An ordinal number is defined as a well ordered set $\alpha$ such that $s(\xi) = \xi$ for all $\xi$ in $\alpha$; here $s(\xi)$ is, as before, the initial segment $\{\eta \in \alpha : \eta < \xi\}$.
An example of an ordinal that is not a natural number is the set $\omega$ consisting of all the natural numbers (p. 75).
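The defining property of an ordinal, that every initial segment equals the element determining it, can be checked for a small natural number built from `frozenset`s. This sketch assumes the order on such a number is membership, and it can only cover finite cases. The helper names are mine.

```python
def successor(x):
    """Successor of a set x: x together with x itself as a new element."""
    return x | frozenset({x})

def von_neumann(n):
    """The set standing for the natural number n."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

def initial_segment(alpha, xi):
    """All eta in alpha below xi, where 'below' is read as membership in xi."""
    return frozenset(eta for eta in alpha if eta in xi)

alpha = von_neumann(5)

# Every element of alpha coincides with the initial segment it determines.
assert all(initial_segment(alpha, xi) == xi for xi in alpha)
```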
Ordinal numbers are the rigorous take on the elementary-school game of counting higher than infinity.
Imagine $\omega$: all the naturals, ordered by $\leq$. Now imagine a second ordering of all the natural numbers: the same $\leq$, except for $0$, which is just defined to be higher in the relation than any other natural. You can see that there’s a similarity between the naturals beginning at $0$ in $\omega$ and the naturals beginning at $1$ in the second ordering. (There exist infinitely many bijections between the two underlying sets, of course, but that’s not enough to get a similarity here; we’re specifically after a monotone bijection.) Since $0$ is left out of the similarity, the ordinal number similar to the second ordering is larger than the ordinal number similar to $\omega$.
In ordinal numbers, we express this as $\omega < \omega + 1$.
(Please note that addition here isn’t quite addition in the familiar sense: $1 + \omega = \omega \neq \omega + 1$.)
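The reordering described above can be mimicked on a finite sample with a sort key. This is a sketch under my own assumptions: the natural moved to the top is taken to be 0, and `shifted_key` and `shift` are names invented for the example. A finite sample cannot prove anything about the infinite orderings, but it does show the shape of the monotone map.

```python
def shifted_key(n):
    """Sort key for the second ordering: 0 ranks above every other natural."""
    return (n == 0, n)

def shift(n):
    """Candidate similarity from the usual ordering into the second one."""
    return n + 1

sample = list(range(6))

# In the second ordering, 0 comes after everything else.
print(sorted(sample, key=shifted_key))

# shift preserves order: a < b in the usual sense implies
# shift(a) comes before shift(b) in the second ordering.
assert all(
    shifted_key(shift(a)) < shifted_key(shift(b))
    for a in sample for b in sample if a < b
)

# shift never hits 0, and everything it does hit lies below 0.
assert all(shifted_key(shift(n)) < shifted_key(0) for n in sample)
```

The usual naturals are thus matched with a proper initial segment of the second ordering, with 0 left over on top.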
20. Sets of Ordinal Numbers
21. Ordinal Arithmetic
22. The Schröder–Bernstein Theorem
Here’s the most beautiful proof in the book:
If $X$ and $Y$ are sets such that $X$ is equivalent to a subset of $Y$, we shall write $X \precsim Y$
…say that $Y$ dominates $X$...
Schröder–Bernstein Theorem. If $X \precsim Y$ and $Y \precsim X$, then $X \sim Y$.
Proof. Let $f$ be a one-to-one mapping from $X$ into $Y$ and let $g$ be a one-to-one mapping from $Y$ into $X$; the problem is to construct a one-to-one correspondence between $X$ and $Y$. It is convenient to assume that the sets $X$ and $Y$ have no elements in common; if that is not true, we can so easily make it true that the added assumption involves no loss of generality.
We shall say that an element $x$ in $X$ is the parent of the element $f(x)$ in $Y$ and, similarly, that an element $y$ in $Y$ is the parent of $g(y)$ in $X$. Each element $x$ of $X$ has an infinite sequence of descendants, namely, $f(x)$, $g(f(x))$, $f(g(f(x)))$, etc., and similarly, the descendants of an element $y$ of $Y$ are $g(y)$, $f(g(y))$, $g(f(g(y)))$, etc. This definition implies that each term in the sequence is a descendant of all the preceding terms; we shall also say that each term in the sequence is an ancestor of all following terms.
For each element (in either $X$ or $Y$) one of three things must happen. If we keep tracing the ancestry of the element back as far as possible, then either we ultimately come to an element of $X$ that has no parent (these orphans are exactly the elements of $X - g(Y)$), or we ultimately come to an element of $Y$ that has no parent ($Y - f(X)$), or the lineage regresses ad infinitum. Let $X_X$ be the set of those elements of $X$ that originate in $X$ (i.e., $X_X$ consists of the elements of $X - g(Y)$ together with all their descendants in $X$), let $X_Y$ be the set of those elements of $X$ that originate in $Y$ (i.e., $X_Y$ consists of all the descendants in $X$ of the elements of $Y - f(X)$), and let $X_\infty$ be the set of those elements of $X$ that have no parentless ancestor. Partition $Y$ similarly into the three sets $Y_X$, $Y_Y$, and $Y_\infty$.
If $x \in X_X$, then $f(x) \in Y_X$, and, in fact, the restriction of $f$ to $X_X$ is a one-to-one correspondence between $X_X$ and $Y_X$. If $x \in X_Y$, then $x$ belongs to the domain of the inverse function $g^{-1}$ and $g^{-1}(x) \in Y_Y$; in fact the restriction of $g^{-1}$ to $X_Y$ is a one-to-one correspondence between $X_Y$ and $Y_Y$. If, finally, $x \in X_\infty$, then $f(x) \in Y_\infty$, and the restriction of $f$ to $X_\infty$ is a one-to-one correspondence between $X_\infty$ and $Y_\infty$; alternatively, if $x \in X_\infty$, then $g^{-1}(x) \in Y_\infty$, and the restriction of $g^{-1}$ to $X_\infty$ is a one-to-one correspondence between $X_\infty$ and $Y_\infty$. By combining these three one-to-one correspondences, we obtain a one-to-one correspondence between $X$ and $Y$ (pp. 88-9).
For whatever subjective reason, I really aesthetically enjoy the exhaustion-by-sequence-tracing character of this proof!
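To watch the ancestry tracing run, here is a small Python sketch on a toy instance of my own choosing, not taken from the book. Both sets are treated as separate copies of the natural numbers, and both injections are "add one". In this instance every lineage ends at an orphan, so the case of an infinitely regressing lineage never arises and the loop always terminates. All function names are hypothetical.

```python
def f(x):
    """Injection from the first copy of the naturals into the second."""
    return x + 1

def g(y):
    """Injection from the second copy of the naturals into the first."""
    return y + 1

def f_inv(y):
    """Parent of y under f, or None if y is an orphan."""
    return y - 1 if y >= 1 else None

def g_inv(x):
    """Parent of x under g, or None if x is an orphan."""
    return x - 1 if x >= 1 else None

def originates_in_second(x):
    """Trace the ancestry of x (in the first copy) back to its orphan."""
    side, elem = "first", x
    while True:
        parent = g_inv(elem) if side == "first" else f_inv(elem)
        if parent is None:
            return side == "second"
        side = "second" if side == "first" else "first"
        elem = parent

def h(x):
    """The combined correspondence: f where lineages start in the first
    copy, the inverse of g where they start in the second."""
    return g_inv(x) if originates_in_second(x) else f(x)

print([h(x) for x in range(6)])
```

Here `h` swaps each even number with the odd number after it, so it is a one-to-one correspondence even though neither `f` nor `g` is onto.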
23. Countable Sets
24. Cardinal Arithmetic
25. Cardinal Numbers
Cardinals are least ordinals equivalent to some set $X$.
Note the connection to the discussion of bijections and similarities above, in §19: two well-ordered sets $X$ and $Y$ could be equivalent to the same cardinal number but similar to distinct ordinal numbers.
The symbol $0$ is defined to mean the empty set, $\emptyset$.
In the sense of there existing an equivalence relation in that set.
A well order on a set $X$ is (1) a partial order that (2) connects any two elements in $X$ (is total) and (3) has a least element for each non-empty subset of $X$.
Thus, partial orders are less demanding than total orders, which are in turn less demanding than well orders.
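An initial segment of a partially ordered set $X$ is a subset of $X$ of the form $s(a) = \{x \in X : x < a\}$ for some $a \in X$.
A similarity is a bijection $f$ between partially ordered sets $X$ and $Y$ such that when $a \leq b$, then $f(a) \leq f(b)$.
Note that the former expression is evaluated in $X$ and the latter in $Y$.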
Similarities between two well-ordered sets are unique (p. 72).
Two sets $E$ and $F$ are equivalent, $E \sim F$, when there exists a bijection between $E$ and $F$. If $E$ and $F$ are subsets of $X$, then equivalence in this sense is an equivalence relation in the power set $\mathcal{P}(X)$ (p. 52).
You can easily make overlapping sets $X$ and $Y$ disjoint by creating a bijection between each element $x \in X$ or $y \in Y$ and the pair $(x, 0)$ or $(y, 1)$, respectively.
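A quick sketch of this tagging trick in Python, with tuples standing in for ordered pairs. The tags 0 and 1 and the example sets are my own choices.

```python
X = {1, 2, 3}
Y = {3, 4}          # overlaps X in the element 3

# Tag each element with a marker recording which set it came from.
X_tagged = {(x, 0) for x in X}
Y_tagged = {(y, 1) for y in Y}

# The tagged copies share no elements, and each is in bijection
# with the set it was built from (same number of elements here).
assert X_tagged.isdisjoint(Y_tagged)
assert len(X_tagged) == len(X) and len(Y_tagged) == len(Y)
```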
(I haven’t read this specific book, but I have read other books about sets.)
For starters, what exactly does the word “naive” mean, in the context of Naive Set Theory? It seems that different authors use this word a little differently, but generally it means: “I allow myself to skip the technical details when they become boring, and also the difficult topics because this is a book for beginners”.
I think the class of sets whose existence can be proved from axioms is called “constructible universe”. This is the minimum that must exist. But the axioms are not saying that other sets do not exist.
So if you can imagine other sets, whose existence is neither provable from the axioms, nor does it contradict them… then such sets may or may not exist. More precisely speaking, now we have multiple possible meanings of the word “set”. (Of course, if you add some new set, you also need to add all those sets whose existence can be proved from the axioms and the existence of this set.) This is basically what it means that some statements are not decidable from the axioms: depending on what meaning you choose, such statements can be either true or false.
I guess a specific example would be useful here. We have a set of all natural numbers. Now suppose that God flips a coin for each of these numbers, and creates a set “G” of those where the coin came up heads. (There were infinitely many coin flips. The coin is fair, so there were infinitely many heads, infinitely many tails, and there is no finite text that could describe the results exactly.) Does the set “G” exist? Your intuition might say yes, but it is not provable from the axioms, because all proofs have finite length, and therefore cannot specify objects that do not have a finite definition.
Each natural number is represented by a set containing representations of all smaller numbers. This is convenient because it allows one to express “a < b” as “the representation of a is a proper subset of the representation of b”. And it makes the definition of a successor very simple. Furthermore, this also works for (some?) infinite sets; for example ω is the smallest ordinal greater than all natural numbers, and simultaneously the set of all natural numbers (i.e. the smallest set containing all of them). In other words, for certain sets, you can use “smaller” and “subset” as synonyms; prove one statement, get the other statement for free.
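Here is a small sketch that checks this on the first few numbers, using nested `frozenset`s as the representations. The helper name `successor` is mine, and the check only covers finitely many cases. On `frozenset`s, Python's `<` means "is a proper subset of".

```python
def successor(x):
    """Successor of a set x: x together with x itself as a new element."""
    return x | frozenset({x})

# naturals[n] is the set representing n, for n = 0..5.
naturals = [frozenset()]
for _ in range(5):
    naturals.append(successor(naturals[-1]))

for a, A in enumerate(naturals):
    for b, B in enumerate(naturals):
        # "a is smaller than b" matches both membership and proper inclusion.
        assert (a < b) == (A in B)
        assert (a < b) == (A < B)
```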
Generally, it happens many times that something is obvious for finite sets, but for infinite sets it might be tricky to prove, or not true at all. You could see an infinite chain of arguments where each step is a correct proof and yet it does not help you, because proofs cannot be infinitely long.
Example, as an intuition pump: Imagine that there is a function that receives a set of natural numbers, and returns a natural number. We know that f(empty set) = 42, and f(any set with one element) = 42, and also for each two sets A and B if f(A) = 42 and f(B) = 42 then f(A union B) = 42. At this moment you may be tempted to say “okay, obviously the function returns a constant 42 no matter what”. But actually, all we know is that it returns 42 for finite sets; for infinite sets it could be anything.