As it happens, I am currently in “somebody would have noticed” territory. About a week ago I abruptly switched to believing that Russell’s paradox doesn’t actually prove anything, and that good old naive set theory with a “set of all sets” can be made to work without contradictions. (It does seem to require a weird notion of equality for self-referring sets instead of the usual extensionality, but not much more.) Sorry to say, my math education hasn’t yet helped me snap out of crackpot mode, so if anybody here could help me I’d much appreciate it.
I am seeing substantial amounts of both sense and nonsense in this thread. I suggest that anyone who wants to talk about set theory first learn what it is.
I assume you’re talking about Peter Aczel’s antifoundation axiom (because you mentioned bisimulation); that doesn’t allow a set of all sets (barring inconsistencies, and that particular system can’t be inconsistent unless ordinary set theory is). The same applies to other similar systems. Russell’s paradox isn’t dependent on foundation in any way; as long as you have a set of all sets and the ability to take subsets specified by properties, you get Russell’s paradox.
Edit: Since people seem to be asking about how this works in general, I should just point you all to Aczel’s book on this and other antifoundational set theories, which you can find at http://standish.stanford.edu/pdf/00000056.pdf
as long as you have a set of all sets and the ability to take subsets specified by properties, you get Russell’s paradox.
Yes, that’s true. What I have in mind is restricting the latter ability a bit, by the minimum amount required to get rid of paradoxes. Except if you squint at it the right way, it won’t even look like a restriction :-)
I will use the words “set” and “predicate” interchangeably. A predicate is a function that returns True or False. (Of course it doesn’t have to be Turing-computable or anything.) It’s pretty clear that some predicates exist, e.g. the predicate that always returns False (the empty set) or the one that always returns True (the set of all sets). This seems like a tiny change of terminology, but to me it seems enough to banish Russell’s paradox!
Let’s see how it works. We try to define the Russell predicate R thusly:
R(X) = not(X(X))
...and fail. This definition is incomplete. The value of R isn’t defined on all predicates, because we haven’t specified R(R) and can’t compute it from the definition. If we additionally specify R(R) to be True or False, the paradox goes away.
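Transcribed directly into code, the incompleteness shows up as non-termination (a Python sketch, treating predicates as functions on predicates):

```python
import sys

def R(X):
    # Naive transcription of R(X) = not(X(X))
    return not X(X)

always_false = lambda X: False   # the empty set
print(R(always_false))           # True: the empty set is not a member of itself

# But R(R) just expands to not(R(R)), then not(not(R(R))), and so on forever.
sys.setrecursionlimit(100)
try:
    R(R)
except RecursionError:
    print("R(R) never bottoms out under the naive definition")
```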
To make this a little more precise: I think naive set theory can be made to work by disallowing predicates, like the Russell predicate, that are “incompletely defined” in the above sense. In this new theory we will have “AFA-like” non-well-founded sets (e.g. the Quine atom Q={Q}), and so we will need to define equality through bisimilarity. And that’s pretty much all.
As you can see, this is really basic stuff. There’s got to be some big idiotic mistake in my thinking—some kind of contradiction in this new notion of “set”—but I haven’t found it yet.
EDITED on May 13 2010: I’ve found a contradiction. You can safely disregard my theory.
Yes, that’s true. What I have in mind is restricting the latter ability a bit, by the minimum amount required to get rid of paradoxes.
Well, others have had this same idea. The standard example of a set theory built along those lines is Quine’s “New Foundations” or “NF”.
Now, Russell’s paradox arises when we try to work within a set theory that allows ‘unrestricted class comprehension’. That means that for any predicate P expressed in the language of set theory, there exists a set whose elements are all and only the sets with property P, which we denote {x : P(x)}.
In ZF we restrict class comprehension by only assuming the existence of things of the form {x in Y : P(x)} and {f(x) : x in Y} (these correspond respectively to the Axiom of Separation and the Axiom of Replacement).
On the other hand, in NF we grant existence to anything of the form { x : P(x) } as long as P is what’s called a “stratified” predicate. To say a predicate is stratified is to say that one can assign integer-valued “levels” to the variables in such a way that for any subexpression of the form “x is in y” y’s level has to be one greater than x’s level.
Then clearly the predicate “P(x) iff x is in x” fails to be stratified (because x’s level can’t be one greater than itself). However, the predicate “P(x) iff x = x” is obviously stratified, and {x : x = x} is the set of all sets.
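A toy version of the stratification check (a sketch over atomic “x in y” / “x = y” constraints only, not full NF formulas): assign integer levels and propagate the level(y) = level(x) + 1 requirement, failing on any inconsistency.

```python
from collections import deque

def stratified(constraints):
    """Check whether levels can be assigned so that every ("x", "in", "y")
    constraint gives level(y) = level(x) + 1 (and "=" gives equal levels)."""
    # Each constraint becomes a pair of directed edges carrying level offsets.
    edges = []
    for a, op, b in constraints:
        diff = 1 if op == "in" else 0   # "in" raises the level by one
        edges.append((a, b, diff))
        edges.append((b, a, -diff))
    variables = {v for a, _, b in constraints for v in (a, b)}
    levels = {}
    for start in variables:
        if start in levels:
            continue
        levels[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for a, b, d in edges:
                if a != u:
                    continue
                want = levels[u] + d
                if b not in levels:
                    levels[b] = want
                    queue.append(b)
                elif levels[b] != want:
                    return False   # inconsistent levels: not stratifiable
    return True

print(stratified([("x", "in", "y")]))   # True
print(stratified([("x", "in", "x")]))   # False: needs level(x) = level(x) + 1
print(stratified([("x", "=", "x")]))    # True: {x : x = x}, the set of all sets
```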
I know New Foundations, but stratification is too strong a restriction for my needs. This weird set theory of mine actually arose from a practical application—modeling “metastrategies” in the Prisoner’s Dilemma. See this thread on decision-theory-workshop.
Let’s see how it works. We try to define the Russell predicate R thusly:
R(X) = not(X(X))
...and fail. This definition is incomplete. The value of R isn’t defined on all predicates, because we haven’t specified R(R) and can’t compute it from the definition. If we additionally specify R(R) to be True or False, the paradox goes away.
How is it that the paradox “goes away”? If you “additionally specify R(R) to be True or False”, don’t you just go down one or the other of the two cases in Russell’s paradox?
Suppose we decide to specify that R(R) is true. Then, by your definition, not(R(R)) is true. That means that R(R) is false, contrary to our specification. Similarly, if we instead specify that R(R) is false, we are led to conclude that R(R) is true, again contradicting our specification.
The conclusion is that we can’t specify any truth value for R(R). Either truth value leads to a contradiction, so R(R) must be left undefined. Is that what you mean to say?
Suppose we decide to specify that R(R) is true. Then, by your definition not(R(R)) is true.
No, in this case R(X) = not(X(X)) for all X distinct from R, and additionally R(R) is true. This is a perfectly fine, completely defined, non-self-contradictory predicate.
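Written out, the repaired predicate looks like this (a sketch; picking True for R(R) is the arbitrary stipulation, and False would do just as well):

```python
def R(X):
    if X is R:
        return True        # R(R) stipulated to be True (False would also work)
    return not X(X)        # the usual clause, for every X distinct from R

always_true = lambda X: True     # the set of all sets
always_false = lambda X: False   # the empty set

print(R(always_true))    # False: the set of all sets IS a member of itself
print(R(always_false))   # True: the empty set is not a member of itself
print(R(R))              # True by stipulation; no contradiction arises
```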
Okay, I see. I see nothing obviously contradictory with this.
From a technical standpoint, the hard part would be to give a useful criterion for when a seemingly well-formed string does or does not completely define a predicate. The string not(X(X)) seems to be well-formed, but you’re saying that actually it’s just a fragment of a predicate: you need to add “for X not equal to this predicate”, and then give an additional clause about whether this predicate satisfies itself, to have a completely defined predicate.
I guess that this was the sort of work that was done in these non-foundational systems that people are talking about.
I guess that this was the sort of work that was done in these non-foundational systems that people are talking about.
No, AFA and similar systems are different. They have no “set of all sets” and still make you construct sets up from their parts, but they give you more parts to play with: e.g. you can explicitly convert a directed graph with cycles into a set that contains itself.
I didn’t mean that what you propose to do is commensurate with those systems. I just meant that those systems might have addressed the technical issue that I pointed out, but it’s not yet clear to me how you address this issue.
I can’t say anything about this specific construction, but there is a related issue with Turing machines. The issue was whether you could determine a useful subset S of the set of all Turing machines, such that the halting problem is solvable for all machines in S, and S is general enough to contain useful examples.
If I remember correctly, the answer was that you couldn’t. This feels a lot like that: I’d bet that the only way of being sure that we can avoid Russell’s paradox is to restrict predicates to such a narrow category that we can’t do much of anything useful with them.
I think you are going to run into serious problems. Consider the predicate that always returns true. Then, if I’m following, Russell’s original formulation of the paradox, which involves the powerset of the set of all sets, will still lead to a contradiction.
Original form of Russell’s paradox: Let A be the set of all sets and let P(A) be the powerset of A. By Cantor, |P(A)| > |A|. But P(A) is a subset of A, so |P(A)| ≤ |A|. That’s a contradiction.
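The Cantor half of this argument is mechanically checkable for a small finite set: for every map f: S → P(S), the diagonal set D = {x : x not in f(x)} is missed by f. (A Python sketch; the contradiction appears only when you additionally force P(A) to sit inside A.)

```python
from itertools import product

def powerset(S):
    S = sorted(S)
    return [frozenset(x for x, keep in zip(S, bits) if keep)
            for bits in product([0, 1], repeat=len(S))]

S = {0, 1}
P = powerset(S)

# For every map f: S -> P(S), the diagonal set D = {x : x not in f(x)}
# is not in the image of f -- the engine of both Cantor and Russell.
for images in product(P, repeat=len(S)):
    f = dict(zip(sorted(S), images))
    D = frozenset(x for x in S if x not in f[x])
    assert all(f[x] != D for x in S)

print("checked all", len(P) ** len(S), "maps; none hits its own diagonal set")
```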
Cantor’s theorem breaks down in my system when applied to the set of all sets, because its proof essentially relies on Russell’s paradox to reach the contradiction.
Hmm, that almost seems to be cutting off the nose to spite the cliche. Cantor’s construction is a very natural construction. A set theory where you can’t prove that would be seen by many as unacceptably weak. I’m a bit fuzzy on the details of your system, but let me ask, can you prove in this system that there’s any uncountable set at all? For example, can we prove |R| > |N| ?
Yes. The proof that |R| > |N| stays working because predicates over N aren’t themselves members of N, so the issue of “complete definedness” doesn’t come up.
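The diagonal argument in question, sketched in code: disagree with the n-th sequence at position n. (Only a finite fragment of a hypothetical enumeration, of course, but note the f_n are predicates over N and are never applied to themselves.)

```python
fs = [
    lambda n: 0,       # f_0: all zeros
    lambda n: 1,       # f_1: all ones
    lambda n: n % 2,   # f_2: alternating
]

def d(n):
    # The diagonal sequence: disagree with f_n at position n.
    return 1 - fs[n](n)

for n in range(len(fs)):
    assert d(n) != fs[n](n)
print("the diagonal escapes every listed sequence")
```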
Hmm, this may work then and not kill off too much of set theory. You may want to talk to a professional set theorist or logician about this (my specialty is number theory, so all I can do is glance at this and say that it looks plausible). The only remaining issue then is that I’m not sure this is inherently better than standard set theory. In particular, this approach seems much more counterintuitive than ZFC. But that may be due to the fact that I’m more used to working with ZF-like objects.
Ok, I’ve read up on Cantor’s theorem now, and I think the trick is in the types of A and P(A), and the solution to the paradox is to borrow a trick from type theory. A is defined as the set of all sets, so the obvious question is, sets of what key type? Let that key type be t. Then
A: t=>bool
P(A): (t=>bool)=>bool
We defined P(A) to be in A, so a t=>bool is also a t. Let all other possible types for t be T. t=(t=>bool)+T. Now, one common way to deal with recursive types like this is to treat them as the limit of a sequence of types:
t[i] = t[i-1]=>bool + T
A[i]: t[i]=>bool
P(A[i]) = A[i+1]
Then when we take the limit,
t = lim i->inf t[i]
A = lim i->inf A[i]
P(A) = lim i->inf P(A[i])
Then suddenly, paradoxes based on the cardinality of A and P(A) go away, because those cardinalities diverge!
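The divergence claim itself is easy to illustrate: each step of the t[i] sequence exponentiates the previous cardinality. (A sketch taking |T| = 1; whether this actually dissolves the paradox is exactly what’s under dispute here, the code only shows the finite-stage cardinalities blow up.)

```python
# |t[i+1]| = 2**|t[i]| + |T|, taking |T| = 1 for illustration.
size_T = 1
size_t = 1
sizes = []
for i in range(4):
    sizes.append(size_t)
    size_t = 2 ** size_t + size_T

print(sizes)   # [1, 3, 9, 513] -- and the next term is 2**513 + 1
```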
I’m not sure I know enough about type theory to evaluate this, although I do know that Russell’s original attempts to repair the defect involved type theory (Principia Mathematica uses a form of type theory, though in that form one still can’t form the set of all sets). I don’t think the above works, but I don’t quite see what’s wrong with it. Maybe Sniffnoy or someone else more versed in these matters can comment.
I don’t know anything about type theory; when I wrote that I heard it has philosophical problems when applied to set theory, I meant I heard that from you. What the problems might actually be was my own guess...
I’m not deeply familiar with set theory, but cousin_it’s formulation looks valid to me. Isn’t the powerset of the set of all sets just the set of all sets of sets? (Or equivalently, the predicate X=>Y=>Z=>true.) How would you use that to reconstruct the paradox in a way that couldn’t be resolved in the same way?
The powerset of the set of all sets may or may not be the set of all sets (it depends on whether or not you accept atoms in your version of set theory). However, Cantor’s theorem shows that for any set B, the power set of B has cardinality strictly larger than B. So if B=P(B) you’ve got a problem.
(It does seem to require a weird notion of equality for self-referring sets instead of the usual extensionality, but not much more.)
If you are talking about things that are set-like, except that they don’t satisfy the extensionality axiom, then you just aren’t talking about sets. The things you’re talking about may be set-like in some respect, but they aren’t sets.
There are other set-like things that don’t satisfy extensionality. For example, two different properties or predicates might have the same extension.
To be clear—Aczel’s ZFA and similar systems do satisfy extensionality; they’d hardly be set theories if they didn’t. It’s just that when you have sets A and B such that A={A} and B={B}, you’re going to need stronger tools than extensionality to determine whether they are equal.
Interesting. I’m not familiar with Aczel’s system. But is that what cousin_it is talking about doing? That looks like an adjustment to Foundation rather than to Extensionality.
It’s both at once. (Though, as I said, you don’t throw out extensionality. Actually, that raises an interesting question—could you discard extensionality as an axiom, and just derive it from AFA? I hadn’t considered that possibility. Edit: You probably could, there’s no obvious reason why you couldn’t, but I honestly don’t feel like checking the details...)
If you just throw out foundation without putting in anything to replace it, you have the possibility of ill-founded sets, but no way to actually construct any. But the thing is, if all you do is say “Non-well-founded sets exist!” without giving any way to actually work with them, then, well, that’s not very helpful either. Hence any antifoundational replacement for foundation is going to have to strengthen extensionality if you want the result to be something you want to work with at all.
I think what you mean to say is “non-well-founded sets exist!”, since you are talking about the antifoundational case (and even with strong anti-foundation axioms I still have well-founded sets to play with also).
How do you mean bisimulation in this case? This seems to be a reduction down to decidable predicates, e.g. a Turing machine for each set. Without a type theory, many obvious algorithms will fail to converge.
I’d like to hear more about this. It doesn’t sound necessarily crackpottish to me to come up with an alternate set theory: von Neumann and Godel did. How do you avoid contradictions?
Wait, how is NBG set theory relevant to this? NBG is just a conservative extension of ZFC, and only allows something resembling a set of all sets by insisting that this collection is not, in fact, a set. Which, after all, it has to do in order to avoid Russell’s paradox.
Well I mean, I imagine it shouldn’t be too hard to take ZFA (or similar) and tack proper classes onto it. Logic is not really my thing so I’m not actually familiar with how you show that NBG conservatively extends ZFC. The result would be a bit odd, though, in that classes would act very differently from sets—well, OK, more differently than they already do in NBG...
I don’t know the proof either. The other weird thing to note is that even though NBG is a conservative extension of ZFC, some proofs in NBG are much shorter than proofs in ZFC. So in some sense it is only weakly conservative. I don’t know if that notion can be made at all more precise.
Edit: Followup thought, most interesting conservative extensions are only weakly conservative in some sense. Consider for example finite degree field extensions of Q. If axiomatized these become conservative extensions of Z. (That’s essentially why for example we can prove something in the Gaussian integers and know there’s a proof in Z).
Isn’t “the set of all sets” (SAS) ill-defined? Suppose we consider it to be for some set A (maybe the set of all atoms) the infinite regression of power sets SAS = P(P(P(P....(A)))...)
In which case SAS = P(SAS) by Cantor-like arguments?
Russell’s paradox, as usually stated, doesn’t actually prove anything, because it’s usually given as a statement in English about set theory.
I don’t know whether Russell originally stated it in mathematical terms, in which case it would prove something. I’ve read numerous accounts of it, yet never seen a mathematical presentation. Google fails me at present.
I don’t count a statement of the form “x such that x is not a member of x” as mathematical, because my intuition doesn’t want me to talk about sets being members of themselves unless I have a mathematical formalism for sets and set membership for which that works. It’s also not happy about the quantification of x in that sentence; it’s a recursive quantification. Let’s put it this way: Any computer program I have ever written to handle quantification would crash or loop forever if you tried to give it such a statement.
By the way, quick history of Russell’s paradox and related matters, with possible application to the original topic. :)
Russell first pointed out his namesake paradox in Frege’s attempt to axiomatize set theory. So yes, it was a mathematical statement, and, really, it’s pretty simple to state it. Why nobody noticed this paradox before then, I have no idea, but it does seem worth noting that nobody noticed it until someone actually sat down and attempted to formalize set theory.
However, Russell was not the first to notice a paradox in naïve set theory. (Not sure to what extent you can talk about paradoxes in something that hasn’t been formalized, but it’s pretty clear what’s meant, I think.) Cesare Burali-Forti noticed earlier that considering the set of all ordinals leads to a paradox. And yet, despite this, people still continued using naïve set theory until Russell! Part of this may have been that, IIRC, Burali-Forti was convinced that what he found could not actually be a paradox, even though, well, in math, a paradox is always a paradox unless you can knock out one of the suppositions. I have to wonder if perhaps his reaction on finding this may have been along the lines of ”...but somebody would have noticed”. :)
And note also that even Russell’s paradox was not originally phrased in this way. His original phrasing, as I understand it, rested on taking the set of all sets A and then looking at the cardinality of that set’s powerset P(A). Then we have |P(A)| > |A|, but P(A) ⊆ A, so |P(A)| ≤ |A|, which is a contradiction. This is essentially the same as Russell’s paradox when one expands out the details (particularly, the details in the proof that in general a set has cardinality strictly less than its powerset).
I don’t count a statement of the form “x such that x is not a member of x” as mathematical, because my intuition doesn’t want me to talk about sets being members of themselves unless I have a mathematical formalism for sets and set membership for which that works.
The problem is, how do you exclude it from working? If you’re just working in first-order logic and you’ve got a “membership” predicate, of course it’s a valid sentence. Russell and Whitehead tried to exclude it from working with a theory of types, but I hear that has philosophical problems. (I admit to not having read their actual axioms or justification for such. I imagine the problem is basically as follows: it’s easy enough to be clear about what you mean for finite types, but how do you specify transfinite types in a way that isn’t circular?) The modern solution with ZFC is not to bar such statements; with the axiom of foundation, such statements are perfectly sensible, they’re just false. Replace it with an antifoundational axiom and they won’t always be false. However, in either case (or without picking a case at all) Russell’s paradox still holds; allowing there to be sets that are members of themselves does not allow there to be a set of all such sets. That is always paradoxical.
It’s also not happy about the quantification of x in that sentence; it’s a recursive quantification. Let’s put it this way: Any computer program I have ever written to handle quantification would crash or loop forever if you tried to give it such a statement.
It’s not recursive unless you’re already working from a framework where there are objects and sets of objects and sets of etc. In the framework of first-order logic, there are just sets, period. Quantification is over the entire set-theoretic universe. No recursion, just universality. In ZFC sets can indeed be classified into this hierarchy, but that’s a consequence of the axiom of foundation, not a feature of the logic.
I prefer to not have either foundation or an anti-foundational axiom. (Foundation generally leads to a more intuitive universe with sets sort of being like boxes but anti-foundational axioms lead to more interesting systems).
I’m also confused by cousin_it’s claim. I don’t see how bisimulation helps one deal with Russell’s paradox, but I’d be interested in seeing a sketch of an attempt. As I understand it, if you try to use a notion of bisimilarity rather than extensionality and apply Russell’s paradox, you end up with essentially a set that isn’t bisimilar to itself. Which is bad.
It’s not recursive unless you’re already working from a framework where there are objects and sets of objects and sets of etc.
I’m pretty sure it’s recursive under any framework. It asks about “x (now a free, unquantified variable) such that x (still a completely free variable) is not a member of x (whoops!)”.
You need to first fill in the value of the third x—making x no longer free—before you can find the instantiations of x. Thus, x needs to be both bound and free at the same time. It makes no sense.
The value of x is being determined by the predication not(member(x,x)); and it’s at the same time being determined by some definition of the set x; and you need to fix both of these things in order to evaluate the expression. You have 2 degrees of freedom, and only one variable. It can’t be done; the expression can’t be unambiguously evaluated. It’s a mathematical version of whack-a-mole.
OK, so what you’re saying is that any predicate that uses a variable in multiple places is recursive. That seems a very unusual notion of “recursive”. Such situations are very common and not at all problematic. What if you’re working with digraphs, with the possibility of self-loops? You can’t handle talking about which nodes do or don’t have edges pointing to themselves? That’s fundamentally all that’s going on here. Not being able to handle that, not being able to look at diagonal conditions, seems pretty broken. Not to mention that any predicate that is a conjunction or disjunction may well include the same variable in multiple places!
It may be possible to make what he is doing more precise or more usable. If he’s objecting to impredicativity then maybe’s he be ok with ZFC or NBG but not KM? (I know that KM is proper with respect to NBG but don’t know much else about it. I’m not sure how relevant this is.)
I’m pretty sure it’s recursive under any framework. It asks about “x (now a free, unquantified variable) such that x (still a completely free variable) is not a member of x (whoops!)”.
Your assertions about when x is free aren’t right. The variable x is being bound in its first appearance, and it is already bound (and so not “completely free”) in its second and third appearance.
And, to restate what Sniffnoy is saying in terms of Russell’s own example: Are you really not allowed to talk about all the barbers who shave themselves?
Your assertions about when x is free aren’t right. The variable x is being bound in its first appearance, and it is already bound (and so not “completely free”) in its second and third appearance.
The third appearance of x assumes that x is already bound, so that you can evaluate whether something is a member of it. But when processing the second appearance of x, you are testing candidate values of x, to find out whether they satisfy the predicate not(member(X,X)). Within that predicate, you assume that the second argument is already bound, and look for instantiations of the first argument for which the predicate is true.
To evaluate this, you would have to write it as
Find X1, X2, X3 such that not(member(X1,X2)), equals(X1,X2), member(X2, X3), equals(X2, X3). (Finding a binding asserts that the set of all sets that are not members of themselves is a member of itself.)
But if you try to evaluate this, it’s not going to run into any paradoxes; it’s going to depend on how you define the predicate “not”.
You can see a problem in writing it down if you think about how, computationally, it would be evaluated. If you start by asking for the set X,Y such that not(member(X,Y)), that’s not a computable set; it keeps generating members forever. I don’t think there’s any way to write Russell’s paradox in Prolog such that it would be possible to compute an answer.
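The “keeps generating members forever” behavior can be seen concretely: enumerate hereditarily finite sets and run Russell’s query against them; every single one is an answer, so an exhaustive query never finishes. (A Python sketch; frozensets can never contain themselves, so X not in X always holds here.)

```python
from itertools import combinations, islice

def hf_sets():
    """Enumerate hereditarily finite sets, smallest first."""
    known = [frozenset()]
    yield frozenset()
    while True:
        new = []
        for r in range(1, len(known) + 1):
            for combo in combinations(known, r):
                s = frozenset(combo)
                if s not in known and s not in new:
                    new.append(s)
                    yield s
        known += new

# Russell's query: every hereditarily finite set satisfies X not in X,
# so the answer stream never runs dry -- and never terminates.
answers = (X for X in hf_sets() if X not in X)
print([len(X) for X in islice(answers, 5)])   # five answers of infinitely many
```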
Your assertions about when x is free aren’t right. The variable x is being bound in its first appearance, and it is already bound (and so not “completely free”) in its second and third appearance.
The third appearance of x assumes that x is already bound, so that you can evaluate whether something is a member of it. But when processing the second appearance of x, you are testing candidate values of x, to find out whether they satisfy the predicate not(member(X,X)). Within that predicate, you assume that the second argument is already bound, and look for instantiations of the first argument for which the predicate is true.
You are not using the word “bound” in the way that it is used in formal logic. The two occurrences of ‘X’ in not(member(X,X)) are either both bound or both free. You cannot have one lying within the scope of a quantifier while the other is not.
You aren’t disagreeing with me! I’m just additionally pointing out that the semantics of member(X,Y) require either X or Y to be bound in order for the statement to be unified, and for the entire quantified statement to be meaningful. To find X that satisfy Russell’s paradox, you would have to violate the rule you just stated.
To find X that satisfy Russell’s paradox, you would have to violate the rule you just stated.
Do you mean that I can’t verify that a set is not an element of itself? But the empty set, for example, is not an element of itself. I can confirm this by noting that no thing is an element of the empty set (which is a finite check). Hence, the empty set, in particular, is not an element of the empty set. Therefore, by setting X = the empty set, I have an X such that X is not an element of X.
Note that in the above use of the predicate not(member(X,X)), both occurrences of ‘X’ were bound. So I’m not sure what you mean when you say that I “would have to violate the rule [I] just stated.”
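In code, with Python’s frozenset standing in for a (necessarily well-founded) finite set:

```python
empty = frozenset()

# No thing is an element of the empty set (a zero-step finite check) ...
assert len(empty) == 0
# ... so in particular the empty set is not an element of the empty set.
assert empty not in empty
print("the empty set is not an element of itself")
```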
Oops, you’re right. I was thinking of cases where you are trying to generate answers to a query, which is different from testing one candidate answer.
Let me start over:
My suspicion, which I haven’t proved, is that Russell’s Paradox is a result of specifying a logic, and not specifying the algorithms used to evaluate statements in that logic or to answer queries. In any complete specification of a logic, including both the rules and the computational steps taken to evaluate or query-answer, you will probably find that you can prove what the outcome of the Russell’s paradox query is.
In other words, “formal logic” is too sloppy; my intuition is that in Russell’s paradox, it hides problems with order of execution.
Not really. In most standard axiomatic descriptions of set theory, like say ZFC, one doesn’t need to think at all about computability of set relations. This is a good thing. We want for example the Busy Beaver function to be well-defined. And we want to be able to talk about things like the behavior of Turing machines when connected to non-computable oracles. If we insist that our set theoretic relationships be computable this sort of thing becomes very hard to talk about.
Huh? The Busy Beaver function is all about computability. The function itself is not computable; but you know that only because you’ve spelled out how the computation works, and can show that there is an answer but that it is not computable.
ZFC was designed to avoid Russell’s Paradox; it does this by getting closer to a computable description of set theory. The definition itself is actually computational: a set of axioms of implication is just a set of rewrite rules. It is ambiguous only in that it fails to specify which rewrite rule applies first.
I have no problem with using non-computable oracles. But the way Turing machines interact with those oracles is computationally explicit. The view I’m expressing suggests you could run into troubles if you posed an oracle a question that hid ambiguities that would be resolved by a computational specification of the oracle. But I’ve never seen any use of an oracle that did that. People don’t use oracles for Russell’s paradox-like problems; they typically use them for non-computable functions.
How are you defining computable? Naively I would think it would be something like describable by a Turing machine that outputs the result after finite time. But Busy Beaver won’t do that. There may be definitional issues at work here.
You aren’t disagreeing with me! I’m just additionally pointing out that the semantics of member(X,Y) require either X or Y to be bound in order for the statement to be unified, and for the entire quantified statement to be meaningful.
I’m not sure how you’re using these words. In formal logic, boundedness has nothing to do with semantics. Boundedness is a purely syntactic notion. And I don’t know what “unified” means in this context.
The “member” predicate is not a predicate like “shaves”; it is a primitive predicate, like “and”, with special semantics. If you want to generate the set of X for which member(X,Y) is true, you need Y to be bound. To generate the set of Y for which member(X,Y) is true, you need X to be bound. That’s what I meant.
Asking for the set of all X, Y for which member(X,Y) is true is going to get stuck in an infinite loop without ever generating any results. There is some connection between Russell’s paradox and queries that generate a useless infinite loop in a query language. You can look at a statement of the paradox and feel that it leads to two contradictory proofs; but if you define set theory and the algorithms used to answer queries, I think (but am not sure) you will find that you can prove what the algorithm will produce when run.
Thanks for the pointer. I was saying that my guess is that Russell’s paradox reduces to something like the Kleene-Rosser paradox, which is resolved when you spell out the order of execution; and that we shouldn’t worry about paradoxes that are resolved when you spell out the order of execution, because a “formal logic” without the details for performing computations spelled out is not sufficiently formal.
The “member” predicate is not a predicate like “shaves”; it is a primitive predicate, like “and”, with special semantics. If you want to generate the set of X for which member(X,Y) is true, you need Y to be bound. To generate the set of Y for which member(X,Y) is true, you need X to be bound. That’s what I meant.
That’s not a problem, because, like I said, both arguments of member(X,X) are bound in Russell’s paradox. They are both bound by the phrase “The set of all X such that. . .” that precedes them.
Asking for the set of all X, Y for which member(X,Y) is true is going to get stuck in an infinite loop without ever generating any results.
Again, I need to know how you are encoding sets before I can answer any question about what an algorithm would do on given input.
but if you define set theory and the algorithms used to answer queries, I think (but am not sure) you will find that you can prove what the algorithm will produce when run.
My talk of “primitive predicates” makes sense only if you’re talking about a computational implementation.
In a truly formal logic, you do determine the definitions of the predicates purely by the axioms they are found in, including a set of axioms that are production rules, which therefore spell out execution (although order of execution is still usually ambiguous). ZF set theory is like that.
But I’ve never seen a presentation of Russell’s paradox which defined what “not” and “member-of” mean in an axiomatic way. I don’t think that had even been done for set theory at the time Russell proposed his paradox. The presentations I’ve seen always rely on your intuition about what “not” and “member of” mean.
The predicate shaves(X,X) is very different from the predicate member-of(X,X).
To find values of X that satisfy the first, you simply enumerate all of your facts about
shaves(al, fred)
shaves(al, joe)
shaves(al, al)
and look for one that unifies with shaves(X,X).
You aren’t going to find the X that satisfy member(X,X) by unification. It isn’t even ever true for finite sets, except by convention for the empty set.
It isn’t even ever true for finite sets, except by convention for the empty set.
I think that you’re confusing the element-of relation with the subset-of relation. Or something. But then, all sets are subsets of themselves, including finite ones, so I’m not sure what you were thinking.
At any rate, the empty set is not an element of itself according to any convention that I’ve ever seen.
You aren’t going to find the X that satisfy member(X,X) by unification.
I’m not sure how to respond to this until you answer my question in this comment.
Russell’s paradox, as usually stated, doesn’t actually prove anything, because it’s usually given as a statement in English about set theory.
It is presented that way to make the point that naive set theory isn’t workable.
I don’t know whether Russell originally stated it in mathematical terms, in which case it would prove something. I’ve read numerous accounts of it, yet never seen a mathematical presentation. Google fails me at present.
It is presented rigorously in most intro set theory textbooks. In ZFC or any other standard axiomatization of set theory, Russell’s paradox ceases to be a paradox; the same logic instead yields a theorem of the form “For any set A, there exists a set B such that B is not an element of A.”
I don’t count a statement of the form “x such that x is not a member of x” as mathematical, because my intuition doesn’t want me to talk about sets being members of themselves unless I have a mathematical formalism for sets and set membership for which that works. It’s also not happy about the quantification of x in that sentence; it’s a recursive quantification. Let’s put it this way: Any computer program I have ever written to handle quantification would crash or loop forever if you tried to give it such a statement.
Well, a standard formalism (again such as ZFC) is perfectly happy talking about sets that recur on themselves this way. Indeed, it is actually difficult to make a system of set theory that doesn’t allow you to at least talk about sets like this.
I’m curious, do you consider Cantor’s diagonalization argument to be too recursive? What about Turing’s Halting Theorem?
Let’s put it this way: Any computer program I have ever written to handle quantification would crash or loop forever if you tried to give it such a statement.
What encoding scheme would you use to encode arbitrary, possibly infinite, sets in a computer?
As it happens, I am currently in “somebody would have noticed” territory. About a week ago I abruptly switched to believing that Russell’s paradox doesn’t actually prove anything, and that good old naive set theory with a “set of all sets” can be made to work without contradictions. (It does seem to require a weird notion of equality for self-referring sets instead of the usual extensionality, but not much more.) Sorry to say, my math education hasn’t yet helped me snap out of crackpot mode, so if anybody here could help me I’d much appreciate it.
I am seeing substantial amounts of both sense and nonsense in this thread. I suggest that anyone who wants to talk about set theory first learn what it is.
The Wikipedia article is somewhat wordy (i.e. made of words, rather than mathematics), and Mathworld is unusably fragmented. The Stanford Encyclopedia is good, but for anyone seriously interested I would suggest a book such as Devlin’s “The Joy of Sets”.
I assume you’re talking about Peter Aczel’s antifoundation axiom (because you mentioned bisimulation); that doesn’t allow a set of all sets (barring inconsistencies, and that particular system can’t be inconsistent unless ordinary set theory is). The same applies to other similar systems. Russell’s paradox isn’t dependent on foundation in any way; as long as you have a set of all sets and the ability to take subsets specified by properties, you get Russell’s paradox.
Edit: Since people seem to be asking about how this works in general, I should just point you all to Aczel’s book on this and other antifoundational set theories, which you can find at http://standish.stanford.edu/pdf/00000056.pdf
as long as you have a set of all sets and the ability to take subsets specified by properties, you get Russell’s paradox.
Yes, that’s true. What I have in mind is restricting the latter ability a bit, by the minimum amount required to get rid of paradoxes. Except if you squint at it the right way, it won’t even look like a restriction :-)
I will use the words “set” and “predicate” interchangeably. A predicate is a function that returns True or False. (Of course it doesn’t have to be Turing-computable or anything.) It’s pretty clear that some predicates exist, e.g. the predicate that always returns False (the empty set) or the one that always returns True (the set of all sets). This seems like a tiny change of terminology, but to me it seems enough to banish Russell’s paradox!
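As a computational caricature of this terminology (my own illustration; the function names are made up, and real predicates need not be computable), sets become boolean-valued functions:

```python
# Sets modeled as predicates: a "set" is a function from things to bool.
# Purely illustrative: actual predicates need not be Turing-computable.

empty_set = lambda x: False        # the predicate that always returns False
set_of_all_sets = lambda x: True   # the predicate that always returns True

def singleton(a):
    """The set whose only member is a."""
    return lambda x: x == a

assert empty_set(empty_set) is False
assert set_of_all_sets(empty_set) is True
assert set_of_all_sets(set_of_all_sets) is True  # the universal set is a member of itself
assert singleton(5)(5) is True and singleton(5)(6) is False
```

Membership “x ∈ S” is just S(x), so the empty set and the universal set plainly exist in this picture.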
Let’s see how it works. We try to define the Russell predicate R thusly:
R(X) = not(X(X))
...and fail. This definition is incomplete. The value of R isn’t defined on all predicates, because we haven’t specified R(R) and can’t compute it from the definition. If we additionally specify R(R) to be True or False, the paradox goes away.
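The incompleteness has a direct computational reading (my analogy, not the poster’s: predicates as Python functions). Translated literally, the definition gives no way to finish computing R(R):

```python
def R(X):
    """Naive Russell predicate: R(X) = not(X(X))."""
    return not X(X)

assert R(lambda x: False) is True   # the empty set is not a member of itself

# But R(R) = not R(R) = not not R(R) = ... never bottoms out:
try:
    R(R)
    diverged = False
except RecursionError:
    diverged = True
assert diverged
```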
To make this a little more precise: I think naive set theory can be made to work by disallowing predicates, like the Russell predicate, that are “incompletely defined” in the above sense. In this new theory we will have “AFA-like” non-well-founded sets (e.g. the Quine atom Q={Q}), and so we will need to define equality through bisimilarity. And that’s pretty much all.
As you can see, this is really basic stuff. There’s got to be some big idiotic mistake in my thinking—some kind of contradiction in this new notion of “set”—but I haven’t found it yet.
EDITED on May 13 2010: I’ve found a contradiction. You can safely disregard my theory.
Well, others have had this same idea. The standard example of a set theory built along those lines is Quine’s “New Foundations” or “NF”.
Now, Russell’s paradox arises when we try to work within a set theory that allows ‘unrestricted class comprehension’. That means that for any predicate P expressed in the language of set theory, there exists a set whose elements are all and only the sets with property P, which we denote {x : P(x) }
In ZF we restrict class comprehension by only assuming the existence of things of the form { x in Y : P(x)} and { f(x) : x in Y } (these correspond respectively to the Axiom of Separation and the Axiom of Replacement ).
On the other hand, in NF we grant existence to anything of the form { x : P(x) } as long as P is what’s called a “stratified” predicate. To say a predicate is stratified is to say that one can assign integer-valued “levels” to the variables in such a way that for any subexpression of the form “x is in y”, y’s level is one greater than x’s level.
Then clearly the predicate “P(x) iff x is in x” fails to be stratified (because x’s level can’t be one greater than itself). However, the predicate “P(x) iff x = x” is obviously stratified, and {x : x = x} is the set of all sets.
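A toy version of the stratification check (my own sketch, with the simplifying assumption that every occurrence of a variable gets the same level): collect the constraints level(y) = level(x) + 1 from each “x is in y” atom and test whether they are jointly satisfiable:

```python
from collections import defaultdict

def stratified(membership_pairs):
    """Check whether the variables can be assigned integer levels so that
    for each pair (x, y), read as "x is in y", level(y) == level(x) + 1."""
    graph = defaultdict(list)
    for x, y in membership_pairs:
        graph[x].append((y, 1))    # level(y) = level(x) + 1
        graph[y].append((x, -1))   # level(x) = level(y) - 1
    level = {}
    for start in list(graph):
        if start in level:
            continue
        level[start] = 0           # each connected component is anchored at 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v, d in graph[u]:
                want = level[u] + d
                if v not in level:
                    level[v] = want
                    stack.append(v)
                elif level[v] != want:
                    return False   # inconsistent level constraints
    return True

assert stratified([]) is True                       # "x = x": no membership atoms
assert stratified([("x", "x")]) is False            # "x in x" cannot be stratified
assert stratified([("x", "y"), ("y", "z")]) is True
```

On the two predicates above it gives the expected answers: “x in x” fails, while “x = x” (no membership atoms at all) passes.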
I know New Foundations, but stratification is too strong a restriction for my needs. This weird set theory of mine actually arose from a practical application—modeling “metastrategies” in the Prisoner’s Dilemma. See this thread on decision-theory-workshop.
How is it that the paradox “goes away”? If you “additionally specify R(R) to be True or False”, don’t you just go down one or the other of the two cases in Russell’s paradox?
Suppose we decide to specify that R(R) is true. Then, by your definition, not(R(R)) is true. That means that R(R) is false, contrary to our specification. Similarly, if we instead specify that R(R) is false, we are led to conclude that R(R) is true, again contradicting our specification.
The conclusion is that we can’t specify any truth value for R(R). Either truth value leads to a contradiction, so R(R) must be left undefined. Is that what you mean to say?
No, in this case R(X) = not(X(X)) for all X distinct from R, and additionally R(R) is true. This is a perfectly fine, completely defined, non-self-contradictory predicate.
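In the predicates-as-functions picture (again my own analogy), the completed predicate is unremarkable: it agrees with not(X(X)) everywhere except at itself, where its value is stipulated:

```python
def R(X):
    """Russell-like predicate, completed by stipulating R(R) = True."""
    if X is R:
        return True          # the extra clause: R(R) is simply specified
    return not X(X)

assert R(R) is True                   # no contradiction; this was stipulated
assert R(lambda x: False) is True     # the empty set is not a member of itself
assert R(lambda x: True) is False     # the universal set is a member of itself
```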
Why is R(X) = not(X(X)) only for R =/= X? In Russell’s version, X should vary over all predicates/sets, meaning when instance X with R, we get
R(R) = ¬R(R)
as per the paradox.
Not sure what your objection is. I introduced the notion of “incompletely defined predicate” to do away with Russell’s version of the predicate.
Okay, I see. I see nothing obviously contradictory with this.
From a technical standpoint, the hard part would be to give a useful criterion for when a seemingly-well-formed string does or does not completely define a predicate. The string not(X(X)) seems to be well-formed, but you’re saying that actually it’s just a fragment of a predicate, because you need to add “for X not equal to this predicate”, and then give an additional clause about whether this predicate satisfies itself, to have a completely-defined predicate.
I guess that this was the sort of work that was done in these non-foundational systems that people are talking about.
No, AFA and similar systems are different. They have no “set of all sets” and still make you construct sets up from their parts, but they give you more parts to play with: e.g. explicitly convert a directed graph with cycles into a set that contains itself.
I didn’t mean that what you propose to do is commensurate with those systems. I just meant that those systems might have addressed the technical issue that I pointed out, but it’s not yet clear to me how you address this issue.
I can’t say anything about this specific construction, but there is a related issue in Turing machines. The issue was whether you could determine a useful subset S of the set of all Turing machines, such that the halting problem is solveable for all machines in S, and S was general enough to contain useful examples.
If I remember correctly, the answer was that you couldn’t. This feels a lot like that—I’d bet that the only way of being sure that we can avoid Russel’s paradox is to restrict predicates to such a narrow category that we can’t do much anything useful with them.
I think you are going to run into serious problems. Consider the predicate that always returns true. Then, if I’m following, Russell’s original formulation of the paradox, involving the powerset of the set of all sets, will still lead to a contradiction.
I can’t seem to work out for myself what you mean. Can you spell it out in more detail?
Original form of Russell’s paradox: Let A be the set of all sets and let P(A) be the powerset of A. By Cantor, |P(A)| > |A|. But, P(A) is a subset of A, so |P(A)|<=|A|. That’s a contradiction.
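For reference, the Cantor step being invoked is the diagonal argument: given any map f from A to P(A), the set D = {x ∈ A : x ∉ f(x)} differs from every f(x), so f cannot be onto. A finite sketch (names are illustrative):

```python
def diagonal(A, f):
    """The Cantor diagonal set D = {x in A : x not in f(x)}."""
    return {x for x in A if x not in f(x)}

A = {0, 1, 2}
f = {0: {0}, 1: set(), 2: {0, 1, 2}}   # some attempted map A -> P(A)
D = diagonal(A, lambda x: f[x])

assert D == {1}
assert all(f[x] != D for x in A)       # D is missed by every f(x)
```

Russell’s set is exactly the diagonal set for the identity map on a universal set, which is why the two paradoxes are so closely related.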
Cantor’s theorem breaks down in my system when applied to the set of all sets, because its proof essentially relies on Russell’s paradox to reach the contradiction.
Hmm, that almost seems to be cutting off the nose to spite the cliche. Cantor’s construction is a very natural construction. A set theory where you can’t prove that would be seen by many as unacceptably weak. I’m a bit fuzzy on the details of your system, but let me ask, can you prove in this system that there’s any uncountable set at all? For example, can we prove |R| > |N| ?
Yes. The proof that |R| > |N| stays working because predicates over N aren’t themselves members of N, so the issue of “complete definedness” doesn’t come up.
Hmm, this may work then and not kill off too much of set theory. You may want to talk to a professional set theorist or logician about this (my specialty is number theory, so all I can do is glance at this and say that it looks plausible). The only remaining issue then becomes that I’m not sure that this is inherently better than standard set theory. In particular, this approach seems much more counterintuitive than ZFC. But that may be due to the fact that I’m more used to working with ZF-like objects.
The original form of Russell’s (Zermelo’s in fact) paradox is not this. The original form is {x|x not member of x}.
That leads to both “x is a member of x” and “x is not a member of x”. And that is the original form of the paradox.
No. See for example this discussion. The form you give, where it is described as a simple predicate recursion, was not the original form of the paradox.
Ok, I’ve read up on Cantor’s theorem now, and I think the trick is in the types of A and P(A), and the solution to the paradox is to borrow a trick from type theory. A is defined as the set of all sets, so the obvious question is, sets of what key type? Let that key type be t. Then
We defined P(A) to be in A, so a t=>bool is also a t. Let all other possible types for t be T, so t = (t=>bool) + T. Now, one common way to deal with recursive types like this is to treat them as the limit of a sequence of types: t_0 = T, t_{n+1} = (t_n=>bool) + T.
Then when we take the limit, t = lim t_n satisfies t = (t=>bool) + T.
Then suddenly, paradoxes based on the cardinality of A and P(A) go away, because those cardinalities diverge!
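To spell out the divergence claim under one simplifying assumption (mine): if |T| were finite, the approximating types would have sizes |t_0| = |T| and |t_{n+1}| = |t_n=>bool| + |T| = 2^|t_n| + |T|, which grow without bound:

```python
k = 1                      # |T|, assumed finite purely for illustration
sizes = [k]                # |t_0| = |T|
for _ in range(4):
    sizes.append(2 ** sizes[-1] + k)   # |t_{n+1}| = 2**|t_n| + |T|

assert sizes[:4] == [1, 3, 9, 513]     # the sizes have no finite limit
```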
I’m not sure I know enough about type theory to evaluate this. Although I do know that Russell’s original attempts to repair the defect involved type theory (Principia Mathematica uses a form of type theory however in that form one still can’t form the set of all sets). I don’t think the above works but I don’t quite see what’s wrong with it. Maybe Sniffnoy or someone else more versed in these matters can comment.
I don’t know anything about type theory; when I wrote that I heard it has philosophical problems when applied to set theory, I meant I heard that from you. What the problems might actually be was my own guess...
Huh. Did I say that? I know almost nothing about type theory. When did I say that?
I’m not deeply familiar with set theory, but cousin_it’s formulation looks valid to me. Isn’t the powerset of the set of all sets just the set of all sets of sets? (Or equivalently, the predicate X=>Y=>Z=>true.) How would you use that to reconstruct the paradox in a way that couldn’t be resolved in the same way?
The powerset of the set of all sets may or may not be the set of all sets (it depends on whether or not you accept atoms in your version of set theory). However, Cantor’s theorem shows that for any set B, the power set of B has cardinality strictly larger than B. So if B=P(B) you’ve got a problem.
If you are talking about things that are set-like, except that they don’t satisfy the extensionality axiom, then you just aren’t talking about sets. The things you’re talking about may be set-like in some respect, but they aren’t sets.
There are other set-like things that don’t satisfy extensionality. For example, two different properties or predicates might have the same extension.
To be clear—Aczel’s ZFA and similar systems do satisfy extensionality; they’d hardly be set theories if they didn’t. It’s just that when you have sets A and B such that A={A} and B={B}, you’re going to need stronger tools than extensionality to determine whether they are equal.
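A naive sketch of such a tool (mine, not Aczel’s actual machinery): draw each set as a graph of membership arrows and check that two nodes can be matched child-for-child, coinductively assuming the pair currently under test:

```python
def bisimilar(a, b, children, assumed=frozenset()):
    """Naive coinductive check that nodes a, b of a membership graph
    (node -> list of child nodes) depict the same set."""
    if (a, b) in assumed:
        return True                     # assume the pair we are already checking
    assumed = assumed | {(a, b)}
    ca, cb = children[a], children[b]
    # every child of a must match some child of b, and vice versa
    return (all(any(bisimilar(x, y, children, assumed) for y in cb) for x in ca)
            and all(any(bisimilar(x, y, children, assumed) for x in ca) for y in cb))

g = {"A": ["A"], "B": ["B"], "empty": [], "one": ["empty"]}
assert bisimilar("A", "B", g)           # A = {A} and B = {B} come out equal
assert not bisimilar("A", "empty", g)   # but neither equals the empty set
assert not bisimilar("empty", "one", g)
```

Here A = {A} and B = {B} come out equal, which is exactly the case that extensionality alone cannot settle.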
Interesting. I’m not familiar with Aczel’s system. But is that what cousin_it is talking about doing? That looks like an adjustment to Foundation rather than to Extensionality.
It’s both at once. (Though, as I said, you don’t throw out extensionality. Actually, that raises an interesting question—could you discard extensionality as an axiom, and just derive it from AFA? I hadn’t considered that possibility. Edit: You probably could, there’s no obvious reason why you couldn’t, but I honestly don’t feel like checking the details...)
If you just throw out foundation without putting in anything to replace it, you have the possibility of ill-founded sets, but no way to actually construct any. But the thing is, if all you do is say “Non-well-founded sets exist!” without giving any way to actually work with them, then, well, that’s not very helpful either. Hence any antifoundational replacement for foundation is going to have to strengthen extensionality if you want the result to be something you want to work with at all.
I think what you mean to say is “non-well-founded sets exist!”, since you are talking about the antifoundational case (and even with strong anti-foundation axioms I still have well-founded sets to play with also).
Oops. Fixed.
How do you mean bisimulation in this case? This seems to be a reduction down to decidable predicates, e.g. a Turing machine for each set. Without a type theory, many obvious algorithms will fail to converge.
I’d like to hear more about this. It doesn’t sound necessarily crackpottish to me to come up with an alternate set theory: von Neumann and Godel did. How do you avoid contradictions?
Wait, how is NBG set theory relevant to this? NBG is just a conservative extension of ZFC, and only allows something resembling a set of all sets by insisting that this collection is not, in fact, a set. Which after all, it has to in order to avoid Russell’s paradox.
Yes, and I’m guessing cousin_it’s version of set theory is possibly equivalent to something similar. I’d love to hear more about it.
Well I mean, I imagine it shouldn’t be too hard to take ZFA (or similar) and tack proper classes onto it. Logic is not really my thing so I’m not actually familiar with how you show that NBG conservatively extends ZFC. The result would be a bit odd, though, in that classes would act very differently from sets—well, OK, more differently than they already do in NBG...
I don’t know the proof either. The other weird thing to note is that even though NBG is a conservative extension of ZFC, some proofs in NBG are much shorter than proofs in ZFC. So in some sense it is only weakly conservative. I don’t know if that notion can be made at all more precise.
Edit: Followup thought, most interesting conservative extensions are only weakly conservative in some sense. Consider for example finite degree field extensions of Q. If axiomatized these become conservative extensions of Z. (That’s essentially why for example we can prove something in the Gaussian integers and know there’s a proof in Z).
Isn’t “the set of all sets” (SAS) ill-defined? Suppose we consider it to be, for some set A (maybe the set of all atoms), the infinite regression of power sets SAS = P(P(P(...P(A)...)))
In which case SAS = P(SAS) by Cantor-like arguments?
And Russell’s paradox goes away?
So, is the set of all sets that aren’t members of themselves, a member of itself, or not?
Every set is also a subset of itself, by definition.
From Wikipedia “By definition a set z is a subset of a set x, if and only if every element of z is also an element of x”
Insufficient data to answer your question :-) See my reply to Sniffnoy.
Russell’s paradox, as usually stated, doesn’t actually prove anything, because it’s usually given as a statement in English about set theory.
I don’t know whether Russell originally stated it in mathematical terms, in which case it would prove something. I’ve read numerous accounts of it, yet never seen a mathematical presentation. Google fails me at present.
I don’t count a statement of the form “x such that x is not a member of x” as mathematical, because my intuition doesn’t want me to talk about sets being members of themselves unless I have a mathematical formalism for sets and set membership for which that works. It’s also not happy about the quantification of x in that sentence; it’s a recursive quantification. Let’s put it this way: Any computer program I have ever written to handle quantification would crash or loop forever if you tried to give it such a statement.
By the way, quick history of Russell’s paradox and related matters, with possible application to the original topic. :)
Russell first pointed out his namesake paradox in Frege’s attempt to axiomatize set theory. So yes, it was a mathematical statement, and, really, it’s pretty simple to state it. Why nobody noticed this paradox before then, I have no idea, but it does seem worth noting that nobody noticed it until someone actually sat down and attempted to formalize set theory.
However, Russell was not the first to notice a paradox in naïve set theory. (Not sure to what extent you can talk about paradoxes in something that hasn’t been formalized, but it’s pretty clear what’s meant, I think.) Cesare Burali-Forti noticed earlier that considering the set of all ordinals leads to a paradox. And yet, despite this, people still continued using naïve set theory until Russell! Part of this may have been that, IIRC, Burali-Forti was convinced that what he found could not actually be a paradox, even though, well, in math, a paradox is always a paradox unless you can knock out one of the suppositions. I have to wonder if perhaps his reaction on finding this may have been along the lines of ”...but somebody would have noticed”. :)
And note also that even Russell’s paradox was not phrased originally in this way. His original phrasing as I understand it rested on taking the set of all sets A and then looking at the cardinality of that set’s powerset P(A). Then we have |P(A)| > |A|, but P(A) ⊆ A, so |P(A)| ≤ |A|, which is a contradiction. This is essentially the same as Russell’s paradox when one expands out the details (particularly, the details in the proof that in general a set has cardinality strictly less than that of its powerset).
Ah, good point. I’d forgotten about that part. IIRC he first noted that and then expanded out the details to see where the problem was.
The problem is, how do you exclude it from working? If you’re just working in first-order logic and you’ve got a “membership” predicate, of course it’s a valid sentence. Russell and Whitehead tried to exclude it from working with a theory of types, but I hear that has philosophical problems. (I admit to not having read their actual axioms or justification for such. I imagine the problem is basically as follows—it’s easy enough to be clear about what you mean for finite types, but how do you specify transfinite types in a way that isn’t circular?) The modern solution with ZFC is not to bar such statements; with the axiom of foundation, such statements are perfectly sensible, they’re just false. Replace it with an antifoundational axiom and they won’t always be false. However, in either case—or without picking a case at all—Russell’s paradox still holds; allowing there to be sets that are members of themselves does not allow there to be a set of all such sets. That is always paradoxical.
It’s not recursive unless you’re already working from a framework where there are objects and sets of objects and sets of etc. In the framework of first-order logic, there are just sets, period. Quantification is over the entire set-theoretic universe. No recursion, just universality. In ZFC sets can indeed be classified into this hierarchy, but that’s a consequence of the axiom of foundation, not a feature of the logic.
I prefer to not have either foundation or an anti-foundational axiom. (Foundation generally leads to a more intuitive universe with sets sort of being like boxes but anti-foundational axioms lead to more interesting systems).
I’m also confused by cousin it’s claim. I don’t see how bisimulation helps one deal with Russell’s paradox but I’d be interested in seeing a sketch of an attempt. As I understand it, if you try to use a notion of bisimilarity rather than extensionality and apply Russell’s Paradox, you end up with essentially a set that isn’t bisimilar to itself. Which is bad.
I’m pretty sure it’s recursive under any framework. It asks about “x (now a free, unquantified variable) such that x (still a completely free variable) is not a member of x (whoops!)”.
You need to first fill in the value of the third x—making x no longer free—before you can find the instantiations of x. Thus, x needs to be both bound and free at the same time. It makes no sense.
The value of x is being determined by the predication not(member(x,x)); and it’s at the same time being determined by some definition of the set x; and you need to fix both of these things in order to evaluate the expression. You have 2 degrees of freedom, and only one variable. It can’t be done; the expression can’t be unambiguously evaluated. It’s a mathematical version of whack-a-mole.
OK, so what you’re saying is that any predicate that uses a variable in multiple places is recursive. That seems a very unusual notion of “recursive”. Such situations are very common and not at all problematic. What if you’re working with digraphs, with the possibility of self-loops? You can’t handle talking about which nodes do or don’t have edges pointing to themselves? That’s fundamentally all that’s going on here. Not being able to handle that, not being able to look at diagonal conditions, seems pretty broken. Not to mention that any predicate that is a conjunction or disjunction may well include the same variable in multiple places!
It may be possible to make what he is doing more precise or more usable. If he’s objecting to impredicativity then maybe he’d be OK with ZFC or NBG but not KM? (I know that KM is proper with respect to NBG but don’t know much else about it. I’m not sure how relevant this is.)
Your assertions about when x is free aren’t right. The variable x is being bound in its first appearance, and it is already bound (and so not “completely free”) in its second and third appearance.
And, to restate what Sniffnoy is saying in terms of Russell’s own example: Are you really not allowed to talk about all the barbers who shave themselves?
The third appearance of x assumes that x is already bound, so that you can evaluate whether something is a member of it. But when processing the second appearance of x, you are testing candidate values of x, to find out whether they satisfy the predicate not(member(X,X)). Within that predicate, you assume that the second argument is already bound, and look for instantiations of the first argument for which the predicate is true.
To evaluate this, you would have to write it as
Find X1, X2, X3 such that not(member(X1,X2)), equals(X1,X2), member(X2, X3), equals(X2, X3). (Finding a binding asserts that the set of all sets that are not members of themselves is a member of itself.)
But if you try to evaluate this, it’s not going to run into any paradoxes; it’s going to depend on how you define the predicate “not”.
You can see a problem in writing it down if you think about how, computationally, it would be evaluated. If you start by asking for the set X,Y such that not(member(X,Y)), that’s not a computable set; it keeps generating members forever. I don’t think there’s any way to write Russell’s paradox in Prolog such that it would be possible to compute an answer.
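The non-termination is easy to exhibit in miniature (my sketch, restricted to hereditarily finite sets encoded as frozensets): the enumeration of candidate sets never finishes, and no X with member(X,X) ever turns up in it:

```python
from itertools import islice, combinations

def hereditarily_finite():
    """Enumerate hereditarily finite sets (as frozensets), rank by rank."""
    seen = [frozenset()]
    yield frozenset()
    while True:                          # never terminates: each new rank is
        new = []                         # drawn from the powerset of what we've seen
        for r in range(len(seen) + 1):
            for combo in combinations(seen, r):
                s = frozenset(combo)
                if s not in seen and s not in new:
                    new.append(s)
                    yield s
        seen += new

# The stream is endless, and no set in it is a member of itself.
assert all(x not in x for x in islice(hereditarily_finite(), 50))
assert len(set(islice(hereditarily_finite(), 50))) == 50
```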
You are not using the word “bound” in the way that it is used in formal logic. The two occurrences of ‘X’ in not(member(X,X)) are either both bound or both free. You cannot have one lying within the scope of a quantifier while the other is not.
You aren’t disagreeing with me! I’m just additionally pointing out that the semantics of member(X,Y) require either X or Y to be bound in order for the statement to be unified, and for the entire quantified statement to be meaningful. To find X that satisfy Russell’s paradox, you would have to violate the rule you just stated.
Do you mean that I can’t verify that a set is not an element of itself? But the empty set, for example, is not an element of itself. I can confirm this by noting that no thing is an element of the empty set (which is a finite check). Hence, the empty set, in particular, is not an element of the empty set. Therefore, by setting X = the empty set, I have an X such that X is not an element of X.
Note that in the above use of the predicate not(member(X,X)), both occurrences of ‘X’ were bound. So I’m not sure what you mean when you say that I “would have to violate the rule [I] just stated.”
Oops, you’re right. I was thinking of cases where you are trying to generate answers to a query, which is different from testing one candidate answer.
Let me start over:
My suspicion, which I haven’t proved, is that Russell’s Paradox is a result of specifying a logic, and not specifying the algorithms used to evaluate statements in that logic or to answer queries. In any complete specification of a logic, including both the rules and the computational steps taken to evaluate or query-answer, you will probably find that you can prove what the outcome of the Russell’s paradox query is.
In other words, “formal logic” is too sloppy; my intuition is that in Russell’s paradox, it hides problems with order of execution.
Not really. In most standard axiomatic descriptions of set theory, like say ZFC, one doesn’t need to think at all about computability of set relations. This is a good thing. We want for example the Busy Beaver function to be well-defined. And we want to be able to talk about things like the behavior of Turing machines when connected to non-computable oracles. If we insist that our set theoretic relationships be computable this sort of thing becomes very hard to talk about.
Huh? The Busy Beaver function is all about computability. The function itself is not computable; but you know that only because you’ve spelled out how the computation works, and can show that there is an answer but that it is not computable.
ZFC was designed to avoid Russell’s Paradox; it does this by getting closer to a computable description of set theory. The definition itself is actually computational; a set of implication axioms is just a set of rewrite rules. It is ambiguous only in that it fails to specify which rewrite rule applies first.
I have no problem with using non-computable oracles. But the way Turing machines interact with those oracles is computationally explicit. The view I’m expressing suggests you could run into troubles if you posed an oracle a question that hid ambiguities that would be resolved by a computational specification of the oracle. But I’ve never seen any use of an oracle that did that. People don’t use oracles for Russell’s paradox-like problems; they typically use them for non-computable functions.
How are you defining computable? Naively I would think it would be something like describable by a Turing machine that outputs the result after finite time. But Busy Beaver won’t do that. There may be definitional issues at work here.
The busy beaver function isn’t computable; but it is posed within a well-defined computational model.
Ok. So when do you consider something to be within an acceptable computational model or not?
I’m not sure how you’re using these words. In formal logic, boundedness has nothing to do with semantics. Boundedness is a purely syntactic notion. And I don’t know what “unified” means in this context.
The “member” predicate is not a predicate like “shaves”; it is a primitive predicate, like “and”, with special semantics. If you want to generate the set of X for which member(X,Y) is true, you need Y to be bound. To generate the set of Y for which member(X,Y) is true, you need X to be bound. That’s what I meant.
Asking for the set of all pairs X, Y for which member(X,Y) is true is going to get stuck in an infinite loop without ever generating any results. There is some connection between Russell’s paradox, and queries that generate a useless infinite loop in a query language. You can look at a statement of the paradox and feel that it leads to two contradictory proofs; but if you define set theory and the algorithms used to answer queries, I think (but am not sure) you will find that you can prove what the algorithm will produce when run.
It looks like you’re using a computational model to understand Russell’s paradox. Should you be thinking of Kleene-Rosser?
Thanks for the pointer. I was saying that my guess is that Russell’s paradox reduces to something like the Kleene-Rosser paradox, which is resolved when you spell out the order of execution; and that we shouldn’t worry about paradoxes that are resolved when you spell out the order of execution, because a “formal logic” without the details for performing computations spelled out is not sufficiently formal.
That’s not a problem, because, like I said, both arguments of member(X,X) are bound in Russell’s paradox. They are both bound by the phrase “The set of all X such that…” that precedes them.
Again, I need to know how you are encoding sets before I can answer any question about what an algorithm would do on given input.
One wouldn’t want all queries within set theory to be decidable by some algorithm. If that were the case, then set theory would not be able to capture enough of arithmetic.
I’m aware of that. This particular paradox, though, is one where the question seems to be decidable both ways, not one where it’s undecidable.
My talk of “primitive predicates” makes sense only if you’re talking about a computational implementation.
In a truly formal logic, the meaning of the predicates is determined purely by the axioms they appear in, including a set of axioms that serve as production rules, which therefore spell out execution (although the order of execution is still usually ambiguous). ZF set theory is like that.
But I’ve never seen a presentation of Russell’s paradox which defined what “not” and “member-of” mean in an axiomatic way. I don’t think that had even been done for set theory at the time Russell proposed his paradox. The presentations I’ve seen always rely on your intuition about what “not” and “member of” mean.
The predicate shaves(X,X) is very different from the predicate member-of(X,X). To find values of X that satisfy the first, you simply enumerate all of your facts about shaving:

shaves(al, fred)
shaves(al, joe)
shaves(al, al)

and look for one that unifies with shaves(X,X).
You aren’t going to find the X that satisfy member(X,X) by unification. It isn’t even ever true for finite sets, except by convention for the empty set.
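For contrast, the fact-scanning case really is just a finite loop. A minimal Python sketch (the shaves facts are the ones from the example above; the helper name self_shavers is my own):

```python
# Finding X with shaves(X, X) is a scan over finitely many stored
# facts, looking for one whose two arguments unify.
facts = [("al", "fred"), ("al", "joe"), ("al", "al")]

def self_shavers(facts):
    """Return every X such that shaves(X, X) matches a stored fact."""
    return [x for (x, y) in facts if x == y]

print(self_shavers(facts))   # -> ['al']
```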
I think that you’re confusing the element-of relation with the subset-of relation. Or something. But then, all sets are subsets of themselves, including finite ones, so I’m not sure what you were thinking.
At any rate, the empty set is not an element of itself according to any convention that I’ve ever seen.
I’m not sure how to respond to this until you answer my question in this comment.
You’re right; I was mis-remembering. It can’t be, or 1 := 0 ∪ {0} wouldn’t work.
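For reference, a small sketch of the von Neumann encoding behind that definition, using Python frozensets (my own illustration): 0 is the empty set, n+1 := n ∪ {n}, and no ordinal built this way is a member of itself.

```python
# Von Neumann ordinals as frozensets: 0 = {}, n+1 = n ∪ {n}.
zero = frozenset()
one = zero | {zero}   # 1 := 0 ∪ {0}
two = one | {one}     # 2 := 1 ∪ {1} = {0, 1}

print(zero in one)    # True:  0 ∈ 1
print(one in one)     # False: 1 ∉ 1, as the convention requires
```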
It is presented that way to make a point that naive set theory isn’t workable.
It is presented rigorously in most intro set theory textbooks. In ZFC, or any other standard axiomatization of set theory, Russell’s paradox ceases to be a paradox; the same reasoning instead proves a theorem of the form “For any set A, there exists a set B such that B is not an element of A.”
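Spelled out, with the witness B given by the axiom schema of separation:

```latex
\forall A \,\exists B \;(B \notin A),
\qquad B = \{\, x \in A : x \notin x \,\}.
```

If B were an element of A, separation would give B ∈ B ↔ B ∉ B, a contradiction; so B ∉ A. The same move that was paradoxical for a universal set becomes a harmless theorem when the comprehension is relativized to A.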
Well, a standard formalism (again such as ZFC) is perfectly happy talking about sets that recur on themselves this way. Indeed, it is actually difficult to make a system of set theory that doesn’t allow you to at least talk about sets like this.
I’m curious, do you consider Cantor’s diagonalization argument to be too recursive? What about Turing’s Halting Theorem?
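For what it’s worth, the diagonal construction itself is a perfectly terminating program. A sketch with binary sequences encoded as functions (the example list is my own, not from the thread):

```python
# Cantor's diagonal construction: given any enumeration of binary
# sequences (n -> sequence, sequence: i -> 0/1), the flipped diagonal
# differs from the n-th sequence at position n, so it is not on the list.
def diagonal(seqs):
    return lambda n: 1 - seqs(n)(n)

# Example enumeration: the n-th sequence is constantly n % 2.
seqs = lambda n: (lambda i: n % 2)
d = diagonal(seqs)
print([d(n) for n in range(4)])   # -> [1, 0, 1, 0]
```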
What encoding scheme would you use to encode arbitrary, possibly infinite, sets in a computer?
I could, worst case, use the encoding scheme you use to write them down on paper when you prove things about them.
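One such paper-style encoding, sketched in Python (the evens example is mine): represent a possibly infinite set intensionally, by its membership predicate, rather than extensionally by listing its elements — which is also how the “set = predicate” identification earlier in the thread would have it.

```python
# An infinite set encoded as its characteristic function: membership
# queries are decidable even though the extension can't be listed.
def evens(n):
    return isinstance(n, int) and n % 2 == 0

print(evens(4), evens(7))   # -> True False
```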