I don’t know about category theory, but identity can’t be defined in first-order logic, so it is usually added as a primitive logical predicate to first-order logic. Adding infinite axiom schemas of the sort you showed above don’t qualify as a definition, which has to be a finite biconditional. But identity is definable in pure second-order logic. Using Leibniz’s law, one can define
x=y:↔∀X(X(x)↔X(y))
Intuitively, x and y are defined identical iff they have all their properties in common. Formally, there is always a subset of the domain in the range of X and Y which for any object o contains only o. This guarantees that the definition cannot accidentally identify different objects.
For instance, are the symbols “1” = “2″ in first order logic? Depends on the axioms!
Yes and that is because in predicate logic, different names (constant symbols) are allowed to refer to the same object or to different objects, while the same names always refer to the same object. This is similar to natural language where one thing can have multiple names. Frege (the guy who came up with modern predicate logic) has written about this in an 1892 paper:
Equality gives rise to challenging questions which are not altogether easy to answer. Is it a relation? A relation between objects, or between names or signs of objects? In my Begriffsschrift I assumed the latter. The reasons which seem to favour this are the following: a = a and a = b are obviously statements of differing cognitive value; a = a holds a priori and, according to Kant, is to be labelled analytic, while statements of the form a = b often contain very valuable extensions of our knowledge and cannot always be established a priori. The discovery that the rising sun is not new every morning, but always the same, was one of the most fertile astronomical discoveries. Even to-day the identification of a small planet or a comet is not always a matter of course. Now if we were to regard equality as a relation between that which the names a' and b’ designate, it would seem that a = b could not differ from a = a (i.e. provided a = b is true). A relation would thereby be expressed of a thing to itself, and indeed one in which each thing stands to itself but to no other thing. What is intended to be said by a = b seems to be that the signs or names a' and b’ designate the same thing, so that those signs themselves would be under discussion; a relation between them would be asserted. But this relation would hold between the names or signs only in so far as they named or designated something. It would be mediated by the connexion of each of the two signs with the same designated thing. But this is arbitrary. Nobody can be forbidden to use any arbitrarily producible event or object as a sign for something. In that case the sentence a = b would no longer refer to the subject matter, but only to its mode of designation; we would express no proper knowledge by its means. But in many cases this is just what we want to do. If the sign a' is distinguished from the sign b’ only as object (here, by means of its shape), not as sign (i.e. not by the manner in which it designates something), the cognitive value of a = a becomes essentially equal to that of a = b, provided a = b is true. A difference can arise only if the difference between the signs corresponds to a difference in the mode of presentation of that which is designated. Let a, b, c be the lines connecting the vertices of a triangle with the midpoints of the opposite sides. The point of intersection of a and b is then the same as the point of intersection of b and c. So we have different designations for the same point, and these names (point of intersection of a and b,' point of intersection of b and c’) likewise indicate the mode of presentation; and hence the statement contains actual knowledge.
It is natural, now, to think of there being connected with a sign (name, combination of words, letter), besides that to which the sign refers, which may be called the reference of the sign, also what I should like to call the sense of the sign, wherein the mode of presentation is contained. In our example, accordingly, the reference of the expressions the point of intersection of a and b' and the point of intersection of b and c’ would be the same, but not their senses. The reference of evening star' would be the same as that of morning star,′ but not the sense. (...)
He argues that identity expressed between names is a relationship between their “modes of presentation” (senses/meanings) rather than between their names or reference object(s). This makes sense because different names can be synonymous or not, and only if they are we can infer equality, while in the latter case whether they are equal or unequal has to be established by other means.
Sorry, I used the word “definition” sloppily there. I don’t think we disagree with each other.
I meant something closer to “how equality is formalized in first order logic”. That’s what the bit about the axiom schemas was referencing: it’s how we bake in all the properties we require of the special binary predicate “=”. There’s a big, infinite core of axiom schemas specifying how “=” works that’s retained across FOLs, even as you add/remove character, relation and function symbols to the language.
But I’m pretty sure all instances of the axiom schema
x=y→ϕ(...,x,...)=ϕ(...,y,...)
(for any FOL-definable predicate ϕ in our theory) are already implied if we, as is customary, assume as logic FOL+identity, i.e. first-order predicate logic with a primitive logical predicate for identity. So in that case we don’t need the axiom schema. (And in second and higher order logic we need neither an axiom schema nor a primitive relation, because in that case identity is already definable in pure logic, without any theory, so we have a logical identity relation without having it to add as a primitive.) If we just assume FOL alone, adding this axiom schema to a particular first-order theory makes sense, but I conjecture that it is not equivalent to full (primitive or second-order definable) identity, similar to how the axiom schema of induction is not equivalent to the induction axiom, which requires second-order logic.
I’m confused. In any FOL, you have a bunch of “logical axioms” which come built-in to the language, and axioms for whatever theories you want to investigate in said language. You need these or else you’ve got no way prove basically anything in the language, since your deduction rules are: state an axiom from your logical axioms, from your assumed theory’s axioms, or Modus Ponens. And the logical axioms include a number of axioms schemas, such as the ones for equality that I describe, no?
Yes, with Hilbert proof systems, since those have axioms / axiom schemata. (In natural deduction systems there are only inference rules like Modus ponens, no logical axioms.) But semantically, a “primitive” identity symbol is commonly already interpreted to be the real identity, which would imply the truth of all instances of those axiom schemes. Though syntactically, for the proof system, you indeed still need to handle equality in FOL, either with axioms (Hilbert) or special inference rules (natural deduction).
These syntactical rules are weaker in FOL however than the full (semantic) notion of identity. Because they only infer that all “identical” objects have all first-order definable predicates in common, which doesn’t cover all possible properties, and which holds also for weaker forms of equality (“first-order equivalence”).
Eliezer mentioned the predicate “has finitely many predecessors” as an example of a property that is only second-order definable. So two distinct objects could have all their first-order definable properties in common while not being identical. The first-order theory wouldn’t prove that they are different. The second-order definition of identity, on the other hand, ranges over all properties rather than over all first-order definable ones, so it captures real identity.
Right, if we rely on this notion of “the real identity”. I think discussing that would get into even more confusing territory than just focusing on formalizations that look like they’re some kind of equality.
I don’t know about category theory, but identity can’t be defined in first-order logic, so it is usually added as a primitive logical predicate to first-order logic. Adding infinite axiom schemas of the sort you showed above don’t qualify as a definition, which has to be a finite biconditional. But identity is definable in pure second-order logic. Using Leibniz’s law, one can define
x=y:↔∀X(X(x)↔X(y))
Intuitively, x and y are defined identical iff they have all their properties in common. Formally, there is always a subset of the domain in the range of X and Y which for any object o contains only o. This guarantees that the definition cannot accidentally identify different objects.
Yes and that is because in predicate logic, different names (constant symbols) are allowed to refer to the same object or to different objects, while the same names always refer to the same object. This is similar to natural language where one thing can have multiple names. Frege (the guy who came up with modern predicate logic) has written about this in an 1892 paper:
He argues that identity expressed between names is a relationship between their “modes of presentation” (senses/meanings) rather than between their names or reference object(s). This makes sense because different names can be synonymous or not, and only if they are we can infer equality, while in the latter case whether they are equal or unequal has to be established by other means.
Sorry, I used the word “definition” sloppily there. I don’t think we disagree with each other.
I meant something closer to “how equality is formalized in first order logic”. That’s what the bit about the axiom schemas was referencing: it’s how we bake in all the properties we require of the special binary predicate “=”. There’s a big, infinite core of axiom schemas specifying how “=” works that’s retained across FOLs, even as you add/remove character, relation and function symbols to the language.
But I’m pretty sure all instances of the axiom schema
x=y→ϕ(...,x,...)=ϕ(...,y,...)
(for any FOL-definable predicate ϕ in our theory) are already implied if we, as is customary, assume as logic FOL+identity, i.e. first-order predicate logic with a primitive logical predicate for identity. So in that case we don’t need the axiom schema. (And in second and higher order logic we need neither an axiom schema nor a primitive relation, because in that case identity is already definable in pure logic, without any theory, so we have a logical identity relation without having it to add as a primitive.) If we just assume FOL alone, adding this axiom schema to a particular first-order theory makes sense, but I conjecture that it is not equivalent to full (primitive or second-order definable) identity, similar to how the axiom schema of induction is not equivalent to the induction axiom, which requires second-order logic.
I’m confused. In any FOL, you have a bunch of “logical axioms” which come built-in to the language, and axioms for whatever theories you want to investigate in said language. You need these or else you’ve got no way prove basically anything in the language, since your deduction rules are: state an axiom from your logical axioms, from your assumed theory’s axioms, or Modus Ponens. And the logical axioms include a number of axioms schemas, such as the ones for equality that I describe, no?
Yes, with Hilbert proof systems, since those have axioms / axiom schemata. (In natural deduction systems there are only inference rules like Modus ponens, no logical axioms.) But semantically, a “primitive” identity symbol is commonly already interpreted to be the real identity, which would imply the truth of all instances of those axiom schemes. Though syntactically, for the proof system, you indeed still need to handle equality in FOL, either with axioms (Hilbert) or special inference rules (natural deduction).
These syntactical rules are weaker in FOL however than the full (semantic) notion of identity. Because they only infer that all “identical” objects have all first-order definable predicates in common, which doesn’t cover all possible properties, and which holds also for weaker forms of equality (“first-order equivalence”).
Eliezer mentioned the predicate “has finitely many predecessors” as an example of a property that is only second-order definable. So two distinct objects could have all their first-order definable properties in common while not being identical. The first-order theory wouldn’t prove that they are different. The second-order definition of identity, on the other hand, ranges over all properties rather than over all first-order definable ones, so it captures real identity.
Right, if we rely on this notion of “the real identity”. I think discussing that would get into even more confusing territory than just focusing on formalizations that look like they’re some kind of equality.