I’ve had another look at the argument, and spotted a way to remove the step that relies on “if x C z and y P z then x C y”, loosely “anything which causes the whole, causes the part”. Since there appear to be counterexamples to that as a causal intuition, it’s a good idea to try to eliminate the step, even for a constructed relation based on C.
Here is a new version, which uses an alternative relation C’. The relation is still constructed on “entities”, under the assumption it makes some sense to talk of one entity as a cause of another (it could probably be adapted to other relata of causation like “events” or “states of affairs”). The conclusion of the argument turns out to be a bit weaker than before, but in a rather interesting way.
A1. The collection of all entities is a set E, with a causal relation C and a partial order P, such that x P y if and only if x is a part of y.
A2. The set E can be well-ordered.
Note: This ensures we can apply Zorn’s Lemma when considering chains in E, but is not as strong as the full Axiom of Choice. If the set E is finite or countable, for instance, then A2 holds automatically.
Definitions: We define a relation C’ such that x C’ y iff there is some z which is part of x, but not part of y, and which satisfies z C y.
Note: This gives a relation which automatically satisfies “if x C’ y and x P w then w C’ y”, loosely “anything which is caused by a part is caused by the whole”. Also, C’ is guaranteed to be irreflexive since it is impossible to have x C’ x (no z can be simultaneously part of x and not part of x), loosely “no entity is a cause of itself”. These are plausible conditions on the underlying relation C, in which case we have C’ = C, but the construction of C’ means we don’t need to state these conditions as extra premises. We will see below that something else interesting can happen if C does not meet these conditions.
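On a finite toy model, the construction of C’ and the two properties just mentioned can be checked mechanically. The following is only a minimal Python sketch, not part of the argument itself: the entity names, the relations `causes` and `part_of`, and the helper name `derive_c_prime` are all made up for illustration.

```python
from itertools import product

def derive_c_prime(entities, causes, part_of):
    """x C' y iff some z is part of x, is not part of y, and satisfies z C y."""
    return {
        (x, y)
        for x, y in product(entities, repeat=2)
        if any(
            (z, x) in part_of and (z, y) not in part_of and (z, y) in causes
            for z in entities
        )
    }

# Illustrative model (an assumption): 'a' is a proper part of 'g'.
entities = {"a", "g", "x"}
part_of = {(e, e) for e in entities} | {("a", "g")}  # reflexive partial order P
causes = {("a", "g"), ("a", "x"), ("x", "x")}         # underlying relation C

c_prime = derive_c_prime(entities, causes, part_of)

# C' is irreflexive, even though C contains the self-cause ("x", "x").
irreflexive = all(x != y for x, y in c_prime)

# "Anything which is caused by a part is caused by the whole":
# if x C' y and x P w, then w C' y.
upward_closed = all(
    (w, y) in c_prime
    for x, y in c_prime
    for x2, w in part_of
    if x2 == x
)
```

In this model the pairs a C g and x C x drop out of C’, because in each case the only candidate z is already part of the effect, while g C’ x holds because the part a causes x.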
We then define a further relation ⇐ such that x ⇐ y iff x = y, or there are finitely many entities x1, …, xn such that x1 = x, xn = y and xi C’ xi+1 for i = 1, …, n-1.
Note: This construction ensures that ⇐ is a pre-order on E.
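In a finite model, ⇐ is just the reflexive-transitive closure of C’, which can be computed directly. A rough Python sketch follows; the function name `causal_preorder` and the four-entity model are assumptions chosen only for illustration.

```python
def causal_preorder(entities, c_prime):
    """Pairs (x, y) with x ⇐ y: the reflexive-transitive closure of C'."""
    leq = {(e, e) for e in entities} | set(c_prime)
    for k in entities:              # Warshall-style closure over intermediate k
        for i in entities:
            for j in entities:
                if (i, k) in leq and (k, j) in leq:
                    leq.add((i, j))
    return leq

# Illustrative model (an assumption): g causes x, x causes y, n is unrelated.
entities = {"g", "x", "y", "n"}
c_prime = {("g", "x"), ("x", "y")}
leq = causal_preorder(entities, c_prime)

# A pre-order is reflexive and transitive (but need not be antisymmetric).
reflexive = all((e, e) in leq for e in entities)
transitive = all((a, d) in leq for a, b in leq for c, d in leq if b == c)
```

Here g ⇐ y holds via the intermediate x, while y ⇐ g and n ⇐ y do not.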
Say that y is “causally dependent” on x iff x is distinct from y but x ⇐ y; say that y is “dependent” iff there is some x on which y is causally dependent, and “independent” otherwise (this is equivalent to saying that any z ⇐ y satisfies z = y). Say that y is “wholly independent” if and only if y is not part of any dependent entity (which implies that y itself is independent). Say that a subset S of E is a “chain” iff for any x, y in S we have x ⇐ y or y ⇐ x. Say that S is an “endless chain” iff for any x in S there is some y in S such that x is causally dependent on y. Say that x is a “proper part” of y iff x is distinct from y but x P y.
A3. For any endless chain S in E, there is some entity z in E such that every x in S is a proper part of z.
Informally, the “endless chain” describes an odd causal relationship between entities: either an infinite descending sequence of causes, or a circle of causes. A natural question to ask is how there could be such a chain, and this involves treating the chain itself as some sort of entity to be explained.
Notice that if there are no endless chains at all, then A3 is true vacuously. Also, A3 is true provided arbitrary collections of entities can be aggregated together to form an entity, but the argument doesn’t need to rely on such a strong assumption. Finally, A3 is true if there is a “maximum” entity w, one of which every other entity is a proper part; but again the argument doesn’t need to rely on such an assumption.
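In a finite model the only endless chains are ones containing a circle of causes, so A3 can be tested by brute force. The sketch below reads “endless” the way the proof of Lemma 1 does (every x in S has some other y in S with y ⇐ x). The helper names and the model are assumptions, and C’ is written down by hand rather than derived from an underlying C.

```python
def causal_preorder(entities, c_prime):
    """Pairs (x, y) with x ⇐ y: the reflexive-transitive closure of C'."""
    leq = {(e, e) for e in entities} | set(c_prime)
    for k in entities:
        for i in entities:
            for j in entities:
                if (i, k) in leq and (k, j) in leq:
                    leq.add((i, j))
    return leq

def is_chain(s, leq):
    """Any two members of S are comparable under ⇐."""
    return all((x, y) in leq or (y, x) in leq for x in s for y in s)

def is_endless(s, leq):
    """Every x in S has some other y in S with y ⇐ x."""
    return all(any(y != x and (y, x) in leq for y in s) for x in s)

def a3_witnesses(s, entities, part_of):
    """Entities z of which every member of S is a proper part."""
    return {z for z in entities if all(x != z and (x, z) in part_of for x in s)}

# Illustrative model (an assumption): p and q cause each other, both are
# proper parts of z, and z also causes x.
entities = {"p", "q", "x", "z"}
part_of = {(e, e) for e in entities} | {("p", "z"), ("q", "z")}
c_prime = {("p", "q"), ("q", "p"), ("z", "p"), ("z", "q"), ("z", "x")}
leq = causal_preorder(entities, c_prime)

circle = {"p", "q"}
witnesses = a3_witnesses(circle, entities, part_of)
```

For the circle {p, q}, the only witness is z, and z ⇐ p and z ⇐ q both hold, as the second case of Lemma 1 below requires. The set {z, x} is a chain but not an endless one, since nothing else in it precedes z.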
Lemma 1: For any chain S in E, there is some entity w in E such that w ⇐ y for every y in S.
Proof: Suppose S is not endless. Then there is some x in S such that for no other y in S is y ⇐ x. By the chain property we must have x ⇐ y for every member y of S, so we can take w = x. Alternatively, suppose that S is endless. Then by A3, there is some z in E of which every x in S is a part. Now consider any y in S. There is some x not equal to y in S with x ⇐ y, so there is an entity x2 with x C’ x2 ⇐ y. Further, as x P z we have z C’ x2 ⇐ y and hence z ⇐ y, so we can take w = z.
Lemma 2: For any x in E, unless some w ⇐ x is wholly independent, then there is some dependent y ⇐ x, such that for any dependent z ⇐ y we have y ⇐ z.
Proof: This follows from Zorn’s Lemma for pre-orders. Consider the set Y = {y: y is dependent, and y ⇐ x}, and observe that for any chain S in Y, Lemma 1 gives some w with w ⇐ s for every s in S and also w ⇐ x. Either this w is wholly independent or it is part of some dependent y with y ⇐ s for every s in S and y ⇐ x, so that this y is also in Y. Zorn’s Lemma now implies that Y contains some element which is minimal with respect to ⇐ within Y. To conclude the proof, consider any other dependent z ⇐ y; then z ⇐ x as well, so that z is in Y and y ⇐ z.
Theorem 3: For any x in E, there is some w ⇐ x which is wholly independent.
Proof: Suppose that no w ⇐ x is wholly independent, take a y satisfying Lemma 2, and consider the set S = {s is dependent: s ⇐ y}. Since y is dependent, there is some w C’ y, and some dependent v with w P v, so that v C’ y, and hence v is not equal to y. Hence S contains at least two members, and by Lemma 2, y ⇐ s for every member of S, and so S is an endless chain. So by A3, and by assumption that no z ⇐ x is wholly independent, there is some dependent z of which every s in S is a proper part, which implies that z is not in S. But by the proof of Lemma 1, z ⇐ y, which implies z is in S: a contradiction. So it follows that some y ⇐ x is wholly independent.
Now for the “uniqueness” part. As before, we partition E into three subsets. I are the “inert” entities, which neither cause anything nor are caused by anything (these could be abstract entities like numbers, propositions and so on). Formally, I = {x in E: there is no y with x C’ y or y C’ x}. U are the “uncaused causes”: formally, U = {x in E: there is no y with y C’ x, but there is z with x C’ z}. O are all the “other”, caused entities, so that formally O = {x in E: there is some y with y C’ x}. We will also let W be the “wholly uncaused causes”: formally, W = {u in U: there is no x, y with u P y and x C’ y}, so that no member of W is part of any caused entity.
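The four sets can be read straight off C’ and P in a finite model. Here is a small Python sketch; the function name `partition` and the six-entity model are assumptions, picked so that U and W come apart (b is uncaused, but is part of the caused entity x).

```python
def partition(entities, c_prime, part_of):
    """Return (I, U, O, W) as defined from C' and P."""
    caused = {y for _, y in c_prime}
    causing = {x for x, _ in c_prime}
    inert = entities - caused - causing           # I: no C' links at all
    uncaused = causing - caused                   # U: causes, but has no C' cause
    other = caused                                # O: has some C' cause
    wholly = {u for u in uncaused                 # W: not part of any caused entity
              if not any((u, y) in part_of for y in caused)}
    return inert, uncaused, other, wholly

# Illustrative model (an assumption): a is part of g, b is part of x.
entities = {"a", "g", "b", "x", "y", "n"}
part_of = {(e, e) for e in entities} | {("a", "g"), ("b", "x")}
c_prime = {("a", "x"), ("g", "x"), ("b", "y"), ("x", "y")}

inert, uncaused, other, wholly = partition(entities, c_prime, part_of)
```

In this model I = {n}, U = {a, g, b}, O = {x, y} and W = {a, g}: b drops out of W because it is part of x, which has a cause.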
B1. If S is any subset of W such that for any x, y in S we have x P y or y P x, (call such an S a “chain of parts”), then there is some entity z of which all members of S are parts.
B2. Suppose that y ⇐ x and z ⇐ x. Then there is some entity w ⇐ x such that: w ⇐ y or y P w; w ⇐ z or z P w.
Informally, the idea is that y and z can’t independently cause x without any further causal explanation. So there must be some common cause, although each of them may be part of that common cause.
Definition: Say that entities x and y are “causally-connected” if and only if x = y or there are finitely many entities x = x1, …, xn = y with each xi C’ xi+1 or xi+1 C’ xi for i = 1, …, n-1.
B3. Any two entities x, y in O are causally-connected.
Informally, O doesn’t “come apart” into disconnected components, such as a bunch of isolated universes. Premises B1-B3 turn out to be necessary for Theorem 6 to hold, as well as sufficient (see below). So they can’t be made any weaker!
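For a finite model, B3 amounts to saying that all of O lies in one connected component of the graph obtained by ignoring the direction of C’. The following Python sketch is one way to check that; the helper names and both toy models are assumptions made purely for illustration.

```python
def connected_components(entities, c_prime):
    """Components of the undirected graph whose edges are the C' pairs."""
    neighbours = {e: {e} for e in entities}
    for x, y in c_prime:
        neighbours[x].add(y)
        neighbours[y].add(x)
    seen, components = set(), []
    for start in entities:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # depth-first search from start
            e = stack.pop()
            if e not in comp:
                comp.add(e)
                stack.extend(neighbours[e] - comp)
        seen |= comp
        components.append(comp)
    return components

def b3_holds(entities, c_prime):
    """True iff all caused entities (the set O) share one component."""
    other = {y for _, y in c_prime}
    hit = [c for c in connected_components(entities, c_prime) if c & other]
    return len(hit) <= 1

# Two isolated "universes" (an assumption): B3 fails.
two_universes = b3_holds({"g1", "x1", "g2", "x2"}, {("g1", "x1"), ("g2", "x2")})

# One common cause (an assumption): x1 and x2 are connected through g.
one_universe = b3_holds({"g", "x1", "x2"}, {("g", "x1"), ("g", "x2")})
```

Note that the connecting path is allowed to pass through entities outside O, as in the second model, where x1 and x2 are linked only via the uncaused g.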
Theorem 4: For any x in O, there is a unique entity f(x) in W such that: f(x) ⇐ x, and any y in U with y ⇐ x satisfies y P f(x).
Proof: For any x in O, define a subset W(x) = {y in W: y ⇐ x}; this is non-empty by Theorem 3. Consider any non-empty chain of parts S that is a subset of W(x). Since W(x) is a subset of W, then by B1, there is some z in E of which all members of S are parts, and such a z must also be in W. (If z is part of some dependent v then so is some s in S part of v, which is impossible because s is in W). Also since y ⇐ x for some member y of S, where y cannot be equal to x, and y P z, we have y C’ x2 ⇐ x for some x2, so z ⇐ x, as in the proof of Lemma 1, and z is also a member of W(x). Even if S is empty, we can take some z in W(x), and then trivially all members of S are parts of z. By application of Zorn’s Lemma to W(x), there is an element f(x) in W(x) which is maximal with respect to P, i.e. there is no other y in W(x) with f(x) P y. Now, by B2, for any other y ⇐ x in U there must be some z ⇐ x with f(x) P z and y P z; this implies z is in W, and given f(x) is maximal in W(x) we have z = f(x) and so y P f(x). This makes f(x) the unique maximal element of W(x).
Theorem 5: For any x, y in O, f(x) = f(y) if and only if x and y are causally-connected.
Proof: It is clear that if f(x) = f(y) then x is causally-connected to y (just build a path backwards from x to f(x) and then forward again to y). Conversely, consider any two x, y in O:
a) If x C’ y, then f(x) is in U and satisfies f(x) ⇐ y so we have f(x) P f(y). Since x is not f(x), we have f(x) C’ x2 ⇐ x for some x2, and hence f(y) ⇐ x which means f(y) P f(x), and so f(x) = f(y).
b) If z is in U with z C’ x and z C’ y, then z P f(x) so f(x) ⇐ y and f(x) P f(y); similarly, f(y) P f(x) so that f(x) = f(y).
The result now follows by induction on the length of the causal path connecting x to y.
Theorem 6: If O is non-empty, then there is a single entity g in W such that: f(x) = g for every x in O, and y P g for every y in U.
Proof: Assuming O is non-empty, take any element x0 in O, and set g = f(x0); then the result that f(x) = g for any x in O follows from Theorem 5 and B3; further, for any y in U, there is some x in O with y C’ x, so by Theorem 4, y P f(x) = g. If there are no elements of O (meaning there are none in U either) then the Theorem is trivial.
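To see Theorems 4 and 6 at work, f(x) can be computed by brute force in a small model where B1-B3 happen to hold. This is only an illustrative Python sketch: the names `causal_preorder` and `first_cause` and the five-entity model are assumptions, and `first_cause` simply returns `None` when the model does not supply a unique maximal element of W(x).

```python
def causal_preorder(entities, c_prime):
    """Pairs (x, y) with x ⇐ y: the reflexive-transitive closure of C'."""
    leq = {(e, e) for e in entities} | set(c_prime)
    for k in entities:
        for i in entities:
            for j in entities:
                if (i, k) in leq and (k, j) in leq:
                    leq.add((i, j))
    return leq

def first_cause(x, wholly, uncaused, leq, part_of):
    """f(x) as in Theorem 4, or None if this model has no such entity."""
    candidates = {w for w in wholly if (w, x) in leq}            # W(x)
    maximal = {w for w in candidates                             # maximal under P
               if not any(w != v and (w, v) in part_of for v in candidates)}
    if len(maximal) != 1:
        return None
    (f_x,) = maximal
    # Every uncaused cause of x must be a part of f(x).
    covers = all((u, f_x) in part_of for u in uncaused if (u, x) in leq)
    return f_x if covers else None

# Illustrative model (an assumption): a is part of g; g causes x; x causes y.
entities = {"a", "g", "x", "y", "n"}
part_of = {(e, e) for e in entities} | {("a", "g")}
c_prime = {("a", "x"), ("g", "x"), ("x", "y")}

leq = causal_preorder(entities, c_prime)
other = {y for _, y in c_prime}                                  # O
uncaused = {x for x, _ in c_prime} - other                       # U
wholly = {u for u in uncaused
          if not any((u, y) in part_of for y in other)}          # W

first_causes = {x: first_cause(x, wholly, uncaused, leq, part_of) for x in other}
```

In this model f(x) = f(y) = g, and both members of U (a and g) are parts of g, which is what Theorem 6 asserts.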
Note that B1, B2 and B3 are entailed by the statement of Theorem 6. For B1, we can just take g as the relevant z. For B2, we can take g as the relevant w. B3 follows using the first part of Theorem 5 (just track from x back to g, then forward to y again). The premises have been weakened a bit, but are still basically unjustifiable (we can imagine them being false without absurdity) and the above re-work shows they are needed for the uniqueness conclusion (Theorem 6).
One interesting fact is that Theorem 6 now has a loophole that wasn’t there before. If C is not equal to C’, then it is possible that there is some x C g, provided any such x is part of g (i.e. g contains any cause for itself within itself). This even allows Theorem 6 to be consistent with a premise like this:
B4: For any entity x, there is some y with y C x.
Informally, “every entity has a cause”, a premise which is usually considered a fatal inconsistency in a first cause argument! With a bit of renaming, the set O could be said to consist of “ordinary” entities (ones which have causes that are not parts of themselves) and the remaining non-inert entities are “extraordinary” entities (ones which contain all their own causes). W are “wholly extraordinary entities”, ones which are not part of any ordinary entity. Theorem 6 implies that every ordinary entity is causally dependent on a single wholly extraordinary entity, one which contains every other extraordinary entity.
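The loophole can be exhibited in a three-entity model where B4 holds for C, yet g still comes out as wholly uncaused under C’. Again this is just a Python sketch under made-up assumptions: the entity names, the relations, and the helper `derive_c_prime` are all hypothetical.

```python
from itertools import product

def derive_c_prime(entities, causes, part_of):
    """x C' y iff some z is part of x, is not part of y, and satisfies z C y."""
    return {
        (x, y)
        for x, y in product(entities, repeat=2)
        if any(
            (z, x) in part_of and (z, y) not in part_of and (z, y) in causes
            for z in entities
        )
    }

# Illustrative model (an assumption): a is a proper part of g, and a causes
# itself, the whole g, and the ordinary entity x.
entities = {"a", "g", "x"}
part_of = {(e, e) for e in entities} | {("a", "g")}
causes = {("a", "a"), ("a", "g"), ("a", "x")}

# B4: every entity has some cause under the underlying relation C.
b4_holds = all(any((y, x) in causes for y in entities) for x in entities)

c_prime = derive_c_prime(entities, causes, part_of)

causes_of_g = {y for y in entities if (y, "g") in causes}        # under C
internal = all((y, "g") in part_of for y in causes_of_g)         # all inside g
c_prime_causes_of_g = {y for y, z in c_prime if z == "g"}        # under C'
```

Here the only cause of g is its own part a, so g has no cause under C’ and remains in W, while x is an ordinary entity caused by g.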
Again, there aren’t any particularly theistic conclusions here, since g could very well be a maximum entity if there is one—say g is the whole universe or multiverse. In that case, g contains every ordinary entity as well as containing all the extraordinary entities.