Preliminaries

This is a continuation of my previous posts on infinity.

It’s a bit out of order in that I tried to write this post with a lot of introductory material and it just bogged it down too much. I’ll attempt to backfill later with a more introductory post, or at least to find something else to link to. For now though if you don’t know what the following words mean you’re probably going to struggle. Sorry. :-(

You should understand the concepts of:

equivalence relation

equivalence class

partial ordering

total ordering

order isomorphism

It would be useful to have previously encountered well-orderings, but I’ll introduce the bits of those we need.

Actual content

So far it’s very non-obvious how we might go about proving that for any two infinite sets, \(|A| \leq |B|\) or \(|B| \leq |A|\). Even proving the Cantor-Schroeder-Bernstein theorem was really hard work, and we had some concrete objects to work with in the form of the injections in each direction. Given two completely unknown and apparently unrelated infinite sets, where would we even begin?

The problem is that just a set on its own has very little structure to it – it’s just a collection of stuff. Without some sort of structure to get a handle on there’s not a lot we can do to it. So, for now, we will park the question of arbitrary sets and focus on a specific class of sets that will prove fruitful.

Specifically the sets we will consider are those which have a total ordering relationship on them which is well ordered. That is, every non-empty subset of them has a smallest element. The reason this is a useful property is that it allows us to generalise induction on the natural numbers: Given a property, we can prove that it holds for all elements of a well ordered set by proving that if it holds for all smaller elements then it holds for the current one. This will be invaluable. It will also turn out that there is a rich class of sufficiently large well ordered sets that will allow us to get a handle on infinity.

Why do we want well-ordered sets?

Well because we can do induction on them. Suppose \(X\) is a well-ordered set and \(p\) is some logical property such that if \(p(y)\) for all \(y < x\) then \(p(x)\). Then \(p\) holds for all \(x \in X\). Why? Well, if not then the set \(\{x \in X : \mathrm{not}\ p(x)\}\) is non-empty, so it has a smallest element \(x\). But then for \(y < x\) we have \(p(y)\), which by hypothesis implies \(p(x)\) (note: \(x\) can be the smallest element of \(X\) here, in which case this holds vacuously). This contradicts the existence of \(x\), hence the non-emptiness of that set. Hence \(p(x)\) for all \(x \in X\).

We can extend this further: We can also define functions inductively.

Suppose we have some function \(h : P(Y) \to Y\) and a well-ordered set \(X\). There is a unique function \(f : X \to Y\) such that \(f(x) = h(\{ f(y) : y < x\})\). Why? Proof by induction: Let \(p(x)\) be the property that there is a unique such function on the set \(\{y : y \leq x\}\). If such a function exists call it \(f_x\). For \(y < x\) let \(f_x(y) = f_y(y)\). Let \(f_x(x) = h(\{f_y(y) : y < x\})\). I’m going to leave checking the details of this as an exercise for the interested reader because I’m a big jerk.
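To make the recursion concrete, here is a minimal Python sketch on a finite well-ordered set. It assumes the set is given as a list from smallest to largest; the names `define_by_recursion` and `mex` are made up for illustration and are not from any library.

```python
def define_by_recursion(order, h):
    """Build f with f(x) = h({f(y) : y < x}).

    `order` lists the elements of a finite well-ordered set, smallest first.
    `h` takes a frozenset of earlier values and returns the next value.
    """
    f = {}
    for x in order:
        # At this point f holds exactly the values f(y) for y < x.
        f[x] = h(frozenset(f.values()))
    return f


def mex(values):
    """Hypothetical choice of h: the smallest natural number not in `values`."""
    n = 0
    while n in values:
        n += 1
    return n


ranks = define_by_recursion(["a", "b", "c", "d"], mex)
```

With this particular `h`, each element is sent to its position in the ordering, so `ranks` maps `"a"` to 0 through `"d"` to 3. The loop only ever looks at values that were already defined, which is the finite shadow of the induction argument above.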

It turns out there is a strict hierarchy of well ordered sets.

Let \(X\) be a well ordered set. \(A \subseteq X\) is called a segment (this terminology is slightly non-standard) if for \(x \in A\) and \(y < x\) we must have \(y \in A\). It is called an initial segment (this terminology is not) if it is of the form \(I_a = \{x : x < a \}\) for some \(a\).

We’ll basically be interested in when one well-ordered set embeds as a segment of another. First we need to get a handle on the structure of segments:

Lemma: Let \(X\) be well-ordered and \(A \subseteq X\) be a segment. Then either \(A = X\) or \(A\) is an initial segment.

Proof: Suppose \(A \neq X\). Then the set \(X \setminus A\) is non-empty. Let \(a\) be the minimal element of this set. Then if \(x < a\) we must have \(x \in A\) by minimality. If \(y \geq a\) and \(y \in A\) then we must have \(a \in A\), because \(A\) is a segment. Therefore \(x \in A\) if and only if \(x < a\), i.e. \(A = I_a\). QED
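The lemma can be checked by brute force on a small finite example. This is only a sanity check under the assumption that the well-ordered set is a short list given smallest first; the helper names are hypothetical.

```python
from itertools import combinations


def is_segment(subset, order):
    """True if `subset` is downward closed in `order` (listed smallest first)."""
    position = {x: i for i, x in enumerate(order)}
    return all(y in subset for x in subset for y in order[:position[x]])


def initial_segment(a, order):
    """I_a = {x : x < a}."""
    return set(order[:order.index(a)])


def segments(order):
    """Enumerate every segment of `order` by testing all subsets."""
    found = []
    for size in range(len(order) + 1):
        for combo in combinations(order, size):
            if is_segment(set(combo), order):
                found.append(set(combo))
    return found


X = [0, 1, 2, 3]
all_segments = segments(X)
initial = [initial_segment(a, X) for a in X]
# Every segment is either all of X or one of the I_a.
lemma_holds = all(s == set(X) or s in initial for s in all_segments)
```

Note that the empty set shows up as \(I_0\), the initial segment below the smallest element.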

Lemma: Let \(A \subseteq B \subseteq C\). Let \(B\) be a segment of \(C\) and \(A\) a segment of \(B\). Then \(A\) is a segment of \(C\).

Proof: This is mostly just definition chasing. Let \(a \in A\) and \(c \in C\) with \(c < a\). Then because \(a \in B\) we have \(c \in B\) (because \(B\) is a segment of \(C\)). Therefore \(c \in A\), because \(A\) is a segment of \(B\).

The following relationships will be at the heart of how well-ordered sets interact with each other:

Let \(A, B\) be well ordered sets. Write \(A \preceq B\) if \(A\) is isomorphic to a segment of \(B\). Write \(A \sim B\) if \(A\) and \(B\) are isomorphic. Write \(A \prec B\) if \(A \preceq B\) and not \(A \sim B\).

Proposition: If \(A \preceq B \preceq C\) then \(A \preceq C\).

Proof: Let \(f : A \to B\) and \(g : B \to C\) be order isomorphisms onto segments. Then \(h = g \circ f\) is an order-isomorphism \(h : A \to h(A) \subseteq C\). We need only show that \(h(A)\) is a segment.

Because \(f(A)\) is a segment and \(g\) is an order-isomorphism we know that \(h(A) = g(f(A))\) is a segment of \(g(B)\). But a segment of a segment is a segment, so \(h(A)\) is a segment of \(C\) and we’re done. QED

Clearly if \(A \prec B\) then \(A\) must be isomorphic to an initial segment of \(B\) (because it’s isomorphic to a segment which isn’t \(B\)). This is also a sufficient condition:

Theorem: A well-ordered set is not isomorphic to any of its initial segments.

Proof:

Suppose \(f : A \to I_a\) is order preserving and injective. Then necessarily \(f(a) < a\) (because \(f(a) \in I_a\)). But then because it’s order preserving and injective, \(f(f(a)) < f(a)\). Indeed, \(a > f(a) > \ldots > f^n(a) > \ldots\).

This means that the set \(\{ f^n(a) : n \in \mathbb{N}\}\) has no minimal element, contradicting the well-orderedness of A. QED

Corollary: \(A \prec B\) if and only if \(A\) is order isomorphic to an initial segment of \(B\).

Corollary: If \(A \preceq B\) and \(B \preceq A\) then \(A \sim B\).

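Proof: Otherwise \(B \prec A\), so \(B \sim I_a\) for some \(a \in A\). Then \(A \preceq B \sim I_a\), so \(A\) is isomorphic to a segment of \(I_a\), which is an initial segment of \(A\). This contradicts the theorem. QED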

Corollary: If \(A \prec B\) then \(A\) is isomorphic to a unique initial segment of \(B\) (because if it were isomorphic to two, the larger of the two would be isomorphic to an initial segment of itself)

So basically \(\preceq\) is a partial order on equivalence classes of well-orderings. i.e. it’s a partial order except for instead of equality in the anti-symmetric clause we have isomorphic.

It turns out it’s actually a total order:

Theorem: Let \(A\), \(B\) be well-ordered sets. Then \(A \preceq B\) or \(B \preceq A\).

Proof:

Let \(T\) be some element not in \(B\). We’ll define a function \(f : A \to B^+ = B \cup \{T\}\) by transfinite induction as follows: Let \(g : \mathcal{P}(B^+) \to B^+\) be defined by \(g(U) = \min(B \setminus U)\) if \(B \setminus U\) is non-empty, else \(g(U) = T\). Let \(f\) be the function this defines, i.e. \(f(a) = g(f(I_a))\). In particular, if \(f(a) \in B\) then \(f(a) = \min(B \setminus f(I_a))\).

The idea is that \(T\) is our “stopping point” which says “hey I’ve run out of elements of B”. If we ever hit it then we’ve covered \(B\) with an order preserving isomorphism from some segment of \(A\). If we don’t then we’ve embedded \(A\) in a segment of \(B\).

Specifically, let \(S_A = \{a \in A : f(a) \neq T\}\) and let \(S_B = f(S_A)\). We’ll show that both \(S_A\) and \(S_B\) are segments and at least one of \(S_A = A\) or \(S_B = B\) holds.

Proof that \(S_B\) is a segment: Suppose \(f(a) = b\) and \(b' < b\). Then because we chose \(b\) to be minimal such that \(f(x) \neq b\) for \(x < a\), we must have that \(f(x) = b'\) for some \(x < a\), else \(b'\) would be an example contradicting minimality.

Proof that \(S_A\) is a segment: If \(a \in S_A\) and \(a' < a\) then \(f(a) \in B \setminus f(I_{a'})\) so \(f(a') \neq T\).

Now suppose \(S_B \neq B\). Then there exists \(b \in B \setminus S_B\). But then \(b \in B \setminus f(I_a)\) for all \(a \in A\), so this set is never empty and thus we never have \(f(a) = T\). Hence \(A = S_A\).

\(f|_{S_A}\) is injective by construction because if \(a' < a\) we always choose \(f(a) \neq f(a')\). We need only show that it’s order preserving.

But the same proof that \(S_B\) was a segment will show that \(f(I_a)\) is a segment for any \(a\). Thus if \(a < a'\) we must have \(f(a') > f(a)\), as else \(f(a') \in f(I_a)\) which would contradict injectivity.

Therefore \(f|_{S_A}\) is order-preserving and injective, and so is an order isomorphism between \(S_A\) and \(S_B\). If \(S_A = A\) then \(A \preceq B\). Else \(S_B = B\) and so \((f|_{S_A})^{-1} : B \to S_A\) demonstrates that \(B \preceq A\). QED
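The construction in this proof is easy to run on finite well-ordered sets, which may help in seeing what \(T\) is doing. The sketch below assumes both sets are given as lists, smallest first, with hashable elements; `TOP`, `greedy_map` and `compare` are names invented for the example.

```python
TOP = object()  # plays the role of the extra element T


def greedy_map(A, B):
    """f(a) = min(B minus f(I_a)) if that set is non-empty, else TOP."""
    f = {}
    used = set()
    for a in A:
        # B minus f(I_a), still in increasing order.
        remaining = [b for b in B if b not in used]
        if remaining:
            f[a] = remaining[0]
            used.add(remaining[0])
        else:
            f[a] = TOP
    return f


def compare(A, B):
    """Report which set embeds as a segment of the other, and the segment hit."""
    f = greedy_map(A, B)
    S_A = [a for a in A if f[a] is not TOP]
    S_B = [f[a] for a in S_A]
    if len(S_A) == len(A):
        return "A embeds as a segment of B", S_B
    return "B embeds as a segment of A", S_A


short_into_long = compare([1, 2], [5, 6, 7])
long_onto_short = compare(["x", "y", "z"], [10, 20])
```

In the first call we never run out of elements of `B`, so \(S_A = A\) and the image is the segment `[5, 6]`. In the second call the third element of `A` is sent to `TOP`, so \(S_B = B\) and `B` is matched with the segment `["x", "y"]` of `A`.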

So what does this all have to do with sizes of infinity then?

Well we’ve demonstrated that for any two well ordered sets there is an injection from one to the other. So if every infinite set can be well-ordered we must have that for any two sets \(A, B\), \(|A| \leq |B|\) or \(|B| \leq |A|\).

Can every infinite set be well-ordered? It seems highly implausible.

It turns out the answer is “sort of”. There is an axiom of set theory which asks for a very natural structure on infinite sets which will be enough to guarantee a well-ordering of every set.

But we don’t need that to demonstrate its plausibility. It turns out that if we want it to be true that \(|A| \leq |B|\) or \(|B| \leq |A|\) always holds we must have every set be able to be well-ordered.

Why? Well because for any set \(A\) there is a natural example of a well-ordered set \(W(A)\) such that \(W(A) \not\leq A\). i.e. there is no injection \(W(A) \to A\). Thus if we want all cardinals to be comparable we must have an injection \(f : A \to W(A)\). But then we can define a well-ordering on \(A\) by \(x \leq y\) iff \(f(x) \leq f(y)\).

Let’s construct such a set:

Theorem: Let \(A\) be any infinite set. Let \(W(A)\) be the set of isomorphism classes of well-orderings of subsets of \(A\). Define \([S] \leq [T]\) if \(S \preceq T\). This is well-defined because of how \(\preceq\) interacts with isomorphism. Then \(W(A)\) is a well-ordered set and there is no injective function \(f : W(A) \to A\).

Proof:

We’ve already shown that it’s a totally ordered set by our previous results. We need only show that it’s well-ordered.

It suffices to show that every initial segment of \(W(A)\) is well-ordered: Then if \(U \subseteq W(A)\) and \(u \in U\), either \(u\) is a minimal element of \(U\) or \(I_u \cap U\) is a non-empty subset of \(I_u\) and thus has a minimal element which is in turn a minimal element of \(U\).

So let \(x \in W(A)\). We can write \(x = [T]\) for some well-ordered set \(T\). Claim: \(I_x \sim T\).

Proof: \(y < x\) iff \(y = [S]\) for some \(S \prec T\) by definition of our ordering on \(W(A)\). Thus there is a unique initial segment \(I_t\) of \(T\) such that \(S \sim I_t\). Define \(f : I_x \to T\) by mapping \([S]\) to the element corresponding to the unique initial segment isomorphic to \(S\). i.e. if \(S \sim I_t\) then \(f([S]) = t\).

This is injective because no two distinct initial segments of a well-ordered set are isomorphic. It’s order preserving because if \([S'] < [S]\) then we have an order-isomorphism between \(S'\) and an initial segment of \(S\). If \(f([S]) = t\) we can compose this with the order isomorphism we’ve got \(S \to I_t\) and get an order isomorphism between \(S'\) and an initial segment of \(I_t\). This means that \(S' \sim I_{t'} \subseteq I_t\) for some \(t' < t\) and so necessarily \(f([S']) = t' < t\).

This is surjective because if \(S\) is a well-ordering of some subset of \(A\) then so are all its initial segments, and \([I_s] < [S]\) for \(s \in S\). This proves our claim: Every initial segment of \(W(A)\) is isomorphic to a well-ordered set.

So we now know that \(W(A)\) is a well-ordered set. We must show that there is no injection \(f : W(A) \to A\).

Suppose such an \(f\) exists. Then we can define a well-ordering on the set \(T = f(W(A))\) by \(f(x) < f(y)\) when \(x < y\).

But then \([T] \in W(A)\), and by what we proved in the claim we have \(I_{[T]} \sim T \sim W(A)\). But a well-ordered set cannot be order isomorphic to an initial segment of itself. This contradicts the existence of such a function.

QED
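The theorem is stated for infinite \(A\), but the same construction can be carried out by brute force on a small finite set, which gives a feel for why \(W(A)\) is always "one step too big". This is only a finite analogue: for finite well-orderings the isomorphism class is determined by the number of elements, and the sketch below leans on that. The names `order_type` and `W` are hypothetical.

```python
from itertools import combinations, permutations


def order_type(ordering):
    """Canonical representative of a finite well-ordering: relabel by rank.

    Two finite well-orderings are isomorphic exactly when they have the same
    length, so this identifies isomorphic orderings.
    """
    return tuple(range(len(ordering)))


def W(A):
    """Isomorphism classes of well-orderings of subsets of the finite set A."""
    types = set()
    for size in range(len(A) + 1):
        for subset in combinations(A, size):
            for ordering in permutations(subset):
                types.add(order_type(ordering))
    # Shorter order types are isomorphic to initial segments of longer ones.
    return sorted(types, key=len)


A = ["p", "q", "r"]
W_A = W(A)
```

Here `W_A` has four elements (the order types of length 0, 1, 2 and 3) while `A` has three, so there is no injection from `W_A` into `A`. The element of `W_A` of length \(n\) has exactly \(n\) elements below it, matching the claim \(I_{[T]} \sim T\).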

So we now know that there are “arbitrarily large” well-ordered sets, and so if we want all cardinalities to be comparable in fact all sets can be well-ordered. This gives a strong plausibility argument for the idea that this is possible. But how might we prove it?

Well we still run into that problem I mentioned at the beginning: Arbitrary sets have very little structure to play with. The solution is to demand that they have slightly more structure.

There is a set theoretic axiom called the axiom of choice. Historically it was somewhat controversial. These days it’s basically considered entirely uncontroversial and modern set theory is very much mostly the study of set theories which obey the axiom of choice. Simply put, the axiom is this:

Let \(X\) be a non-empty set. Then there exists a choice function \(f : P(X) \setminus \{\emptyset\} \to X\) such that \(f(U) \in U\).

i.e. we have some way of picking an “arbitrary” member of any given non-empty set. If you like your intuition can be that we pick at random, though it turns out to be hard to make that intuition make rigorous sense.

Given the axiom of choice, we can now straightforwardly use transfinite induction to construct an injection \(A \to W(A)\), which will give us a well-ordering on \(A\).

Theorem: A set \(A\) can be well-ordered if and only if there is a choice function on \(P(A) \setminus \{\emptyset\}\). In particular given the axiom of choice every set can be well-ordered.

Proof:

If \(A\) can be well-ordered then the function \(h(U) = \min U\) is a choice function. So that takes care of the “only if” part.

We now must construct a well-order from a choice function. We’ll do this by bijecting \(A\) with a subset of \(W(A)\). This will let us inherit the well order from it.

Let \(h : P(A) \to A\) be a choice function extended with \(h(\emptyset) = h(A)\) (it doesn’t really matter what we do with the empty set).

Let \(f : W(A) \to A\) be the function defined by induction as \(f(x) = h(A \setminus f(I_x))\).

By our previous theorem this cannot be an injection. But if \(A \setminus f(I_x)\) is never empty then this must be an injection, because we then always choose \(f(x) \neq f(y)\) for \(y < x\). So \(A \setminus f(I_x)\) is empty for some \(x\).

Pick the minimal \(x\) such that \(f(I_x) = A\). Then \(f|_{I_x} : I_x \to A\) is a bijection, so its inverse is an injection \(A \to W(A)\) and the result is proved.

QED
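For a finite set the same idea needs no transfinite machinery: keep applying the choice function to whatever has not been picked yet. The sketch below assumes a finite set and a choice function supplied by the caller; `well_order_from_choice` and the particular `choose` are invented for illustration.

```python
def well_order_from_choice(A, choose):
    """List the elements of the finite set A in the order the choice function picks them."""
    remaining = set(A)
    order = []
    while remaining:
        # Apply the choice function to the set of elements not yet used.
        x = choose(frozenset(remaining))
        order.append(x)
        remaining.remove(x)
    return order


def choose(U):
    """Hypothetical choice function: the longest string, ties broken alphabetically (last wins)."""
    return max(U, key=lambda s: (len(s), s))


fruit_order = well_order_from_choice({"pear", "fig", "banana", "kiwi"}, choose)
```

The resulting list is the well-ordering: the first element chosen is the smallest, the next is the second smallest, and so on. Going the other way, if you already have such a list then "take the earliest element of `U` in the list" is a choice function, which is the easy direction of the theorem.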

And this more or less concludes what I wanted to cover in this post: Every set can be well-ordered, and thus any two cardinalities are comparable.

I will finish by now finally explaining what \(\aleph_1\) means.

Suppose \(A\) is any uncountable well-ordered set. Let \(x = \min \{a \in A : |I_a| > \aleph_0\}\). (If no such \(a\) exists then every initial segment of \(A\) is countable, and \(A\) itself plays the role of \(I_x\) in what follows.)

Then by construction \(|I_x| > \aleph_0\) but every initial segment of \(I_x\) is countable. This means that regardless of what well-ordered set we started out with the results will be isomorphic: One must be isomorphic to a segment of the other, and since the initial segments are all countable that segment must be the whole set.

This means additionally that for any \(|B| > \aleph_0\) we must have \(|I_x| \leq |B|\): Well-order \(B\) and perform the same construction. Then \(I_x\) is order isomorphic to a segment of \(B\) and thus there is an injection \(I_x \to B\).

We write \(|I_x| = \aleph_1\): It is the smallest size an uncountable set can be.

I still haven’t told you what these mystery \(\aleph\)s mean of course, but you’ll have to wait till next time for that one.