Recently several of us have been making a lot of noise about “structural set theory,” also known as “categorical” or “categorial” set theory. This phrase refers to a general class of theory akin to Lawvere’s ETCS, which describes sets from a purely “structural” point of view. A brief way to say this is that we care only about the category of sets, in which isomorphic sets are indistinguishable, rather than the class of sets equipped with a global membership relation $\in$. In structural set theory, which claims to be closer to the way sets are actually used in most mathematics, it doesn’t make sense to ask whether (for instance) $3\in 17$, since $3$ and $17$ are elements of the set $N$ but not sets themselves. Rather than coming with an intricate set-membership structure, sets in structural set theory are simply the “raw material” with which we build mathematical structures such as groups, rings, spaces, manifolds, and even “set-membership structures”!

This is sort of a continuation of this post, but it should also stand mostly alone. I’ll also summarize some of the things I learned from this previous discussion.

I believe the word structural originally comes from the philosophy of mathematics called “structuralism,” according to which

mathematical theories… describe structures. … Structures consist of places that stand in structural relations to each other.

One prominent exponent of this philosophy was Paul Benacerraf, who pointed out in his paper “What numbers could not be” that set-theoretic foundations such as ZFC do not adequately describe the way we really think about mathematical objects such as numbers. In ZFC, one has to define the number $3$ as some particular set, such as the von Neumann ordinal $\{\emptyset, \{\emptyset\}, \{\emptyset,\{\emptyset\}\}\}$. But this then allows us to ask questions such as “is $17\in 3$?” (which, with this definition, is false) or “is $3\in 17$?” (which, with this definition, is true). Many mathematicians would regard these questions as meaningless.

If numbers are sets, then they must be particular sets, for each set is some particular set. But if the number 3 is really one set rather than another, it must be possible to give some cogent reason for thinking so; for the position that this is an unknowable truth is hardly tenable. (Benacerraf 1965)

Benacerraf imagined a student named Ernie who defined the natural numbers as the von Neumann ordinals $0=\emptyset$, $1=\{0\}$, $2 = \{0,1\}$, $3 = \{0,1,2\}$, and another named Johnny who defined them as $0=\emptyset$, $1=\{0\}$, $2=\{1\}$, $3 = \{2\}$. When Ernie and Johnny met, they could not agree on whether $3\in 17$ or not. Benacerraf concluded that numbers cannot be sets.
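To make the disagreement concrete, here is a small sketch that models hereditarily finite sets as nested Python `frozenset` values. The function names `ernie` and `johnny` are just labels chosen for this illustration, and the encoding is only an assumption-laden toy, not anything from Benacerraf’s paper.

```python
# Toy model: hereditarily finite sets as nested frozensets.
# The names `ernie` and `johnny` are illustrative labels only.

def ernie(n):
    """Von Neumann numeral: n = {0, 1, ..., n-1}."""
    current = frozenset()
    for _ in range(n):
        # successor(k) = k ∪ {k}
        current = current | frozenset([current])
    return current


def johnny(n):
    """Zermelo-style numeral: n = {n-1}."""
    current = frozenset()
    for _ in range(n):
        # successor(k) = {k}
        current = frozenset([current])
    return current


# The "material" question gets different answers under the two encodings.
ernie_says = ernie(3) in ernie(17)
johnny_says = johnny(3) in johnny(17)

# A structural fact both encodings agree on: 0..17 are pairwise distinct.
ernie_distinct = len({ernie(k) for k in range(18)}) == 18
johnny_distinct = len({johnny(k) for k in range(18)}) == 18

print(ernie_says, johnny_says, ernie_distinct, johnny_distinct)
```

The two students disagree only about membership between numerals, which is exactly the kind of question the encoding introduces; on the properties of the progression itself they agree.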

So what matters, really, is not any condition on the objects (that is, on the set) but rather a condition on the relation under which they form a progression…. what is important is not the individuality of each element but the structure which they jointly exhibit…. the question of whether a particular “object”—for example, $\{\{\{\emptyset\}\}\}$—would do as a replacement for the number 3 would be pointless in the extreme… the whole system performs the job or nothing does. (Benacerraf 1965)

That is,

To be the number 3 is no more and no less than to be preceded by 2, 1, and possibly 0, and to be followed by 4, 5, and so forth. (Benacerraf 1965)

Some years earlier, Bourbaki had given an important role to the notion of structure in writing their Elements of Mathematics. One can argue that when we look around us at mathematics, what we see everywhere are structures. The natural numbers are just one example. Another is the real numbers (are they “really” Cauchy sequences or Dedekind cuts?). Likewise cartesian products (is an ordered pair $(a,b)$ “really” $\{\{a\}, \{a,b\}\}$?). Everywhere what we see are structures like groups, rings, fields, topological spaces, manifolds, etc. defined by equipping one or more sets with functions and relations, and which we only care about determining up to structure-respecting isomorphism.

The study of structure can be “coded” within a set theory such as ZFC, but it remains a “coding”: as Benacerraf pointed out, there is extraneous information in any ZFC-set which must be forgotten about in order to study structure. In Bourbaki’s Theory of Sets they defined a general notion of “structure”: one or more sets together with an element of some set obtained from these by iterating products and power sets, satisfying suitable axioms. For example, a topological space would be specified by a single set $X$ and an element of $P(P(X))$ (the set of open subsets of $X$). A group could be specified by a set $G$ and an element of $G\times P(G\times G\times G)$ (the identity together with the graph of the multiplication). In their general theory, Bourbaki specifically restricted the allowable axioms for such structures to those which would be invariant under isomorphisms of the carrier sets. In other words, they specifically had to say “now we forget about all those irrelevant bits and remember only what we want.”
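As a rough illustration of what “invariant under isomorphisms of the carrier sets” means, the following sketch encodes a group in the Bourbaki style described above (a carrier, an identity element, and the graph of the multiplication) and transports it along a bijection. The helper names `transport` and `is_group` are hypothetical, introduced only for this example.

```python
from itertools import product

# A "structure" here is a triple (carrier, identity, graph), where graph is a
# set of triples (a, b, c) meaning a * b = c. Helper names are hypothetical.

def transport(structure, bijection):
    """Relabel every element of the structure along a bijection of carriers."""
    carrier, identity, graph = structure
    return (
        frozenset(bijection[x] for x in carrier),
        bijection[identity],
        frozenset((bijection[a], bijection[b], bijection[c]) for (a, b, c) in graph),
    )


def is_group(structure):
    """Check the group axioms; this property mentions only the structure."""
    carrier, e, graph = structure
    if e not in carrier:
        return False
    mul = {}
    for (a, b, c) in graph:
        if (a, b) in mul or c not in carrier:
            return False  # not single-valued, or lands outside the carrier
        mul[(a, b)] = c
    if set(mul) != set(product(carrier, repeat=2)):
        return False  # not defined on exactly G x G
    identity_ok = all(mul[(e, a)] == a and mul[(a, e)] == a for a in carrier)
    assoc_ok = all(
        mul[(mul[(a, b)], c)] == mul[(a, mul[(b, c)])]
        for a, b, c in product(carrier, repeat=3)
    )
    inverses_ok = all(
        any(mul[(a, b)] == e and mul[(b, a)] == e for b in carrier)
        for a in carrier
    )
    return identity_ok and assoc_ok and inverses_ok


# The integers mod 3 under addition, and a relabelled copy.
z3 = (
    frozenset({0, 1, 2}),
    0,
    frozenset((a, b, (a + b) % 3) for a in range(3) for b in range(3)),
)
letters = transport(z3, {0: "a", 1: "b", 2: "c"})

# An allowable (isomorphism-invariant) axiom agrees on both copies.
invariant_agrees = is_group(z3) == is_group(letters)

# A "material" question about the carrier does not survive relabelling.
zero_in_z3 = 0 in z3[0]
zero_in_letters = 0 in letters[0]
```

The group axioms give the same answer on both copies, while a question such as “is `0` an element of the carrier?” does not, which is the sort of statement Bourbaki’s restriction rules out.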

There are even ways to make this “forgetting” precise. One way, which I believe has mostly been pursued by philosophers following Benacerraf, is to introduce “structure” as a notion living at a higher level of abstraction than sets. That is, “the natural numbers” are a “structure,” not a “set,” which is a “way of talking at once about all sets that might represent the natural numbers.” I don’t know whether this has been made into a formal mathematical theory.

Another way of doing things structurally in a ZF-like theory is to postulate a global choice operator, meaning an operator $\varepsilon$ such that if $P$ is a property where there exists any $x$ with $P(x)$, then $\varepsilon x. P(x)$ is guaranteed to be such an $x$. We can then define, for instance, “the” natural numbers to be $\varepsilon x. P(x)$ where $P(x)$ is essentially “$x$ is a natural numbers object in $Set$”. All we have to do then is show that there exists some NNO, which we can do with either Ernie’s or Johnny’s construction (or many others, of course). How does this solve the problem? Well, since nothing is assumed about $\varepsilon x. P(x)$ or its elements except that $P$ is true of it, we can’t assert any statement like $3\in 17$. However, since $\mathbb{N}$ is still some particular set, such statements still have a definite truth value (in any model); we can just never discover what it is. So this essentially amounts to resolving the dispute between Ernie and Johnny by having a teacher come over and say “One of you is right and one of you is wrong about whether $3\in 17$, but I’m not going to tell you which is which. All I’m going to tell you is that $\mathbb{N}$ is a set that satisfies a certain list of properties which both of you can prove from your definitions, so those properties are all you’re ever allowed to use.”

Bourbaki used such a choice operator, as does Arnold Neumaier’s proposed foundation FMathL. Personally, I find this a rather unsatisfying resolution of the issue. To me it makes much more sense to say that $3\in 17$ is a meaningless statement than that it has a definite truth value we are merely unable to discover. This is a philosophical point rather than a mathematical one, but of course the whole issue is a philosophical one, since both types of theory are adequate as foundations. I believe Arnold has said that his aimed-at computer implementation of FMathL will recognize such statements as undecidable and warn the user about them, but it seems cleaner to me if they are just a syntax error. A final, and mathematical, point is that assuming a choice operator usually (and unsurprisingly) implies the truth of the Axiom of Choice. Even if you personally have no problem with assuming AC, I think it is unsatisfying for the resolution of a philosophical issue (how to do mathematics structurally) to depend on an extremely strong, and a priori unrelated, set-theoretic axiom.

Now, the partisans of structural or categorical set theory contend that it is equivalent in strength to ZFC-like theories (this much is provable) and just as adequate as a foundation for mathematics, but it does away with all the extra irrelevant information which always has to be forgotten about anyway. I believe the first such set theory was Lawvere’s Elementary Theory of the Category of Sets, which dates to about the same time as Benacerraf. (I should also mention Todd’s very nice exposition of ETCS.) In ETCS, instead of sets with a membership structure $\in$ as in ZFC, we work with the category of sets, imposing axioms guaranteeing that it has products, exponentials, powersets, and so on. If we then perform only categorical constructions, the results will automatically be isomorphism-invariant. For instance, the natural numbers are characterized as a natural numbers object in the category of sets. The way in which this sort of theory answers Benacerraf’s objections was put forward eloquently by Colin McLarty in “Numbers can be just what they have to”:

…the structuralist program is already fulfilled or obviated, depending on how you look at it, by categorical set theory… Sets and functions in this theory have only structural properties. (McLarty 1993)

McLarty imagines a student named Brittany who has been taught categorical set theory and is quite confused by the difficulties of Ernie and Johnny. In the following exchange, Ernie is trying to describe the philosophers’ approach in which “structure” lives at a higher level of abstraction than “sets.”

… [Brittany] asked “You mean that you define the natural numbers as a certain specific set?” “Well no,” he answered, “The natural numbers aren’t a set, they are a structure. You see they aren’t uniquely defined.”… She asked if it was the same for the real numbers, or the Euclidean plane, and he said it was. He said all of those are abstract structures, handy ways of talking about sets but not themselves sets and actually not objects at all. “So the advantage of your set theory is that mathematicians never work with your sets!” she said amazed.

Brittany, of course, can work with sets as objects of the category $Set$, which can only be characterized up to isomorphism. Thus, for her, nothing can be said about the elements of some NNO other than that they support the structure of an NNO.

The term “categorical set theory” is common for theories such as ETCS, but it has the disadvantage that in logic and philosophy, “categorical” also has the completely different meaning of “uniquely determined.” This has led some people to use “categorial set theory” instead (note the missing “c”). I prefer structural set theory, which among other things stresses the point that such theories do not necessarily depend on category theory. Of course, ETCS is explicitly about the category of sets; the first axioms say that “sets and functions form a category”. But there are equivalent theories in which the notion of category is not taken for granted, and in which such facts as the associativity of function composition are proven rather than assumed as axioms.

I’ve written down one such theory myself as a proof-of-concept, for now called SEAR. It has the additional advantages of containing “elements” of sets as a primitive concept, and not requiring the development of topos theory in order to prove the separation axiom. In SEAR we are given a collection of sets, together with a collection of elements of each set, and for each pair of sets a collection of (binary) relations between those two sets. One axiom allows us to construct a relation $R$ from $A$ to $B$ by specifying precisely for which pairs of elements $a\in A$ and $b\in B$ it is supposed to hold. For example, in this way we can define the composite of two relations $R\colon A \to B$ and $S\colon B \to C$ such that $(S\circ R)(a,c)$ holds just when there exists a $b\in B$ such that $R(a,b)$ and $S(b,c)$ hold. Likewise we can define a function to be a relation $R\colon A \to B$ such that for any $a\in A$ there is a unique $b\in B$ with $R(a,b)$. And we can prove that sets and functions form a category, in fact that they form a category satisfying the axioms of ETCS. Conversely, in an ETCS-category we define an “element” of a set $A$ to be a function $1\to A$ from the terminal object, and a “relation” to be a subobject $R\hookrightarrow A\times B$, and we can prove that the basic axioms of SEAR are satisfied. (SEAR is by default stronger than ETCS, akin to the distinction between ZF and Z.)
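The definitions of composite and function above can be tried out on finite examples. The following is a minimal sketch, assuming relations are modeled as Python sets of pairs; this finite model and the helper names `compose`, `is_function`, and `identity` are illustrative assumptions, not part of SEAR itself.

```python
# Finite toy model: a relation from A to B is a frozenset of pairs (a, b).
# Helper names are hypothetical and chosen only for this sketch.

def compose(S, R):
    """(S ∘ R)(a, c) holds just when some b has R(a, b) and S(b, c)."""
    return frozenset((a, c) for (a, b1) in R for (b2, c) in S if b1 == b2)


def is_function(R, A, B):
    """R is a function A -> B if each a in A is related to exactly one b in B."""
    within_bounds = all(a in A and b in B for (a, b) in R)
    total_and_unique = all(sum(1 for b in B if (a, b) in R) == 1 for a in A)
    return within_bounds and total_and_unique


def identity(A):
    """The identity relation on A, which is also a function A -> A."""
    return frozenset((a, a) for a in A)


A = frozenset({1, 2})
B = frozenset({"x", "y"})
C = frozenset({"p", "q"})
D = frozenset({"u"})

f = frozenset({(1, "x"), (2, "y")})      # a function A -> B
g = frozenset({("x", "p"), ("y", "p")})  # a function B -> C
h = frozenset({("p", "u"), ("q", "u")})  # a function C -> D
r = frozenset({(1, "x"), (1, "y")})      # a relation A -> B that is not a function

gf = compose(g, f)
associative = compose(h, compose(g, f)) == compose(compose(h, g), f)
unit_laws = compose(f, identity(A)) == f and compose(identity(B), f) == f
```

On this example the composite of two functions is again a function, and associativity and the identity laws come out as computed facts about relations rather than as assumptions, which mirrors how SEAR proves that sets and functions form a category.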

SEAR shows that the distinguishing feature of structural set theories is not that they use category theory, but that they can serve as a foundation for structural mathematics without the need to first forget irrelevant information such as whether $3\in 17$. Of course, one cannot get very far in SEAR without defining the category of sets, but one cannot nowadays expect to get very far in most areas of mathematics without defining the category or categories one is working in. I also believe that SEAR is easier for the non-toposophically-inclined to use as a foundational system than ETCS is, since it includes a “comprehension axiom” as basic, rather than having to construct such an axiom out of topos-theoretic structure as in ETCS.

By contrast to structural set theories, I like to refer to ZFC-like set theories as material set theories. The idea is that in a material set theory, the elements of a set $A$ are “material” and have an existence and identity apart from being an element of $A$. The word “material” was suggested by Steve Awodey; its earliest use that I know of is in his paper “Structure in mathematics and logic: a categorical perspective”:

The definition [of a cartesian product] provides a uniform, structural characterization of a product of two objects in terms of their relations to other objects and morphisms in a category, in contrast to ‘material’ set-theoretic definitions which depend on specific and often irrelevant features of the objects involved, introducing unwanted additional structure. Indeed it is just this material aspect of conventional set theory that gives rise to such pseudo-problems as whether the number $1$ is ‘really’ the set $\{\emptyset\}$, or whether the real numbers are ‘really’ cuts in the rationals. (Awodey 1996)

Now, the sets in a material set theory are admittedly closer to the natural-language meaning of “set”: a set of three sheep can be distinguished from a set of three chairs, and each of the sheep and chairs might also be an element of other sets. However, the claim is that the sets in a structural set theory are closer to the way sets are used in mathematics. These “structural sets” are also very similar to the types in a type theory (regarded as the object-theory, as suggested in the previous post). In fact, Toby has convinced me that it’s difficult to decide exactly where to draw the line between type theory and structural set theory, although there are differences in how the words are most commonly used. It might be better, terminologically speaking, if mathematicians had used a word such as “type” instead of “set” all along. But by now the notion that (for instance) a group is a set equipped with an identity and a multiplication is so firmly entrenched in most mathematicians’ consciousnesses that I think there’s little point trying to change it. Anyway, as I mentioned in the previous post, “set” and “type” and “class” are basically fungible words—especially when used structurally.

So what is the relationship between structural and material set theory? One direction is easy: the sets in a material set theory will always, by forgetting the superfluous data, form a structural set theory. In the other direction, one can structurally “build” a model of material set theory by constructing “hereditary membership trees” and calling them “sets”. See nlab:pure set for a summary; more detailed versions can be found in chapter VI of Sheaves in Geometry and Logic, or in chapter 9 of Johnstone’s “Topos Theory.”

Thus, the two types of theory are “equivalent” in a suitable sense, so at least in principle either one can equally well be used as a foundation. However, one should not get the impression from this that the way in which structural set theory serves as a foundation is by first using material set theory as a foundation and then interpreting this using membership trees. The situation is rather the reverse: the way material set theory serves as a foundation for mathematics is by first using structural set theory as a foundation, and then interpreting this in the category of material sets. The complexity of the construction of membership trees really makes it clear, to me at least, how much superfluous data is carried around when we use material set theory as a foundation for mathematics.