There’s been a bunch of discussion online recently (e.g. SBS, MathOverflow 1, 2, 3, etc.) about set-theoretic foundations for category theory, the role of universes, and so on. So I thought I would take this opportunity to advertise a set theory which isn’t that well-known, but which I find particularly pleasing as a foundation for category theory. I’ll call it strong Feferman set theory, since it is a stronger variant of a theory proposed by Solomon Feferman in his paper “Set-theoretical foundations of category theory.” Feferman called his theory ZFC/S (“ZFC with smallness”), and thus strong Feferman set theory could reasonably be called “ZMC/S,” for reasons explained below. (Feferman called the strong variant “ZFC/S + RIn(S)”.)

I’m going to assume some familiarity with the set-theoretic issues besetting category theory, and also some exposure to the idea of (Grothendieck) universes. To summarize the application of the latter to category theory, the first idea we might come up with is that if we have a universe $U$, we can treat the mathematical objects in $U$ as the “small” ones, so that then the category $Set = Set_U$ of all small sets is, itself, a set (just not a small one). Thus, we can manipulate “large” categories with impunity, since they are still just sets, rather than mysterious and ill-behaved “proper classes.” In particular, one can form functor categories $B^A$ when $A$ and $B$ are large, which most theories of classes don’t allow.

However, what if at some later date we decide there’s something not in $U$ that we want to treat as an ordinary “small” mathematical object? For instance, it might be a category $C$ that’s large relative to $U$. We’d have to hope there’s another, bigger universe $U'$ that contains $C$, so we could switch to working with $U'$ as our universe. To ensure that such switches are always possible, Grothendieck proposed the axiom that “every set is contained in some universe.”

Why might one be dissatisfied with universes? Well, one thing is that even the existence of one universe is unprovable in ZFC (assuming it is consistent), and Grothendieck’s axiom is noticeably stronger than that. However, it is also very weak compared to many large-cardinal axioms, so let’s lay aside that question. I think a more important philosophical objection, though perhaps rarely voiced, is that many mathematicians are accustomed to thinking of $Set$ (for instance) as the category of all sets, and it’s disconcerting to be told that actually, you’re only allowed to include small sets in it.

That philosophical unease actually also has a purely mathematical manifestation. Suppose I want to prove something about (say) all groups, and I do it by considering some arbitrary group as an object of the category $Grp$. In the universe framework, I haven’t actually proven something about all groups; I’ve proven something about all $U$-small groups. However, presumably my proof didn’t use any properties of $U$ other than that it was a universe, and so for any group $G$ whatsoever, I could choose (assuming Grothendieck’s axiom) a universe containing $G$ and repeat the proof using that universe as my $U$. In other words, all of our theorems come with an implicit quantifier “for any universe $U$,…” stuck on the front.

But wait! It’s not just that the statement I’ve proven only applies to $U$-small groups; any quantifiers in the statement that run over “all groups” are also restricted to $U$-small groups. For instance, consider the universal property of the cartesian product of groups $G\times H$, which says that for any group $K$ and any homomorphisms $\phi\colon K\to G$ and $\psi\colon K\to H$, there exists a unique homomorphism $\chi\colon K\to G\times H$ such that $\pi_1\chi = \phi$ and $\pi_2\chi = \psi$. Now suppose that, through clever category theory, I’ve managed to prove the statement that “$G\times H$ is a product of $G$ and $H$ in the category $Grp$”. In the world of ordinary set theory, where $Grp$ is the category of all groups, I’m done! But in the world of universes, where $Grp$ is the category of $U$-small groups, I’ve only shown that the universal property holds for $U$-small groups $K$.

Now, as before, I could presumably repeat my proof for any universe $U$, thereby showing that the universal property actually holds for all groups $K$ (since, by Grothendieck’s axiom, any such $K$ is contained in some universe). However, in a more complicated situation, the construction of $G\times H$ (or some other object having a putative universal property) might, in theory at least, depend on $U$. This isn’t as far-fetched as it might sound; maybe we have some object constructed by an appeal to the adjoint functor theorem, which depends on size considerations for the entire category $Grp$, and thereby on the universe $U$. So if I really want a universal property that is true for all groups $K$, I have to do some careful analysis of my construction and make sure that it will be preserved by “passage to a higher universe.”

(Strong) Feferman set theory remedies this as follows. To ZFC (or your favorite set theory) we add a constant symbol $U$ together with the axiom “$U$ is a universe,” and also an axiom schema stating that for any statement $\varphi$, all of whose parameters are in $U$ but which does not mention $U$ explicitly, we have $\varphi^U \iff \varphi$. Here $\varphi^U$ denotes $\varphi$ relativized to $U$, meaning that all of its quantifiers are constrained to run only over elements of $U$. (In other words, $U$ is an elementary substructure of the class of all sets.) This is called a reflection principle. Intuitively, you can think of it as saying “large objects act the same as small ones.”
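To make the schema a little more concrete, here is one way it could be written out. This is only a sketch: the variable names $x_1,\dots,x_n$ and the clause-by-clause definition of relativization are the standard textbook ones, not notation taken from Feferman’s paper. For each formula $\varphi(x_1,\dots,x_n)$ in the language of set theory (so not containing the constant $U$), the schema contributes the axiom

$$
\forall x_1 \in U \;\cdots\; \forall x_n \in U \;\bigl(\varphi^U(x_1,\dots,x_n) \iff \varphi(x_1,\dots,x_n)\bigr),
$$

where $\varphi^U$ is defined by recursion on the structure of $\varphi$:

$$
\begin{aligned}
(x \in y)^U &:\equiv x \in y, & (x = y)^U &:\equiv x = y,\\
(\neg\varphi)^U &:\equiv \neg\,\varphi^U, & (\varphi \wedge \psi)^U &:\equiv \varphi^U \wedge \psi^U,\\
(\forall x\,\varphi)^U &:\equiv \forall x\,(x \in U \Rightarrow \varphi^U), & (\exists x\,\varphi)^U &:\equiv \exists x\,(x \in U \wedge \varphi^U).
\end{aligned}
$$

Only the quantifier clauses do any work; the atomic formulas and connectives pass through unchanged.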

Some examples should help clarify what this means and how it is used. We define a small set to be an element of $U$, and likewise a small group and so on. Notice that $U$ is now fixed and doesn’t vary. We define $Set$ to be the category of all small sets, and likewise $Grp$. And as in the ordinary setup of universes, these large categories are still sets, so there is no problem doing anything with them that we might like.

Now suppose we know that for some small groups $G$ and $H$, some other small group $G\times H$ is a product in the category $Grp$. That means that the following statement is true:

for all small groups $K$, for any homomorphisms $\phi\colon K\to G$ and $\psi\colon K\to H$, there exists a unique homomorphism $\chi\colon K\to G\times H$ such that $\pi_1\chi = \phi$ and $\pi_2\chi = \psi$.

Now we observe that this statement is the relativization to $U$ of the following statement:

for all groups $K$, for any homomorphisms $\phi\colon K\to G$ and $\psi\colon K\to H$, there exists a unique homomorphism $\chi\colon K\to G\times H$ such that $\pi_1\chi = \phi$ and $\pi_2\chi = \psi$.

That is, to get from the second to the first we replaced the quantifier “for all groups” with a quantifier ranging over small things: “for all small groups.” (Strictly speaking, we should have replaced the quantifiers over homomorphisms as well; that is, we should have written in the first statement “for any small homomorphisms” and “there exists a unique small homomorphism.” However, properties of a universe ensure that any function between small sets is always small, so this is unnecessary.) Moreover, all the parameters of the second statement ($G$, $H$, and $G\times H$: everything that’s not quantified over) are in $U$, but the statement doesn’t mention $U$ explicitly. Thus, the reflection schema applies, and tells us that the second statement follows from the first. In other words, after proving a universal property in the category $Grp$ of small groups, we can automatically conclude that the same universal property holds, for the same product $G\times H$, relative to all groups.
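Schematically, the step just described looks like this. The abbreviations $\mathrm{Group}(K)$ and $\Phi$ are introduced here purely for illustration, and I am regarding the group structures and the projections $\pi_1,\pi_2$ as part of the data of $G\times H$. Write $\Phi$ for the unrelativized universal property:

$$
\Phi \;:\equiv\; \forall K\,\Bigl(\mathrm{Group}(K) \Rightarrow \forall \phi\colon K\to G\;\;\forall \psi\colon K\to H\;\;\exists!\,\chi\colon K\to G\times H\;\;\bigl(\pi_1\chi=\phi \,\wedge\, \pi_2\chi=\psi\bigr)\Bigr).
$$

What we proved inside $Grp$ is $\Phi^U$, the same formula with every quantifier bounded by $U$. Since the parameters lie in $U$ and $\Phi$ does not mention $U$, the schema yields

$$
G,\; H,\; G\times H \in U \quad\Longrightarrow\quad \bigl(\Phi^U \iff \Phi\bigr),
$$

and so $\Phi$ itself holds.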

Okay, well, what about cartesian products of large groups? We can apply the reflection schema again. Assuming our construction of products in $Grp$ was sufficiently general, we’ve actually proven that

for all small groups $G$ and $H$, there exists a small group $G\times H$ and homomorphisms $\pi_1\colon G\times H\to G$, $\pi_2\colon G\times H \to H$, such that for all small groups $K$, for any homomorphisms $\phi\colon K\to G$ and $\psi\colon K\to H$, there exists a unique homomorphism $\chi\colon K\to G\times H$ such that $\pi_1\chi = \phi$ and $\pi_2\chi = \psi$.

But this is the relativization of the statement

for all groups $G$ and $H$, there exists a group $G\times H$ and homomorphisms $\pi_1\colon G\times H\to G$, $\pi_2\colon G\times H \to H$, such that for all groups $K$, for any homomorphisms $\phi\colon K\to G$ and $\psi\colon K\to H$, there exists a unique homomorphism $\chi\colon K\to G\times H$ such that $\pi_1\chi = \phi$ and $\pi_2\chi = \psi$.

and therefore the latter is also true. In this way, we can constrain ourselves to work with small things whenever necessary, so that our large categories are (sets and hence) well-behaved, but then turn around and apply the reflection schema to expand all of our conclusions to arbitrary objects, not just small ones. Or, in other words, we can use large categories with impunity to prove things about the category $Set$ of small sets, but then when we are done, anything we’ve proven about $Set$ (of a suitable logical form) will also be true about the category $SET$ of all sets (which is a “classical” large category, i.e. a proper class). This deals nicely with the philosophical objection to universes: though we may care about the category of all sets, we can restrict ourselves to a category of “small” sets without changing the content of our theorems, because of the reflection schema.

Let me end with some remarks on the strength of strong Feferman set theory. First notice that it can prove Grothendieck’s axiom of universes. For suppose that $x$ is a small set. Then since $U$ is a universe, the statement “there exists a universe containing $x$” is true. By reflection, it is also true that “there exists a small universe containing $x$.” But since $x$ was an arbitrary small set, we have shown that “for all small sets $x$, there exists a small universe containing $x$.” Now by reflection again in the other direction, it must also be true that “for all sets $x$, there exists a universe containing $x$.” In particular, $U$ itself is contained in a universe, which is contained in another universe, and so on. But since every small set is also contained in a small universe, there are plenty of universes below $U$ as well. So we have a plethora of universes, but of all of them, only $U$ is distinguished by the reflection schema.
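The argument for Grothendieck’s axiom can be laid out as a four-step chain. Here $\mathrm{Univ}(V)$ is my own shorthand for “$V$ is a universe,” and I am taking for granted that being a universe means the same thing inside $U$ as outside it, so that the relativized statement really does say “there is a small universe containing $x$.”

$$
\begin{aligned}
&(1)\quad \forall x \in U\;\; \exists V\,\bigl(\mathrm{Univ}(V) \wedge x \in V\bigr) && \text{(take } V = U\text{)}\\
&(2)\quad \forall x \in U\;\; \bigl(\exists V\,(\mathrm{Univ}(V) \wedge x \in V)\bigr)^U && \text{(reflection, with parameter } x \in U\text{)}\\
&(3)\quad \bigl(\forall x\; \exists V\,(\mathrm{Univ}(V) \wedge x \in V)\bigr)^U && \text{(definition of relativization)}\\
&(4)\quad \forall x\;\; \exists V\,\bigl(\mathrm{Univ}(V) \wedge x \in V\bigr) && \text{(reflection, no parameters)}
\end{aligned}
$$

Step (2) uses the schema in the direction $\varphi \Rightarrow \varphi^U$ and step (4) uses it in the direction $\varphi^U \Rightarrow \varphi$, which is why the full biconditional is needed.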

In fact, strong Feferman set theory is equiconsistent with ZMC, which means ZFC together with the assumption that $Ord$, the class of ordinals, behaves like a Mahlo cardinal. This is notably stronger than Grothendieck’s axiom, but still not very strong as large-cardinal principles go, and I think most set theorists would agree that it is almost as sure to be consistent as ZFC itself.