
The Paradox That Broke Set Theory
Set theory is often considered the foundational field of mathematics. It relies on a few very basic axioms and deals with structures called “sets” and operations on them. In fact, nearly every area of math relies on set theory in some way to make its definitions. Set theory has gone through multiple iterations. The initial version has since been named “naive set theory” because of the variety of problems it had, including the paradox discussed in this article. In the early twentieth century, mathematicians developed Zermelo–Fraenkel set theory (the basis of ZFC) to resolve these problems, and this version is still widely used today.
In this article, I am going to walk you through the basics of set theory that should be accessible to a non-mathematician. I will then give the famous paradox that seemed to undermine everything and how it was resolved. As it turns out, set theory is weird! Luckily, this paradox is surprisingly simple to understand!

Naive Set Theory
Mathematicians have long sought to form groups of objects and establish relationships between these groups. In fact, this is one of the central goals of math! However, this process was largely informal for most of the history of math. It was only in the 1870s that Georg Cantor began to formalize set theory.
Even then, this early version of set theory was largely based on something called natural language, which is the everyday meaning of words used by humans. This makes it easier to understand but can lead to problems in the details. That is why this version of set theory is called “naive.” Many of the concepts in naive set theory come from a natural everyday language that lacks mathematical rigor.
To talk about set theory, we first need to define what a set is. A set is a well-defined collection of objects. These objects are called the elements of the set. There are a few ways to define a set. One of them is to simply list the items contained in the set. For example, the picture above depicts the set S = {A, B, C, D}. Note that the order of objects in a set does not matter, so it would be equivalent to say S = {B, A, C, D}. Repeating an object also changes nothing, so S = {B, A, C, D, D, D} = {B, A, C, D}.
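Python's built-in set type happens to follow the same two conventions, so it makes a handy sketch of this listing notation. The strings below are just stand-ins for the objects A through D, and the variable names are made up for this example.

```python
# Define a set by listing its elements explicitly.
S = {"A", "B", "C", "D"}

# Order does not matter.
reordered = {"B", "A", "C", "D"}
print(S == reordered)      # True

# Repeats collapse into a single element.
with_repeats = {"B", "A", "C", "D", "D", "D"}
print(S == with_repeats)   # True
print(len(with_repeats))   # 4
```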
The elements contained in S could be anything! They can be numbers, functions, and even other sets. This brings us to the other way to define a set: by a rule. Given some property, we can make a set containing all objects that satisfy that property. For example, I can say that set D is the set of all natural numbers which are divisible by two. So, D = {2, 4, 6, 8, …}. This set is infinite, and we do not need to list every element in it explicitly.
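As a rough sketch, a rule can be written as a function that answers yes or no for any object. A computer cannot hold the whole infinite set D, so the example below only tests membership and lists the first few elements. The function name is_in_D is an assumption made for illustration.

```python
from itertools import count, islice

def is_in_D(x):
    """Rule for D: x is a natural number divisible by two (2, 4, 6, ...)."""
    return isinstance(x, int) and x > 0 and x % 2 == 0

# Walk through 1, 2, 3, ... and keep the first four numbers that satisfy the rule.
first_four = list(islice((n for n in count(1) if is_in_D(n)), 4))

print(first_four)    # [2, 4, 6, 8]
print(is_in_D(10))   # True
print(is_in_D(7))    # False
```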
There are some hidden assumptions within naive set theory. This is the problem with the “loose” way of giving a definition that often comes about by using informal language. I am going to list one of these assumptions which will be crucial later.
- For every rule, there exists a set that contains exactly the objects satisfying that rule.
This assumption seems fairly harmless. The definition we gave for D relied on this statement to make sure that the set could be created. If there are no objects satisfying the stated rule, then we end up with an empty set. We call this assumption unrestricted comprehension. Be warned! Trouble is just around the corner!
Russell’s Paradox

Earlier I said that sets can contain other sets. Let’s think about how we could use rules to define sets with this property. What about the rule “the set of everything that is not an even number”? This set would have numbers like 3 and 7 in it. It would also have all functions and sets in it, because they are not even numbers. Clearly, this set will be infinite. Oddly enough, this set will also contain itself, since it is a set and not an even number. This creates a weird situation of self-reference, but it is not a paradox.
Anytime you have a situation involving self-reference, there is bound to be a sneaky paradox hiding around. In this case, we can come up with a carefully crafted rule designed to create a paradox.
Define the set R as the set of all sets that are not members of themselves.

Nothing in this rule violates the basic ideas of set theory, but it creates an unresolvable situation. Is R a member of itself? If it is, then it violates its own defining rule and must be removed from the set. But then, it agrees with the rule and must be a member of itself. Like the snake pictured at the top of the article, the set R seems to eat itself and cannot be determined. This is the core of Russell’s Paradox.
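One way to get a feel for why R cannot be pinned down is to model a set by its membership rule, which is exactly what naive set theory permits. This is only an analogy in Python, not a proof: a "set" here is a function that returns True for its members, and the names russell, everything, and contains_itself are made up for this sketch.

```python
def russell(s):
    """Membership rule for R: s belongs to R exactly when s is not a member of itself."""
    return not s(s)

def everything(x):
    """The rule that accepts every object, so it accepts itself too."""
    return True

def contains_itself(s):
    """Ask whether the 'set' s (a membership function) is a member of itself."""
    return s(s)

# An ordinary question has an ordinary answer.
print(russell(everything))  # False, because 'everything' is a member of itself

# Is R a member of R?
try:
    contains_itself(russell)
    answered = True
except RecursionError:
    # Evaluating R(R) needs "not R(R)", which needs R(R) again, without end.
    answered = False

print(answered)  # False
```

Python gives up with a RecursionError instead of returning True or False, which mirrors the way the question has no consistent answer.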
This paradox can take many other forms that are better known. For example, I first heard about it as the “barber paradox.”
There is a barber who lives on an island. The barber shaves all those men who live on the island who do not shave themselves, and only those men.
The paradox arises when trying to determine if the barber shaves himself or not. Believe it or not, this paradox even shows up in the Bible!
One of Crete’s own prophets has said it: “Cretans are always liars, evil brutes, lazy gluttons.” (Titus 1:12, NIV)
So is this prophet of Crete lying? This is called the Epimenides Paradox and belongs to a family of paradoxes called the Liar Paradox. While all of these statements take on slightly different forms, they all arise due to problems with self-reference.
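Going back to the barber for a moment, the trouble can be shown by simply trying both possible answers. The short sketch below applies the barber's rule to the barber himself. The variable names are assumptions made for illustration.

```python
# Try both possible answers to "does the barber shave himself?"
consistent_answers = []
for barber_shaves_himself in (True, False):
    # The rule applied to the barber: he shaves himself
    # if and only if he does not shave himself.
    rule_holds = barber_shaves_himself == (not barber_shaves_himself)
    if rule_holds:
        consistent_answers.append(barber_shaves_himself)

print(consistent_answers)  # []
```

Neither answer satisfies the rule, so no such barber can exist.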

Even More Paradoxes
It turns out that there are even more paradoxes in set theory that arise from a similar situation. They weren’t as historically important as Russell’s Paradox described above, but I still think they are interesting to go over. In this section, I am going to give a brief description of another paradox in naive set theory: Cantor’s Paradox. Just a heads up, the math will become a bit more complex here. If you aren’t interested, feel free to skip ahead to the solution.

Cantor’s Paradox
This one takes a few more definitions to understand, but it is an interesting paradox. First, I will define the power set of a set. The power set of S, written P(S), is the set of all subsets of S. A subset of S is just any set that only contains elements also in S. For example, if S = {1,2}, then P(S) = {{}, {1}, {2}, {1,2}}. Notice that the empty set is included, as it is also a subset of S. You also need to know about the size of a set, but this one is easy. The notation |S| just tells us how many elements are in the set S. So |S| = 2, and |P(S)| = 4.
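For finite sets, the power set is easy to build by hand or by computer. Here is a minimal sketch using Python's itertools module. The helper name power_set is made up for this example, and subsets are stored as frozensets because Python only allows immutable objects inside a set.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    elements = list(s)
    return {
        frozenset(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    }

S = {1, 2}
P = power_set(S)

print(len(S), len(P))     # 2 4
print(frozenset() in P)   # True, the empty set is a subset of S
```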
Georg Cantor (the namesake of this publication!) proved a very straightforward theorem which states that
|P(S)| > |S|
This is a very intuitive theorem, especially given the previous example. A power set is always going to be larger than the set itself.
However, consider a set U that contains every set. Every element of the power set P(U) is itself a set, so every element of P(U) is also an element of U. That makes P(U) a subset of U, which means P(U) cannot be larger than U. This contradicts Cantor’s theorem described above.
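The standard argument behind Cantor's theorem can be checked by brute force on a small set. The idea is that no matter how you pair each element of S with a subset of S, the "diagonal" subset (the elements that are not in their own partner) is always left out. The sketch below tries every possible pairing for a three-element set. All names in it are assumptions made for illustration, and a finite check like this is not a proof of the general theorem.

```python
from itertools import combinations, product

def power_set(s):
    """Return all subsets of s as a list of frozensets."""
    elements = sorted(s)
    return [
        frozenset(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    ]

S = [0, 1, 2]
subsets = power_set(S)

diagonal_always_missed = True
# Every function f from S to P(S) is a choice of one subset per element.
for images in product(subsets, repeat=len(S)):
    f = dict(zip(S, images))
    # Diagonal set: the elements of S that are not in their own image.
    diagonal = frozenset(x for x in S if x not in f[x])
    if diagonal in images:
        diagonal_always_missed = False

print(len(subsets) ** len(S))    # 512 functions checked
print(diagonal_always_missed)    # True
```

Since some subset is always missed, the elements of S can never be matched up with all of P(S), which is what |P(S)| > |S| means.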
The Solution
There are multiple ways to solve these paradoxes. We could try to change the language of set theory itself. This approach was taken by Russell in an attempt to resolve the issue. With mathematician Alfred North Whitehead, Russell sought to lay out a complete and consistent framework for math in a series of books called Principia Mathematica. This was a massive undertaking, and it resulted in some very strange notation.

As you might have guessed, this solution was not received very well. For one, it used a very odd method of proof and logic that is much harder to read than the notation we are familiar with. Their original goal was also shown to be unattainable by Gödel’s incompleteness theorems in 1931. These theorems are complex, but the relevant point is this: any consistent framework that has a checkable list of axioms and is powerful enough to describe basic arithmetic will contain statements that it can neither prove nor disprove. In other words, no such framework can be both complete and consistent, so the goal of Principia Mathematica is impossible.
While the books initiated a much greater conversation about mathematical foundations, they did not have the widespread impact that the authors intended.
So, how was set theory saved from these various paradoxes? It turns out, there is a much easier way to work around this that does not require an entire rewriting of mathematics. Instead, let’s look back at some of the axioms that are hidden in the foundation of set theory, one in particular.
Earlier, I talked about one of the axioms involved in naive set theory called unrestricted comprehension. It basically claims that if there is a well-defined rule, then we can create a set of objects that satisfy this rule (even if it is empty). It turns out that this axiom is the source of the problem. There must be a way to distinguish between rules that are fine and rules that create problems like Russell’s Paradox.
ZFC
I mentioned this at the beginning of the article, but the standard way to resolve these paradoxes is to become more serious about our definitions.
To deal with Russell’s Paradox, mathematicians Ernst Zermelo (Z) and Abraham Fraenkel (F) worked to create a new system. They laid out a series of nine axioms that serve as the foundations for their version of set theory. I am not going to go through each of these axioms (check out the links below for that), but I am going to cover the new version of unrestricted comprehension. This new version is intuitively called the restricted comprehension (or specification) axiom.
Instead of just taking any rule and making a set out of it, we now need to restrict ourselves a bit. In ZFC, you are allowed to make a set out of a rule and elements of another set! You must have a starting set, A, then define a new set, B, which only contains elements of A that satisfy this rule.
This is still very flexible and allows us to define all kinds of sets. Using other axioms of ZFC, we can create the set of all integers, real numbers, and complex numbers. We can then use the axiom of restricted comprehension to narrow down these larger sets and make something like the set of all even numbers.
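Restricted comprehension looks a lot like filtering an existing collection. Here is a minimal sketch, where the function name specify and the starting set A are assumptions made for illustration.

```python
def specify(A, rule):
    """Restricted comprehension: keep only the elements of A that satisfy rule."""
    return {x for x in A if rule(x)}

# The starting set must already exist. Here it is {1, 2, ..., 10}.
A = set(range(1, 11))

# B contains the elements of A that are even.
B = specify(A, lambda x: x % 2 == 0)

print(sorted(B))   # [2, 4, 6, 8, 10]
print(B <= A)      # True, B is always a subset of A
```

Because the rule can only select from A, the result can never be bigger than the set you started with, which is what rules out the set of all sets that do not contain themselves.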
ZFC has been very carefully crafted to avoid “weird” sets like the one used to create Russell’s Paradox, but it still keeps all the other sets. That is what makes it so powerful!
Note that I have been calling this version of set theory ZFC, which is not technically accurate. Strictly speaking, the system based on the work of Zermelo and Fraenkel is called ZF, and ZFC is the combination of ZF and the axiom of choice (C). I have not discussed the axiom of choice in this article, but ZFC is by far the most common version of set theory in use today.
ZFC is not the final answer. It also results in some weird conclusions that go against common sense. The craziest one of these is the Banach–Tarski paradox which lets you mathematically take apart a geometrical ball, then reassemble it into two identical balls (see the video linked below for a full explanation). Clearly, we have more work to do!
Going Further
I hope you learned something! Set theory is truly inseparable from modern mathematics. Its history shows the tumultuous nature of math and how different figures grappled with mathematical truths. Russell’s paradox is just one of many paradoxes.
- If you want a philosophical overview of set theory and the problems created by Russell’s Paradox, then I recommend the page by the Stanford Encyclopedia of Philosophy about it. The same site also has a great overview of the Liar Paradox.
- To really dive into set theory and the ideas discussed here, A Book of Set Theory by Charles Pinter is wonderful and approachable (also free!)
- The website Brilliant has a great overview of what ZFC is, along with its advantages and disadvantages. There are some nice visual examples as well.
- I like this page for a brief but informative overview of the differences between naive set theory and axiomatized set theory.
- If you like learning with videos, I found this video to be a very understandable explanation of Russell’s Paradox (it is fairly long)
- I absolutely love this Vsauce video about the Banach-Tarski paradox!





