What Is a Set In Python

Before learning sets in Python, let's first recall the concept of sets in mathematics. Simply put, a set is a collection of distinct, well-defined elements. The relationships between sets can be represented with Venn diagrams.

In Python, a set is an unordered collection of elements. Each element is unique and must be immutable (more precisely, hashable), meaning it cannot be changed after it is created.

However, the set itself is mutable: we can add and remove items. Python also provides many built-in methods for manipulating sets, which we will cover later.

Like sets in mathematics, sets in Python support operations such as union and intersection.

Note: Sets are unordered, so the items appear in an arbitrary order that you should not rely on.

How To Define And Create a Set

There are two ways to create a set in Python.

The first way is to place all the items inside curly braces, separated by commas, like this:

    # create a color set
    color_set = {"red", "green", "red", "blue"}
    print(color_set)

The other way is to use the built-in function set():

    # create a color set
    color_set = set(["red", "green", "red", "blue"])
    print(color_set)

Both methods produce the same set (the order of the items may vary):

    {'red', 'green', 'blue'}

You will notice that the duplicate element "red" was removed after the set was created. This means that every element in the set must be unique.
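Because duplicates are dropped, converting a list to a set is a common way to find its distinct values. The following is a minimal sketch; the variable names are chosen here for illustration only.

```python
# A list with repeated values
colors = ["red", "green", "red", "blue", "green"]

# Converting to a set keeps one copy of each value
unique_colors = set(colors)

print(len(colors))         # 5
print(len(unique_colors))  # 3

# sorted() returns a list, which gives a predictable order for display
print(sorted(unique_colors))  # ['blue', 'green', 'red']
```

Note that the original order of the list is not preserved by the set itself.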

Elements of a set may be of different types, such as integer, float, tuple, or string. But a set cannot contain a mutable element such as a list, set, or dictionary. So we can create a mixed set as follows:

    mixed_set = {1.0, "Black", (1, 2, 3)}
    print(mixed_set)
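To see the restriction on mutable elements in action, the sketch below tries to put a list inside a set and catches the resulting TypeError. The exact wording of the error message differs between Python versions, so it is printed rather than assumed.

```python
# Numbers, strings, and tuples of hashable values are accepted
valid_set = {1.0, "Black", (1, 2, 3)}

# A list is mutable and therefore unhashable, so it is rejected
try:
    invalid_set = {1.0, "Black", [1, 2, 3]}
except TypeError as error:
    error_message = str(error)
    print("Cannot create the set:", error_message)
```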

There is a small trap when creating an empty set. You must use set() instead of {} to create an empty set, because {} creates an empty dictionary in Python. Here is a simple example that demonstrates this:

    # initialize a with {}
    a = {}

    # check data type of a
    # Output: <class 'dict'>
    print(type(a))

    # initialize a with set()
    a = set()

    # check data type of a
    # Output: <class 'set'>
    print(type(a))

How To Access Items From a Set

Because the items in a set are unordered, we can't use a numeric index to access them. But we can traverse the set with a for loop. Let's print all items in color_set:

    color_set = {"red", "green", "blue"}
    for x in color_set:
        print(x)
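The following sketch shows what happens when you try to index a set, and one possible way to iterate in a predictable order by sorting the items first. Using sorted() here is a choice made for this illustration, not something a set requires.

```python
color_set = {"red", "green", "blue"}

# Indexing is not supported because a set has no positions
try:
    first = color_set[0]
except TypeError as error:
    index_error = str(error)
    print("Indexing failed:", index_error)

# sorted() builds a list from the set, so the loop order is predictable
ordered_colors = sorted(color_set)
for color in ordered_colors:
    print(color)
```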

The len() function returns the number of items in a set:

    color_set = {"red", "green", "blue"}
    # Output: 3
    print(len(color_set))

We can also use the in keyword to determine whether the specified item is in the set.

    color_set = {"red", "green", "blue"}
    # Output: True
    print("green" in color_set)

How To Change a Set

As we mentioned at the beginning of the tutorial, the items in a set are immutable. You cannot change an existing item, but you can add new items and remove existing ones.

Add Items

We can use the add() method to add a single item to a set, and the update() method to add multiple items. For example:

    color_set = {"red", "green", "blue"}

    # add a single item to a set
    color_set.add("white")
    # Output (order may vary): {'red', 'green', 'white', 'blue'}
    print(color_set)

    # add multiple items to a set
    color_set.update(["orange", "yellow", "gray", "red"])
    # Output (order may vary): {'red', 'green', 'white', 'orange', 'yellow', 'gray', 'blue'}
    print(color_set)

In the above example, the argument to the update() method is a list. You can also pass tuples, strings, or other sets as arguments to the update() method.
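As a short illustration of those argument types, the sketch below calls update() with a tuple, a set, and a string. Note that a string is treated as a sequence of characters, so each character is added separately; use add() if you want the whole string as one item.

```python
letters = {"a"}

letters.update(("b", "c"))  # tuple: adds 'b' and 'c'
letters.update({"c", "d"})  # set: 'c' is already present, only 'd' is new
letters.update("ef")        # string: adds 'e' and 'f' as separate items

print(sorted(letters))  # ['a', 'b', 'c', 'd', 'e', 'f']

# add() keeps the string whole
letters.add("ef")
print("ef" in letters)  # True
```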

Note: Whether you use the add() method or the update() method, the newly added elements do not necessarily appear at the beginning or end of the original set, because a set has no order. Duplicate elements are discarded automatically.

Remove Items

There are two main ways to remove an item from a set: the discard() method and the remove() method. They behave identically except in one case: if the item does not exist, remove() raises a KeyError, while discard() does nothing. Here are some examples:

    color_set = {"red", "green", "blue"}
    print(color_set)

    # discard an element
    # Output (order may vary): {'green', 'blue'}
    color_set.discard("red")
    print(color_set)

    # remove an element
    # Output: {'green'}
    color_set.remove("blue")
    print(color_set)

    # discard an element not present in color_set
    # Output: {'green'}
    color_set.discard("red")
    print(color_set)

    # remove an element not present in color_set
    # If you uncomment the last line, you will get an error.
    # Output: KeyError: 'red'
    # color_set.remove("red")
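If you prefer remove() but cannot be sure the item exists, one option is to catch the KeyError. This is a minimal sketch of that pattern; the variable missing_item is introduced here only for illustration.

```python
color_set = {"green"}

# discard() silently ignores a missing item
color_set.discard("red")

# remove() raises KeyError for a missing item, which we can handle
try:
    color_set.remove("red")
except KeyError as error:
    missing_item = error.args[0]  # the item that was not found
    print("Item not in set:", missing_item)

print(color_set)  # {'green'}
```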

We can also use the pop() method to remove an item from the set. But since a set is unordered, you can't predict which item will be removed; the choice is arbitrary. The pop() method returns the removed item, and raises a KeyError if the set is empty.

    color_set = {"red", "green", "blue"}

    # The item removed by each pop() is arbitrary;
    # the outputs below show one possible run.
    # Output: green
    print(color_set.pop())
    # Output: {'red', 'blue'}
    print(color_set)
    # Output: red
    print(color_set.pop())
    # Output: {'blue'}
    print(color_set)
    # Output: blue
    print(color_set.pop())
    # Output: set()
    print(color_set)

    # The set is now empty, so this call raises an exception
    print(color_set.pop())

    Traceback (most recent call last):
      File "<pyshell#82>", line 1, in <module>
        color_set.pop()
    KeyError: 'pop from an empty set'
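To avoid the exception on an empty set, you can check the set before each call. The sketch below relies on the fact that an empty set is falsy, and drains the set with a while loop; the list removed_items is a name made up for this example.

```python
color_set = {"red", "green", "blue"}
removed_items = []

# An empty set evaluates to False, so the loop stops before pop() can fail
while color_set:
    removed_items.append(color_set.pop())

# The removal order is arbitrary, so sort before displaying
print(sorted(removed_items))  # ['blue', 'green', 'red']
print(color_set)              # set()
```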

We can use the clear() method to remove all items from a set:

    color_set = {"red", "green", "blue"}
    color_set.clear()
    # Output: set()
    print(color_set)

Finally, we can use the del keyword to destroy the entire set:

    color_set = {"red", "green", "blue"}
    del color_set
    print(color_set)

It will raise an exception:

    Traceback (most recent call last):
      File "demo_set_del.py", line 5, in <module>
        print(color_set) #this will raise an error because the set no longer exists
    NameError: name 'color_set' is not defined

Python Set Operations

Sets can be used to carry out mathematical set operations like union, intersection, difference and symmetric difference. Some operations are performed by operators, some by methods, and some by both.

set.union(set1[, set2, ...])

The union() method or the | operator combines two or more sets (set1, set2, ...) into a single set. If an item is present in more than one set, it appears only once in the result.

    s1 = {"a", "b", "c"}
    s2 = {"f", "d", "a"}
    result = s1.union(s2)
    # or result = s1 | s2
    print(result)

The output is as follows:

{'f', 'b', 'c', 'a', 'd'}
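The method and the operator are not fully interchangeable. As a sketch of the difference: union() accepts several arguments and any iterable, while | requires every operand to be a set. The set names and values below are illustrative only.

```python
s1 = {"a", "b"}
s2 = {"c"}
s3 = {"a", "d"}

# union() can take several sets at once; | can be chained
combined = s1.union(s2, s3)
chained = s1 | s2 | s3
print(combined == chained)  # True

# union() also accepts other iterables, such as a list
from_list = s1.union(["x", "y"])
print(sorted(from_list))  # ['a', 'b', 'x', 'y']

# The | operator only works between sets
try:
    s1 | ["x", "y"]
    operator_rejects_list = False
except TypeError:
    operator_rejects_list = True
print(operator_rejects_list)  # True
```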

[Figure: Venn diagram of the union of s1 and s2]

set.intersection(set1[, set2 ...])

The intersection() method or the & operator computes the intersection of two or more sets. Using the same two sets s1 and s2:

    s1 = {"a", "b", "c"}
    s2 = {"f", "d", "a"}
    result = s1.intersection(s2)
    # or result = s1 & s2
    print(result)

The output is as follows:

    {'a'}

[Figure: Venn diagram of the intersection of s1 and s2]

set.difference(set1[, set2 ...])

The difference() method or the - operator computes the difference between two or more sets. The difference of s1 and s2 (s1 - s2) is the set of elements that are in s1 but not in s2. Similarly, s2 - s1 is the set of elements that are in s2 but not in s1. For example:

    s1 = {"a", "b", "c"}
    s2 = {"f", "d", "a"}
    result = s1.difference(s2)
    # or result = s1 - s2
    print(result)

The output is as follows (order may vary):

    {'b', 'c'}

[Figure: Venn diagram of the difference s1 - s2]

You can specify more than two sets:

    >>> a = {1, 2, 3, 30, 300}
    >>> b = {10, 20, 30, 40}
    >>> c = {100, 200, 300, 400}
    >>> a.difference(b, c)
    {1, 2, 3}
    >>> a - b - c
    {1, 2, 3}

When multiple sets are specified, the operation is performed from left to right. In the example above, a - b is computed first, resulting in {1, 2, 3, 300}. Then c is subtracted from that set, leaving {1, 2, 3}.

set.symmetric_difference(set1)

The symmetric_difference() method or the ^ operator computes the symmetric difference between sets. The symmetric difference of s1 and s2 is the set of elements that are in either s1 or s2, but not in both.

    s1 = {"a", "b", "c"}
    s2 = {"f", "d", "a"}
    result = s1.symmetric_difference(s2)
    # or result = s1 ^ s2
    print(result)

The output is as follows (order may vary):

    {'b', 'c', 'f', 'd'}

Note: The ^ operator can also be chained across more than two sets:

    >>> a = {1, 2, 3, 4, 5}
    >>> b = {10, 2, 3, 4, 50}
    >>> c = {1, 50, 100}
    >>> a ^ b ^ c
    {100, 5, 10}

Unlike the difference() method, the symmetric_difference() method accepts only one argument:

    >>> a = {1, 2, 3, 4, 5}
    >>> b = {10, 2, 3, 4, 50}
    >>> c = {1, 50, 100}
    >>> a.symmetric_difference(b, c)
    Traceback (most recent call last):
      File "<pyshell#11>", line 1, in <module>
        a.symmetric_difference(b, c)
    TypeError: symmetric_difference() takes exactly one argument (2 given)
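When the sets are held in a list rather than in separate variables, chaining ^ by hand is awkward. One possible approach, sketched here with functools.reduce and operator.xor from the standard library, applies ^ from left to right over the whole list. This is an alternative suggested for illustration, not a built-in set feature.

```python
from functools import reduce
from operator import xor

a = {1, 2, 3, 4, 5}
b = {10, 2, 3, 4, 50}
c = {1, 50, 100}

# reduce() computes (a ^ b) ^ c, the same as writing a ^ b ^ c
result = reduce(xor, [a, b, c])
print(sorted(result))  # [5, 10, 100]
```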

Python provides other methods for working with sets, such as isdisjoint(), issubset(), and issuperset(). They are not covered in detail here; refer to the latest Python documentation for the full list.
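As a brief taste of the three methods named above, here is a minimal sketch. The set names and contents are made up for this example.

```python
primary = {"red", "green", "blue"}
warm = {"red"}
print_colors = {"cyan", "magenta"}

# issubset(): every element of warm is also in primary
print(warm.issubset(primary))    # True

# issuperset(): primary contains every element of warm
print(primary.issuperset(warm))  # True

# isdisjoint(): the two sets share no elements
print(primary.isdisjoint(print_colors))  # True

# The <= operator is equivalent to issubset() when both operands are sets
print(warm <= primary)  # True
```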

Python Frozenset

Frozenset is a built-in type that has the characteristics of a set, but unlike a set, a frozenset is immutable: its elements cannot be changed once assigned.

Frozensets can be created using the frozenset() function.

    # initialize A and B
    A = frozenset([1, 2, 3, 4])
    B = frozenset([3, 4, 5, 6])

Frozenset supports all set methods except those that modify the set in place, such as add(), remove(), discard(), pop(), clear(), and update().

    >>> color_set = frozenset(["red", "green", "blue"])
    >>> color_set.add('black')
    Traceback (most recent call last):
      File "<pyshell#127>", line 1, in <module>
        color_set.add('black')
    AttributeError: 'frozenset' object has no attribute 'add'
    >>> color_set.pop()
    Traceback (most recent call last):
      File "<pyshell#129>", line 1, in <module>
        color_set.pop()
    AttributeError: 'frozenset' object has no attribute 'pop'
    >>> color_set.clear()
    Traceback (most recent call last):
      File "<pyshell#131>", line 1, in <module>
        color_set.clear()
    AttributeError: 'frozenset' object has no attribute 'clear'
    >>> color_set
    frozenset({'red', 'green', 'blue'})
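The operations that do not modify the set still work on frozensets. The sketch below applies union, intersection, and difference to two frozensets; each operation returns a new frozenset and leaves the operands unchanged.

```python
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

union_result = A | B
print(sorted(union_result))         # [1, 2, 3, 4, 5, 6]
print(type(union_result).__name__)  # frozenset

print(sorted(A & B))  # [3, 4]
print(sorted(A - B))  # [1, 2]

# Mixing a frozenset with a regular set returns the type of the left operand
mixed = A | {7}
print(type(mixed).__name__)  # frozenset
```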

As we know, the elements in sets must be immutable. We can't define a set whose elements are also sets.

    s1 = set(['red'])
    s2 = set(['green'])
    s3 = set(['blue'])
    s = {s1, s2, s3}

    Traceback (most recent call last):
      File "<pyshell#38>", line 1, in <module>
        s = {s1, s2, s3}
    TypeError: unhashable type: 'set'

But if s1, s2 and s3 are frozensets, no exception is raised, because a frozenset is immutable and therefore hashable.

    s1 = frozenset(['red'])
    s2 = frozenset(['green'])
    s3 = frozenset(['blue'])
    s = {s1, s2, s3}
    # Output (order may vary): {frozenset({'red'}), frozenset({'green'}), frozenset({'blue'})}
    print(s)

Because a frozenset is immutable, it can also be used as a dictionary key.

    s1 = frozenset(['red'])
    s2 = frozenset(['green'])
    d = {s1: 'bgcolor', s2: 'forecolor'}
    # Output: {frozenset({'red'}): 'bgcolor', frozenset({'green'}): 'forecolor'}
    print(d)
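One situation where a frozenset key can be handy is when the key is an unordered pair. In the sketch below, the place names and distances are invented purely for illustration; the point is that two frozensets with the same elements are equal, so the lookup works regardless of the order in which the pair is written.

```python
# Hypothetical distances between pairs of places (made-up data)
distances = {
    frozenset(["A", "B"]): 5,
    frozenset(["B", "C"]): 7,
}

# The same pair written in the opposite order finds the same entry
print(distances[frozenset(["B", "A"])])  # 5
print(distances[frozenset(["C", "B"])])  # 7
```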

This tutorial summarizes most of the usage of sets in Python. If you are not yet familiar with it, you can read the tutorial several times and practice a few more examples.