
An Introduction to Python Sets


Learn about how to create sets using the brace notation, about set constructors, and about the various commonly used operations with sets.

· Big Data Zone ·

Python supports sets, which are a collection of unique elements and provide operations for computing set unions, intersections, and differences.

Introduction

A set is a collection of unique elements. A common use is to eliminate duplicate elements from a list. Sets also support the usual set operations: union, intersection, and difference. The examples in this article use Python 2 syntax (`print` statements and `xrange`).

Creating a Set

You can create a set with brace notation, with a set comprehension, or with the `set()` constructor.

Brace Construction

Creating a set looks similar to creating a dictionary; you enclose a bunch of items within braces.

``````s = {1, 2, 3, 3, 4}
print s
# prints
set([1, 2, 3, 4])``````

Notice that the set contains only unique elements, even though we put a duplicate into it.

A set need not contain elements of the same type. You can mix and match element types as you like.

``````# Avoid naming the variable `set`; that would shadow the built-in type.
s = {1, 2, 'hello', 4, 'world'}
print s
# prints
set(['world', 2, 4, 'hello', 1])``````
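
One detail worth knowing alongside brace notation: empty braces create a dictionary, not a set, so an empty set has to come from `set()`. The following is a minimal sketch in Python 3 syntax (where `print` is a function); the variable names are illustrative only.

```python
# Empty braces are a dict literal, not a set literal.
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Braces holding at least one plain element (no key: value pairs) create a set.
one_item = {42}
print(type(one_item).__name__)      # set
```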

Set Comprehension

As with lists and dictionaries, you can build a set with a comprehension. The following example creates a set of squares.

``````a = {x*x for x in xrange(10)}
print a
# prints
set([0, 1, 4, 81, 64, 9, 16, 49, 25, 36])``````
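
A set comprehension can also transform and filter its input, with duplicates collapsing automatically. This is a small sketch in Python 3 syntax; the word list is made up for illustration.

```python
words = ['apple', 'Avocado', 'banana', 'Apple', 'blueberry']

# Lower-case each word and keep only those starting with 'a'.
# 'apple' and 'Apple' collapse into a single element.
a_words = {w.lower() for w in words if w.lower().startswith('a')}

print(sorted(a_words))  # ['apple', 'avocado']
```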

Using the `set()` Constructor

Create a set from a list using the `set()` constructor.

``````a = [1, 2, 2, 3]
print set(a)
# prints
set([1, 2, 3])``````

How about creating a set of characters comprising a string? This shortcut will work.

``````print set('abcd')
# prints
set(['a', 'c', 'b', 'd'])``````

Creating a set of unique random numbers:

``````import random

a = [random.randint(0, 10) for x in xrange(10)]
print a
print set(a)
# prints
[10, 2, 3, 3, 6, 6, 4, 9, 5, 0]
set([0, 2, 3, 4, 5, 6, 9, 10])``````
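
Because a set is unordered, removing duplicates with `set()` does not promise to keep the original order of the list. The sketch below (Python 3 syntax, illustrative names) shows two common ways to get a predictable order back: sorting the result, or using a set only to remember what has already been seen.

```python
items = [3, 1, 3, 2, 1]

# set() removes duplicates but makes no promise about order.
unique = set(items)

# sorted() gives a predictable order when the elements are comparable.
unique_sorted = sorted(unique)
print(unique_sorted)  # [1, 2, 3]

# To keep first-seen order, use a set only to track what was already seen.
seen = set()
in_order = []
for item in items:
    if item not in seen:
        seen.add(item)
        in_order.append(item)
print(in_order)  # [3, 1, 2]
```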

Methods of `set`

The following sections explain the most commonly used methods of sets.

Membership Testing

The boolean expressions `elem in a` and `elem not in a` test whether an element is a member of the set `a`.

``````a = {'apple', 'orange', 'banana', 'melon', 'mango'}
print a
print 'banana' in a
print 'papaya' in a
# prints
set(['melon', 'orange', 'mango', 'banana', 'apple'])
True
False``````
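
Membership tests are often used to filter another collection. The following is a minimal sketch in Python 3 syntax; the `allowed` and `basket` names are hypothetical.

```python
allowed = {'apple', 'orange', 'banana'}
basket = ['apple', 'papaya', 'banana', 'kiwi']

# Split the list by whether each item is a member of the set.
accepted = [fruit for fruit in basket if fruit in allowed]
rejected = [fruit for fruit in basket if fruit not in allowed]

print(accepted)  # ['apple', 'banana']
print(rejected)  # ['papaya', 'kiwi']
```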

Set Size

You can obtain the size of a set (the number of elements) using the `len()` function.

``````a = {'apple', 'orange', 'banana', 'melon', 'mango'}
print a
print 'size of a:', len(a)
# prints
set(['melon', 'orange', 'mango', 'banana', 'apple'])
size of a: 5``````

Adding Elements to a Set

Use the `add()` method to add an element to a set. If the element is not already present, it is added; if it is, the set is left unchanged and no error is raised.

``````a = [random.randint(0, 10) for x in xrange(10)]
print 'list =>', a
s = set(a)
print 'set =>', s
s.add(10)
print 'after add =>', s
# prints
list => [3, 4, 7, 2, 8, 0, 4, 1, 0, 4]
set => set([0, 1, 2, 3, 4, 7, 8])
after add => set([0, 1, 2, 3, 4, 7, 8, 10])``````

The `add()` method accepts only a single argument, so adding several elements with it requires a loop. Alternatively, the `update()` method adds every element of an iterable in one call.
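
As a brief sketch of adding several elements at once (Python 3 syntax, illustrative values), the built-in `update()` method takes one or more iterables and adds each of their elements:

```python
s = {1, 2, 3}

# add() takes exactly one element.
s.add(4)

# update() takes one or more iterables and adds every element from each.
s.update([5, 6], (6, 7))

print(sorted(s))  # [1, 2, 3, 4, 5, 6, 7]
```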

You cannot add a `list` to a `set` since the list cannot be hashed.

``````s.add([21, 22])
print s
# raises
TypeError: unhashable type: 'list'``````

However, a `tuple` can be added since it is not mutable and hence hashable.

``````s.add((21, 22))
print s
# prints
set([0, 3, 4, 5, 6, 7, (21, 22), 9, 10])``````
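
The same hashability rule applies to sets themselves: a plain `set` is mutable and cannot be an element of another set, while the built-in `frozenset` is immutable and can. This is a hedged sketch in Python 3 syntax with made-up values; the exact wording of the error message may differ between Python versions.

```python
groups = set()

# A tuple is hashable, so it can be an element.
groups.add((21, 22))

# frozenset is the immutable, hashable counterpart of set.
groups.add(frozenset([1, 2]))

# A mutable set is rejected, just like a list.
try:
    groups.add({3, 4})
    error_message = None
except TypeError as err:
    error_message = str(err)

print(len(groups))  # 2
print(error_message)
```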

Removing Elements from a Set

Remove a single element from a set using `remove()`.

``````a = [random.randint(0, 10) for x in xrange(10)]
print 'list =>', a
s = set(a)
print 'set =>', s
s.remove(10)
print 'after remove =>', s
# prints
list => [6, 6, 7, 6, 7, 5, 10, 3, 8, 3]
set => set([3, 5, 6, 7, 8, 10])
after remove => set([3, 5, 6, 7, 8])``````

A `KeyError` is raised if the element is not in the set. (Running the same code as above a couple of times generates a random sequence without `10` in the set.)

``````# prints
list => [0, 4, 4, 4, 6, 6, 9, 5, 9, 6]
set => set([0, 9, 4, 5, 6])

KeyError                                  Traceback (most recent call last)
in ()
3 s = set(a)
4 print 'set =>', s
----> 5 s.remove(10)
6 print 'after remove =>', s

KeyError: 10``````

Need to remove an element from a set without the pesky `KeyError`? Use `discard()`.

``````print s
s.discard(0)
print s
# prints
set([0, 2, 3, 4, 8, 9])
set([2, 3, 4, 8, 9])``````
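
The difference between the two methods only shows up when the element is absent. A minimal sketch in Python 3 syntax, with illustrative values:

```python
s = {2, 3, 4}

# discard() silently does nothing when the element is absent.
s.discard(99)
print(sorted(s))  # [2, 3, 4]

# remove() raises KeyError in the same situation.
try:
    s.remove(99)
    removed = True
except KeyError:
    removed = False
print(removed)  # False
```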

Remove all elements from a set? Use `clear()`.

``````print s
s.clear()
print s
# prints
set([2, 3, 4, 8, 9])
set([])``````

Set Operations

Let's now learn about set operations supported by a `set`.

Disjoint Sets

A set is disjoint with another set if the two have no elements in common. The `isdisjoint()` method returns `True` if the sets are disjoint and `False` otherwise.

``````print set([0, 3, 6]).isdisjoint(set([9, 10, 5, 7]))
# prints True``````

Another example:

``````print set([0, 1, 2, 3, 4]).isdisjoint(set([8, 1]))
# prints False``````

Checking for Subset and Superset

Check whether all elements of a set are contained in another set using the `issubset()` method. You can also use the boolean form `setA <= setB`.

The form `setA < setB` checks whether `setA` is a proper subset of `setB`, that is, whether `setB` contains every element of `setA` plus at least one more.

Need to check for a superset? Use `issuperset()` or `setA >= setB` or `setA > setB` for a proper superset.

``````a = set([1, 3, 4, 5])
b = set([1, 3, 4, 5])
c = set([1, 3, 4, 5, 6, 7])
print 'a = ', a
print 'b = ', b
print 'c = ', c
print 'a <= b', a <= b
print 'a < b', a < b
print 'issubset', a.issubset(b)
print 'a < c', a < c
# prints
a =  set([1, 3, 4, 5])
b =  set([1, 3, 4, 5])
c =  set([1, 3, 4, 5, 6, 7])
a <= b True
a < b False
issubset True
a < c True``````
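
The superset checks mirror the subset ones. The sketch below reuses the same three sets in Python 3 syntax to show `issuperset()`, `>=`, and `>`:

```python
a = {1, 3, 4, 5}
b = {1, 3, 4, 5}
c = {1, 3, 4, 5, 6, 7}

print(c.issuperset(a))  # True
print(c >= a)           # True
print(c > a)            # True: c has elements that a lacks
print(b >= a)           # True: equal sets are supersets of each other
print(b > a)            # False: b is not a proper superset of a
```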

Set Union

Compute the union of two or more sets using the `union()` method. A new set containing all elements of all sets is returned.

You can also use the pipe operator (`|`) as shown below.

``````a = set([1, 2, 3])
b = set([3, 4, 5, 6])
c = set(list('abcd'))
print a.union(b, c)
print a | b | c
# prints
set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])
set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])``````
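
One practical difference between the two forms: `union()` accepts any iterable as an argument, while the `|` operator requires both operands to be sets. A small sketch in Python 3 syntax (values are illustrative):

```python
a = {1, 2, 3}

# union() accepts any iterables, not just sets, and returns a new set.
combined = a.union([3, 4], (5, 6))
print(sorted(combined))  # [1, 2, 3, 4, 5, 6]
print(sorted(a))         # [1, 2, 3] (a itself is unchanged)

# The | operator is stricter: both operands must be sets.
try:
    a | [3, 4]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False
```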

Set Intersection

How about identifying elements common to two or more sets? Use the `intersection()` method or the `&` operator.

``````print a & b
print a & b & c
# prints
set([3])
set([])``````
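
For comparison, here is the method form with the same three sets, sketched in Python 3 syntax (note that Python 3 prints sets as `{3}` and `set()` rather than `set([3])` and `set([])`):

```python
a = {1, 2, 3}
b = {3, 4, 5, 6}
c = set('abcd')

# intersection() accepts one or more arguments.
common_ab = a.intersection(b)
common_all = a.intersection(b, c)

print(common_ab)             # {3}
print(common_all)            # set()
print(common_ab == (a & b))  # True: method and operator agree
```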

Set Difference

Set difference returns a new set containing the elements of the first set that are not in any of the other sets. Use the `difference()` method or the `-` operator.

``````print a - b
print a - b - set()
# prints
set([1, 2])
set([1, 2])``````
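
Unlike union and intersection, difference depends on operand order. A minimal sketch in Python 3 syntax, using the same `a` and `b` values as above:

```python
a = {1, 2, 3}
b = {3, 4, 5, 6}

# Difference is not symmetric: the left operand decides what is kept.
a_minus_b = a - b
b_minus_a = b - a
print(sorted(a_minus_b))  # [1, 2]
print(sorted(b_minus_a))  # [4, 5, 6]

# difference() accepts several iterables and removes the elements of all of them.
remaining = a.difference(b, [1])
print(sorted(remaining))  # [2]
```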

Iterating Over Sets

There are several ways of iterating over sets; the most common ones are presented here.

• A set is an iterable and hence can be used in a `for` loop for iterating over the elements.
``````a = set([random.randint(0, 10) for _ in xrange(10)])
print a
for x in a:
    print x
# prints
set([0, 2, 5, 7, 8, 9])
0
2
5
7
8
9``````
• The ever-present `enumerate()` function is available, which returns a tuple of loop index and the element. Note that the loop index does not have any correlation to the set; in other words, a set does not have a concept of any ordering, so the index is not an index into the set. It is just a loop counter.
``````for i, v in enumerate(a):
    print i, v
# prints
0 1
1 2
2 3
3 4
4 5
5 6
6 8``````
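
Since a set has no ordering of its own, a loop that needs a predictable order can iterate over `sorted()` output instead. A short sketch in Python 3 syntax with illustrative values:

```python
a = {8, 3, 5, 1}

# sorted() returns a new list, so the index now refers to a position
# in that sorted list rather than to anything inside the set.
pairs = []
for i, v in enumerate(sorted(a)):
    pairs.append((i, v))
    print(i, v)
# prints
# 0 1
# 1 3
# 2 5
# 3 8
```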

Conclusion

And that’s it for now with sets. We learned how to create sets using brace notation, set comprehensions, and the `set()` constructor, and then covered the most commonly used set methods and operations.

Topics:
big data, python, python sets, tutorial


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.