8.7.sets — Unordered collections of unique elements

New in version 2.3.

Deprecated since version 2.6:The built-inset/frozenset types replace this module.

Thesets module provides classes for constructing and manipulatingunordered collections of unique elements. Common uses include membershiptesting, removing duplicates from a sequence, and computing standard mathoperations on sets such as intersection, union, difference, and symmetricdifference.

Like other collections, sets supportxinset,len(set), andforxinset. Being an unordered collection, sets do not record element position ororder of insertion. Accordingly, sets do not support indexing, slicing, orother sequence-like behavior.

Most set applications use theSet class which provides every set methodexcept for__hash__(). For advanced applications requiring a hash method,theImmutableSet class adds a__hash__() method but omits methodswhich alter the contents of the set. BothSet andImmutableSetderive fromBaseSet, an abstract class useful for determining whethersomething is a set:isinstance(obj,BaseSet).

The set classes are implemented using dictionaries. Accordingly, therequirements for set elements are the same as those for dictionary keys; namely,that the element defines both__eq__() and__hash__(). As a result,sets cannot contain mutable elements such as lists or dictionaries. However,they can contain immutable collections such as tuples or instances ofImmutableSet. For convenience in implementing sets of sets, inner setsare automatically converted to immutable form, for example,Set([Set(['dog'])]) is transformed toSet([ImmutableSet(['dog'])]).

classsets.Set([iterable])

Constructs a new emptySet object. If the optionaliterableparameter is supplied, updates the set with elements obtained from iteration.All of the elements initerable should be immutable or be transformable to animmutable using the protocol described in sectionProtocol for automatic conversion to immutable.

classsets.ImmutableSet([iterable])

Constructs a new emptyImmutableSet object. If the optionaliterableparameter is supplied, updates the set with elements obtained from iteration.All of the elements initerable should be immutable or be transformable to animmutable using the protocol described in sectionProtocol for automatic conversion to immutable.

BecauseImmutableSet objects provide a__hash__() method, theycan be used as set elements or as dictionary keys.ImmutableSetobjects do not have methods for adding or removing elements, so all of theelements must be known when the constructor is called.

8.7.1.Set Objects

Instances ofSet andImmutableSet both provide the followingoperations:

Operation

Equivalent

Result

len(s)

number of elements in sets(cardinality)

xins

testx for membership ins

xnotins

testx for non-membership ins

s.issubset(t)

s<=t

test whether every element ins is int

s.issuperset(t)

s>=t

test whether every element int is ins

s.union(t)

s|t

new set with elements from boths andt

s.intersection(t)

s&t

new set with elements common tos andt

s.difference(t)

s-t

new set with elements insbut not int

s.symmetric_difference(t)

s^t

new set with elements in eithers ort but not both

s.copy()

new set with a shallow copy ofs

Note, the non-operator versions ofunion(),intersection(),difference(), andsymmetric_difference() will accept any iterable asan argument. In contrast, their operator based counterparts require theirarguments to be sets. This precludes error-prone constructions likeSet('abc')&'cbs' in favor of the more readableSet('abc').intersection('cbs').

Changed in version 2.3.1:Formerly all arguments were required to be sets.

In addition, bothSet andImmutableSet support set to setcomparisons. Two sets are equal if and only if every element of each set iscontained in the other (each is a subset of the other). A set is less thananother set if and only if the first set is a proper subset of the second set(is a subset, but is not equal). A set is greater than another set if and onlyif the first set is a proper superset of the second set (is a superset, but isnot equal).

The subset and equality comparisons do not generalize to a complete orderingfunction. For example, any two disjoint sets are not equal and are not subsetsof each other, soall of the following returnFalse:a<b,a==b,ora>b. Accordingly, sets do not implement the__cmp__() method.

Since sets only define partial ordering (subset relationships), the output ofthelist.sort() method is undefined for lists of sets.

The following table lists operations available inImmutableSet but notfound inSet:

Operation

Result

hash(s)

returns a hash value fors

The following table lists operations available inSet but not found inImmutableSet:

Operation

Equivalent

Result

s.update(t)

s |=t

return sets with elementsadded fromt

s.intersection_update(t)

s &=t

return sets keeping onlyelements also found int

s.difference_update(t)

s -=t

return sets after removingelements found int

s.symmetric_difference_update(t)

s ^=t

return sets with elementsfroms ort but not both

s.add(x)

add elementx to sets

s.remove(x)

removex from sets; raisesKeyError if not present

s.discard(x)

removesx from sets ifpresent

s.pop()

remove and return an arbitraryelement froms; raisesKeyError if empty

s.clear()

remove all elements from sets

Note, the non-operator versions ofupdate(),intersection_update(),difference_update(), andsymmetric_difference_update() will acceptany iterable as an argument.

Changed in version 2.3.1:Formerly all arguments were required to be sets.

Also note, the module also includes aunion_update() method which is analias forupdate(). The method is included for backwards compatibility.Programmers should prefer theupdate() method because it is supported bythe built-inset() andfrozenset() types.

8.7.2.Example

>>>fromsetsimportSet>>>engineers=Set(['John','Jane','Jack','Janice'])>>>programmers=Set(['Jack','Sam','Susan','Janice'])>>>managers=Set(['Jane','Jack','Susan','Zack'])>>>employees=engineers|programmers|managers# union>>>engineering_management=engineers&managers# intersection>>>fulltime_management=managers-engineers-programmers# difference>>>engineers.add('Marvin')# add element>>>printengineersSet(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])>>>employees.issuperset(engineers)# superset testFalse>>>employees.update(engineers)# update from another set>>>employees.issuperset(engineers)True>>>forgroupin[engineers,programmers,managers,employees]:...group.discard('Susan')# unconditionally remove element...printgroup...Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])Set(['Janice', 'Jack', 'Sam'])Set(['Jane', 'Zack', 'Jack'])Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])

8.7.3.Protocol for automatic conversion to immutable

Sets can only contain immutable elements. For convenience, mutableSetobjects are automatically copied to anImmutableSet before being addedas a set element.

The mechanism is to always add ahashable element, or if it is nothashable, the element is checked to see if it has an__as_immutable__()method which returns an immutable equivalent.

SinceSet objects have a__as_immutable__() method returning aninstance ofImmutableSet, it is possible to construct sets of sets.

A similar mechanism is needed by the__contains__() andremove()methods which need to hash an element to check for membership in a set. Thosemethods check an element for hashability and, if not, check for a__as_temporarily_immutable__() method which returns the element wrapped bya class that provides temporary methods for__hash__(),__eq__(),and__ne__().

The alternate mechanism spares the need to build a separate copy of the originalmutable object.

Set objects implement the__as_temporarily_immutable__() methodwhich returns theSet object wrapped by a new class_TemporarilyImmutableSet.

The two mechanisms for adding hashability are normally invisible to the user;however, a conflict can arise in a multi-threaded environment where one threadis updating a set while another has temporarily wrapped it in_TemporarilyImmutableSet. In other words, sets of mutable sets are notthread-safe.

8.7.4.Comparison to the built-inset types

The built-inset andfrozenset types were designed based onlessons learned from thesets module. The key differences are:

  • Set andImmutableSet were renamed toset andfrozenset.

  • There is no equivalent toBaseSet. Instead, useisinstance(x,(set,frozenset)).

  • The hash algorithm for the built-ins performs significantly better (fewercollisions) for most datasets.

  • The built-in versions have more space efficient pickles.

  • The built-in versions do not have aunion_update() method. Instead, usetheupdate() method which is equivalent.

  • The built-in versions do not have a_repr(sorted=True) method.Instead, use the built-inrepr() andsorted() functions:repr(sorted(s)).

  • The built-in version does not have a protocol for automatic conversion toimmutable. Many found this feature to be confusing and no one in the communityreported having found real uses for it.