This PEP proposes adding a Set module to the standard Pythonlibrary, and to then make sets a built-in Python type if thatmodule is widely used. After explaining why sets are desirable,and why the common idiom of using dictionaries in their place isinadequate, we describe how we intend built-in sets to work, andthen how the preliminary Set module will behave. The lastsection discusses the mutability (or otherwise) of sets and setelements, and the solution which the Set module will implement.
Sets are a fundamental mathematical structure, and are verycommonly used in algorithm specifications. They are much lessfrequently used in implementations, even when they are the “right”structure. Programmers frequently use lists instead, even whenthe ordering information in lists is irrelevant, and by-valuelookups are frequent. (Most medium-sized C programs contain adepressing number of start-to-end searches through malloc’dvectors to determine whether particular items are present ornot…)
Programmers are often told that they can implement sets asdictionaries with “don’t care” values. Items can be added tothese “sets” by assigning the “don’t care” value to them;membership can be tested usingdict.has_key; and items can bedeleted usingdel. However, the other main operations on sets(union, intersection, and difference) are not directly supportedby this representation, since their meaning is ambiguous fordictionaries containing key/value pairs.
The long-term goal of this PEP is to add a built-in set type toPython. This type will be an unordered collection of uniquevalues, just as a dictionary is an unordered collection ofkey/value pairs.
Iteration and comprehension will be implemented in the obviousways, so that:
forxinS:
will step through the elements of S in arbitrary order, while:
set(x**2forxinS)
will produce a set containing the squares of all elements in S,Membership will be tested usingin andnotin, and basic setoperations will be implemented by a mixture of overloadedoperators:
| | union |
& | intersection |
^ | symmetric difference |
- | asymmetric difference |
==!= | equality and inequality tests |
<<=>=> | subset and superset tests |
and methods:
S.add(x) | Add “x” to the set. |
S.update(s) | Add all elements of sequence “s” to the set. |
S.remove(x) | Remove “x” from the set. If “x” is notpresent, this method raises aLookupErrorexception. |
S.discard(x) | Remove “x” from the set if it is present, ordo nothing if it is not. |
S.pop() | Remove and return an arbitrary element,raising aLookupError if the element isnot present. |
S.clear() | Remove all elements from this set. |
S.copy() | Make a new set. |
s.issuperset() | Check for a superset relationship. |
s.issubset() | Check for a subset relationship. |
and two new built-in conversion functions:
set(x) | Create a set containing the elements of thecollection “x”. |
frozenset(x) | Create an immutable set containing the elementsof the collection “x”. |
Notes:
|&” for intersectionand union. While “+” for union would be intuitive, “*” forintersection is not (very few of the people asked guessed whatit did correctly).+” to add elements to a set, rather than“add”. However, Guido van Rossum pointed out that “+” issymmetric for other built-in types (although “*” is not). Useof “add” will also avoid confusion between that operation andset union.The PEP originally proposed{1,2,3} as the set notation and{-} forthe empty set. Experience with Python 2.3’ssets.py showed thatthe notation was not necessary. Also, there was some risk of makingdictionaries less instantly recognizable.
It was also contemplated that the braced notation would support setcomprehensions; however, Python 2.4 provided generator expressionswhich fully met that need and did so it a more general way.(SeePEP 289 for details on generator expressions).
So, Guido ruled that there would not be a set syntax; however, theissue could be revisited for Python 3000 (seePEP 3000).
To gain experience with sets, a pure python module was introducedin Python 2.3. Based on that implementation, the set and frozensettypes were introduced in Python 2.4. The improvements are:
True).__reduce__ function so that deep copying is automatic.union_update() method became justupdate()._repr method was dropped (the need is met by the newsorted() built-in function).Tim Peters believes that the class’s constructor should take asingle sequence as an argument, and populate the set with thatsequence’s elements. His argument is that in most cases,programmers will be creating sets from pre-existing sequences, sothat this case should be the common one. However, this wouldrequire users to remember an extra set of parentheses wheninitializing a set with known values:
>>>Set((1,2,3,4))# case 1
On the other hand, feedback from a small number of novice Pythonusers (all of whom were very experienced with other languages)indicates that people will find a “parenthesis-free” syntax morenatural:
>>>Set(1,2,3,4)# case 2
Ultimately, we adopted the first strategy in which the initializertakes a single iterable argument.
The most difficult question to resolve in this proposal waswhether sets ought to be able to contain mutable elements. Adictionary’s keys must be immutable in order to support fast,reliable lookup. While it would be easy to require set elementsto be immutable, this would preclude sets of sets (which arewidely used in graph algorithms and other applications).
Earlier drafts ofPEP 218 had only a single set type, but thesets.py implementation in Python 2.3 has two, Set andImmutableSet. For Python 2.4, the new built-in types were namedset andfrozenset which are slightly less cumbersome.
There are two classes implemented in the “sets” module. Instancesof the Set class can be modified by the addition or removal ofelements, and the ImmutableSet class is “frozen”, with anunchangeable collection of elements. Therefore, an ImmutableSetmay be used as a dictionary key or as a set element, but cannot beupdated. Both types of set require that their elements areimmutable, hashable objects. Parallel comments apply to the “set”and “frozenset” built-in types.
This document has been placed in the Public Domain.
Source:https://github.com/python/peps/blob/main/peps/pep-0218.rst
Last modified:2025-02-01 08:55:40 GMT