This PEP proposes a new data structure for thecollections module,called “TransformDict” in this PEP. This structure is a mutable mappingwhich transforms the key using a given function when doing a lookup, butretains the original key when reading.
See the rationale athttps://mail.python.org/pipermail/python-dev/2015-May/140003.htmland for an earlier partial review, seehttps://mail.python.org/pipermail/python-dev/2013-October/129937.html .
Numerous specialized versions of this pattern exist. The most commonis a case-insensitive case-preserving dict, i.e. a dict-like containerwhich matches keys in a case-insensitive fashion but retains the originalcasing. It is a very common need in network programming, as manyprotocols feature some arrays of “key / value” properties in theirmessages, where the keys are textual strings whose case is specified tobe ignored on receipt but by either specification or custom is to bepreserved or non-trivially canonicalized when retransmitted.
Another common request is an identity dict, where keys are matchedaccording to their respective id()s instead of normal matching.
Both are instances of a more general pattern, where a given transformationfunction is applied to keys when looking them up: that function beingstr.lower orstr.casefold in the former example and the built-inid function in the latter.
(It could be said that the patternprojects keys from the user-visibleset onto the internal lookup set.)
TransformDict is aMutableMapping implementation: it faithfullyimplements the well-known API of mutable mappings, likedict itselfand other dict-like classes in the standard library. Therefore, this PEPwon’t rehash the semantics of most TransformDict methods.
The transformation function needn’t be bijective, it can be strictlysurjective as in the case-insensitive example (in other words, differentkeys can lookup the same value):
>>>d=TransformDict(str.casefold)>>>d['SomeKey']=5>>>d['somekey']5>>>d['SOMEKEY']5
TransformDict retains the first key used when creating an entry:
>>>d=TransformDict(str.casefold)>>>d['SomeKey']=1>>>d['somekey']=2>>>list(d.items())[('SomeKey', 2)]
The original keys needn’t be hashable, as long as the transformationfunction returns a hashable one:
>>>d=TransformDict(id)>>>l=[None]>>>d[l]=5>>>lindTrue
As shown in the examples above, creating a TransformDict requires passingthe key transformation function as the first argument (much like creatingadefaultdict requires passing the factory function as first argument).
The constructor also takes other optional arguments which can be usedto initialize the TransformDict with certain key-value pairs. Thoseoptional arguments are the same as in thedict anddefaultdictconstructors:
>>>d=TransformDict(str.casefold,[('Foo',1)],Bar=2)>>>sorted(d.items())[('Bar', 2), ('Foo', 1)]
TransformDict also features a lookup method returning the stored keytogether with the corresponding value:
>>>d=TransformDict(str.casefold,{'Foo':1})>>>d.getitem('FOO')('Foo', 1)>>>d.getitem('bar')Traceback (most recent call last): File"<stdin>", line1, in<module>KeyError:'bar'
The method namegetitem() follows the standardpopitem() methodon mutable mappings.
TransformDict has a simple read-only propertytransform_func whichgives back the transformation function.
Most python-dev respondents found retaining the first user-supplied keymore intuitive than retaining the last. Also, it matches the dictobject’s own behaviour when using different but equal keys:
>>>d={}>>>d[1]='hello'>>>d[1.0]='world'>>>d{1: 'world'}
Furthermore, explicitly retaining the last key in a first-key-retainingscheme is still possible using the following approach:
d.pop(key,None)d[key]=value
while the converse (retaining the first key in a last-key-retainingscheme) doesn’t look possible without rewriting part of the container’scode.
Using a function pair isn’t necessary, since the original key is retainedby the container. Moreover, an encoder / decoder pair would require thetransformation to be bijective, which prevents important use caseslike case-insensitive matching.
Dictionary values are not used for lookup, their semantics are totallyirrelevant to the container’s operation. Therefore, there is no point inhaving both an “original” and a “transformed” value: the transformedvalue wouldn’t be used for anything.
It was asked why we would provide the generic TransformDict constructrather than a specialized case-insensitive dict variant. The answeris that it’s nearly as cheap (code-wise and performance-wise) to providethe generic construct, and it can fill more use cases.
Even case-insensitive dicts can actually elicit different transformationfunctions:str.lower,str.casefold or in some casesbytes.lowerwhen working with text encoded in an ASCII-compatible encoding.
Two other constructor patterns were proposed by Serhiy Storchaka:
d=TransformDict(str.casefold)(Foo=1)
classCaseInsensitiveDict(TransformDict):__transform__=str.casefoldd=CaseInsensitiveDict(Foo=1)
While both approaches can be defended, they don’t follow establishedpractices in the standard library, and therefore were rejected.
A patch for the collections module is tracked on the bug tracker athttp://bugs.python.org/issue18986.
Case-insensitive dicts are a popular request:
Identity dicts have been requested too:
Several modules in the standard library use identity lookups for objectmemoization, for examplepickle,json,copy,cProfile,doctest and_threading_local.
.Net has a genericDictionary class where you can specify a customIEqualityComparer:http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
Using it is the recommended way to write case-insensitive dictionaries:http://stackoverflow.com/questions/13230414/case-insensitive-access-for-generic-dictionary
Java has a specializedCaseInsensitiveMap:http://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/map/CaseInsensitiveMap.html
It also has a separateIdentityHashMap:http://docs.oracle.com/javase/6/docs/api/java/util/IdentityHashMap.html
The C++ Standard Template Library features anunordered_mapwith customizable hash and equality functions:http://www.cplusplus.com/reference/unordered_map/unordered_map/
This document has been placed in the public domain.
Source:https://github.com/python/peps/blob/main/peps/pep-0455.rst
Last modified:2025-02-01 08:59:27 GMT