There are several different modules available that implement cryptographic hashing algorithms such as MD5 or SHA. This document specifies a standard API for such algorithms, to make it easier to switch between different implementations.
All hashing modules should present the same interface. Additional methods or variables can be added, but those described in this document should always be present.
Hash function modules define one function:
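new([string]) (unkeyed hashes)
new(key, [string], [digestmod]) (keyed hashes)
Create a new hashing object and return it. The first form is for unkeyed hashes such as MD5 or SHA; the second is for keyed hashes such as HMAC, where ‘key’ is a required parameter. In both forms the optional ‘string’ parameter, if supplied, is immediately hashed into the object’s starting state, as if obj.update(string) was called.
After creating a hashing object, arbitrary bytes can be fed into the object using its update() method, and the hash value can be obtained at any time by calling the object’s digest() method.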
Although the parameter is called ‘string’, hashing objects operate on 8-bit data only. Both ‘key’ and ‘string’ must be bytes-like objects (bytes, bytearray…). A hashing object may support one-dimensional, contiguous buffers as arguments, too. Text (unicode) is no longer supported in Python 3.x. Python 2.x implementations may take ASCII-only unicode as an argument, but portable code should not rely on the feature.
Arbitrary additional keyword arguments can be added to this function, but if they’re not supplied, sensible default values should be used. For example, ‘rounds’ and ‘digest_size’ keywords could be added for a hash function which supports a variable number of rounds and several different output sizes, and they should default to values believed to be secure.
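As a minimal sketch of the constructor behaviour described above, the following uses Python's standard hashlib module, whose objects follow this interface. The choice of SHA-256 and BLAKE2b is only illustrative. It shows that an initial argument is equivalent to a later update() call, that bytes-like objects are accepted while text is rejected on Python 3, and that an optional keyword such as digest_size falls back to a default.

```python
import hashlib

# Passing initial data to the constructor is equivalent to calling update().
h1 = hashlib.sha256(b"abc")
h2 = hashlib.sha256()
h2.update(b"abc")
same_digest = h1.digest() == h2.digest()

# Any bytes-like object is accepted, for example a bytearray or a memoryview.
h3 = hashlib.sha256(bytearray(b"abc"))
h4 = hashlib.sha256(memoryview(b"abc"))
bytes_like_ok = h3.digest() == h4.digest() == h1.digest()

# Text must be encoded first; on Python 3 passing str raises TypeError.
try:
    hashlib.sha256("abc")
    text_rejected = False
except TypeError:
    text_rejected = True

# BLAKE2b accepts an optional digest_size keyword and uses a default otherwise.
default_size = hashlib.blake2b().digest_size
custom_size = hashlib.blake2b(b"abc", digest_size=16).digest_size
```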
Hash function modules define one variable:
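digest_size
An integer; the size in bytes of the digest produced by the hashing objects this module creates. Hashes with a variable output size set this variable to None.
Hashing objects require the following attributes:
digest_size
The size in bytes of the digest produced by the hashing object. If the hash has a variable output size, the size is chosen when the hashing object is created and this attribute contains the selected size. Therefore, None is not a legal value for this attribute.
block_size
An integer or NotImplemented; the internal block size of the hash algorithm in bytes. The block size is used by the HMAC module to pad the secret key to block_size or to hash the secret key if it is longer than block_size. If no HMAC algorithm is standardized for the hash algorithm, return NotImplemented instead.
name
A text string; the canonical, lowercase name of the hashing algorithm. The name should be a suitable parameter for hashlib.new.
Hashing objects require the following methods: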
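copy()
Return a separate copy of this hashing object. An update to the copy does not affect the original object.
digest()
Return the hash value of this hashing object as bytes. The object is not altered by this method, so it can still be updated afterwards.
hexdigest()
Return the hash value of this hashing object as a string of hexadecimal digits. Like the .digest() method, this method mustn’t alter the object.
update(string)
Hash the bytes-like ‘string’ into the current state of the hashing object. update() can be called any number of times during a hashing object’s lifetime.
Hashing modules can define additional module-level functions or object methods and still be compliant with this specification.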
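To make the required attributes and methods concrete, here is a minimal sketch of an object that follows the interface. It wraps zlib.crc32, which is a checksum and not a cryptographic hash, purely to keep the example short. The class name Crc32Hash and the name string 'crc32' are hypothetical. Because no HMAC construction is standardized for CRC-32, block_size is set to NotImplemented.

```python
import zlib


class Crc32Hash:
    """Toy object exposing the hashing-object interface (not secure)."""

    digest_size = 4                 # CRC-32 produces 4 bytes
    block_size = NotImplemented     # no standardized HMAC for this algorithm
    name = "crc32"                  # hypothetical canonical name

    def __init__(self, string=b""):
        self._crc = 0
        if string:
            self.update(string)

    def update(self, string):
        # Fold more bytes-like data into the running checksum.
        self._crc = zlib.crc32(string, self._crc)

    def digest(self):
        # Depends only on the current state, so the object is not altered.
        return self._crc.to_bytes(self.digest_size, "big")

    def hexdigest(self):
        return self.digest().hex()

    def copy(self):
        other = Crc32Hash()
        other._crc = self._crc
        return other


def new(string=b""):
    """Module-level constructor, mirroring new([string])."""
    return Crc32Hash(string)
```

With this sketch, new(b'ab') followed by update(b'c') gives the same digest as new(b'abc'), and a copy taken before the update keeps the earlier state.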
Here’s an example, using a module named ‘MD5’:
    >>> import hashlib
    >>> from Crypto.Hash import MD5
    >>> m = MD5.new()
    >>> isinstance(m, hashlib.CryptoHash)
    True
    >>> m.name
    'md5'
    >>> m.digest_size
    16
    >>> m.block_size
    64
    >>> m.update(b'abc')
    >>> m.digest()
    b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
    >>> m.hexdigest()
    '900150983cd24fb0d6963f7d28e17f72'
    >>> MD5.new(b'abc').digest()
    b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
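The same methods can be exercised with the standard hashlib module, which is assumed here in place of the third-party module used in the example above. This sketch shows copy() used to hash two messages that share a prefix, and that digest() leaves the object usable for further updates.

```python
import hashlib

prefix = hashlib.sha256(b"common prefix, ")

# copy() yields independent objects that start from the same state.
a = prefix.copy()
b = prefix.copy()
a.update(b"first message")
b.update(b"second message")

# digest() does not alter the object, so updating can continue afterwards.
before = a.digest()
again = a.digest()
a.update(b" and more")
after = a.digest()

# hexdigest() is the hexadecimal form of digest().
hex_matches = a.hexdigest() == a.digest().hex()
```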
The digest size is measured in bytes, not bits, even though hash algorithm sizes are usually quoted in bits; MD5 is a 128-bit algorithm and not a 16-byte one, for example. This is because, in the sample code I looked at, the length in bytes is often needed (to seek ahead or behind in a file; to compute the length of an output string) while the length in bits is rarely used. Therefore, the burden will fall on the few people actually needing the size in bits, who will have to multiply digest_size by 8.
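For instance, converting between the two units is a single multiplication. The snippet below uses hashlib as an assumed stand-in for any compliant module.

```python
import hashlib

md5 = hashlib.md5()
size_in_bytes = md5.digest_size      # 16
size_in_bits = md5.digest_size * 8   # 128

# The byte length is what is needed to size output buffers.
assert len(md5.digest()) == size_in_bytes
```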
It’s been suggested that the update() method would be better named append(). However, that method is really causing the current state of the hashing object to be updated, and update() is already used by the md5 and sha modules included with Python, so it seems simplest to leave the name update() alone.
The order of the constructor’s arguments for keyed hashes was a sticky issue. It wasn’t clear whether the key should come first or second. It’s a required parameter, and the usual convention is to place required parameters first, but that also means that the ‘string’ parameter moves from the first position to the second. It would be possible to get confused and pass a single argument to a keyed hash, thinking that you’re passing an initial string to an unkeyed hash, but it doesn’t seem worth making the interface for keyed hashes more obscure to avoid this potential error.
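The standard hmac module follows the keyed form, with the key first. As a short sketch (the key and message values are made up for illustration), passing the arguments by keyword sidesteps the confusion described above. Note that hmac names the data parameter 'msg' rather than 'string'.

```python
import hashlib
import hmac

key = b"secret key"            # illustrative value
message = b"attack at dawn"    # illustrative value

# Key comes first, then the optional initial data, then the digest to use.
mac1 = hmac.new(key, message, hashlib.sha256)

# Same result when the data is supplied later through update().
mac2 = hmac.new(key, digestmod=hashlib.sha256)
mac2.update(message)

# Keyword arguments make the roles explicit.
mac3 = hmac.new(key=key, msg=message, digestmod="sha256")
```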
Version 2.0 of the API for Cryptographic Hash Functions clarifies some aspects of the API and brings it up to date. It also formalizes aspects that were already de facto standards and provided by most implementations.
Version 2.0 introduces the following new attributes:
name
block_size
A block_size of NotImplemented prevents HMAC support.
Version 2.0 takes the separation of binary and text data in Python 3.0 into account. The ‘string’ argument to new() and update() as well as the ‘key’ argument must be bytes-like objects. On Python 2.x a hashing object may also support ASCII-only unicode. The actual name of the argument is not changed as it is part of the public API. Code may depend on the fact that the argument is called ‘string’.
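To illustrate why block_size matters for HMAC, the following sketch builds HMAC by hand following RFC 2104, so that its output can be compared with the standard hmac module. The helper name manual_hmac is hypothetical, and the code assumes hashlib objects as the underlying hash. A hash whose block_size is NotImplemented is rejected up front.

```python
import hashlib
import hmac


def manual_hmac(key, message, digest_name="sha256"):
    """Hypothetical helper: HMAC per RFC 2104 on top of a hashing object."""
    block_size = hashlib.new(digest_name).block_size
    if block_size is NotImplemented:
        raise ValueError("hash has no block size; HMAC is not supported")

    # Keys longer than one block are hashed first, then zero-padded to a block.
    if len(key) > block_size:
        key = hashlib.new(digest_name, key).digest()
    key = key.ljust(block_size, b"\x00")

    inner_key = bytes(byte ^ 0x36 for byte in key)
    outer_key = bytes(byte ^ 0x5C for byte in key)

    inner = hashlib.new(digest_name, inner_key + message).digest()
    return hashlib.new(digest_name, outer_key + inner).digest()
```

The result should match hmac.new(key, message, digest_name).digest() for keys both shorter and longer than the block size.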
| algorithm | variant | recommended name |
|---|---|---|
| MD5 | | md5 |
| RIPEMD-160 | | ripemd160 |
| SHA-1 | | sha1 |
| SHA-2 | SHA-224 | sha224 |
| | SHA-256 | sha256 |
| | SHA-384 | sha384 |
| | SHA-512 | sha512 |
| SHA-3 | SHA-3-224 | sha3_224 |
| | SHA-3-256 | sha3_256 |
| | SHA-3-384 | sha3_384 |
| | SHA-3-512 | sha3_512 |
| WHIRLPOOL | | whirlpool |
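The recommended names are intended to be usable with hashlib.new. As a rough check, assuming the standard hashlib module, the loop below tries each name from the table and records the ones the local build supports. Some names, such as ripemd160 and whirlpool, depend on the underlying OpenSSL build and may be missing.

```python
import hashlib

RECOMMENDED_NAMES = [
    "md5", "ripemd160", "sha1",
    "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
    "whirlpool",
]

supported = {}
for name in RECOMMENDED_NAMES:
    try:
        obj = hashlib.new(name)
    except ValueError:
        # Not provided by this Python/OpenSSL build.
        continue
    # Record the digest size in bytes for each available algorithm.
    supported[name] = obj.digest_size
```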
Renamed clear() to reset(); added digest_size attribute to objects; added .hexdigest() method.
Removed reset() method completely.
Set digest_size to None for variable-size hashes.
Added block_size and name attributes; clarified that ‘string’ actually refers to bytes-like objects.
Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar Shtull-Trauring, and the readers of the python-crypto list for their comments on this PEP.
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0452.rst
Last modified: 2025-02-01 08:59:27 GMT