
Python Enhancement Proposals

PEP 452 – API for Cryptographic Hash Functions v2.0

Author:
A.M. Kuchling <amk at amk.ca>, Christian Heimes <christian at python.org>
Status:
Final
Type:
Informational
Created:
15-Aug-2013
Post-History:

Replaces:
247


Abstract

There are several different modules available that implement cryptographic hashing algorithms such as MD5 or SHA. This document specifies a standard API for such algorithms, to make it easier to switch between different implementations.

Specification

All hashing modules should present the same interface. Additional methods or variables can be added, but those described in this document should always be present.

Hash function modules define one function:

new([string])           (unkeyed hashes)

new(key, [string], [digestmod])   (keyed hashes)
Create a new hashing object and return it. The first form is for hashes that are unkeyed, such as MD5 or SHA. For keyed hashes such as HMAC, ‘key’ is a required parameter containing a string giving the key to use. In both cases, the optional ‘string’ parameter, if supplied, will be immediately hashed into the object’s starting state, as if obj.update(string) was called.

After creating a hashing object, arbitrary bytes can be fed into the object using its update() method, and the hash value can be obtained at any time by calling the object’s digest() method.
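The equivalence between passing initial data to the constructor and calling update() can be sketched as follows. This example assumes the standard library's hashlib.sha256 constructor as a stand-in for a module-level new() function; the PEP itself does not prescribe any particular module.

```python
import hashlib

# hashlib.sha256 plays the role of new([string]) for an unkeyed hash.
one_shot = hashlib.sha256(b"hello world")

# Feeding the same bytes through update() in pieces reaches the same state.
incremental = hashlib.sha256()
incremental.update(b"hello ")
incremental.update(b"world")

same_digest = one_shot.digest() == incremental.digest()
print(same_digest)  # True
```

Because only the concatenated byte stream matters, callers are free to hash large inputs chunk by chunk.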

Although the parameter is called ‘string’, hashing objects operate on 8-bit data only. Both ‘key’ and ‘string’ must be a bytes-like object (bytes, bytearray…). A hashing object may support one-dimensional, contiguous buffers as an argument, too. Text (unicode) is no longer supported in Python 3.x. Python 2.x implementations may take ASCII-only unicode as an argument, but portable code should not rely on the feature.
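A short sketch of the bytes-only rule on Python 3, again assuming hashlib.sha256 as the example hash (the variable names are illustrative only):

```python
import hashlib

h = hashlib.sha256()

# bytes, bytearray and memoryview all expose a contiguous buffer.
h.update(b"abc")
h.update(bytearray(b"def"))
h.update(memoryview(b"ghi"))

# Text must be encoded explicitly; passing str raises TypeError on Python 3.
try:
    h.update("jkl")
    rejected_text = False
except TypeError:
    rejected_text = True

h.update("jkl".encode("utf-8"))
matches_single_call = h.digest() == hashlib.sha256(b"abcdefghijkl").digest()
```

Encoding at the call site keeps the choice of text encoding visible, which matters because different encodings of the same text hash to different values.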

Arbitrary additional keyword arguments can be added to this function, but if they’re not supplied, sensible default values should be used. For example, ‘rounds’ and ‘digest_size’ keywords could be added for a hash function which supports a variable number of rounds and several different output sizes, and they should default to values believed to be secure.
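As one concrete illustration of an optional keyword with a default, the standard library's hashlib.blake2b accepts a digest_size keyword. BLAKE2 is not mentioned in this document; it is used here only as an assumed example of the pattern.

```python
import hashlib

default = hashlib.blake2b(b"abc")                # no extra keywords supplied
short = hashlib.blake2b(b"abc", digest_size=16)  # caller picks the output size

print(default.digest_size)   # 64
print(short.digest_size)     # 16
print(len(short.digest()))   # 16
```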

Hash function modules define one variable:

digest_size
An integer value; the size of the digest produced by the hashing objects created by this module, measured in bytes. You could also obtain this value by creating a sample object and accessing its ‘digest_size’ attribute, but it can be convenient to have this value available from the module. Hashes with a variable output size will set this variable to None.
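The relationship between the module-level value and the value on an object can be sketched like this. The two SimpleNamespace objects are hypothetical stand-ins for hashing modules, built from hashlib constructors purely for illustration, and expected_digest_length is an invented helper name.

```python
import hashlib
from types import SimpleNamespace

# Hypothetical module-like namespaces; a real hashing module would define
# new() and digest_size at module level.
sha256_module = SimpleNamespace(new=hashlib.sha256, digest_size=32)
blake2b_module = SimpleNamespace(new=hashlib.blake2b, digest_size=None)


def expected_digest_length(module, **kwargs):
    """Return the digest length in bytes for objects made by module.new()."""
    if module.digest_size is not None:
        return module.digest_size
    # Variable-size hash: the size is only known once an object exists.
    return module.new(**kwargs).digest_size


fixed_len = expected_digest_length(sha256_module)
variable_len = expected_digest_length(blake2b_module, digest_size=20)
```

Code that needs the size before creating any object can use the module-level value, and falls back to a sample object only when that value is None.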

Hashing objects require the following attribute:

digest_size
This attribute is identical to the module-level digest_size variable: the size, in bytes, of the digest produced by the hashing object. If the hash has a variable output size, this output size must be chosen when the hashing object is created, and this attribute must contain the selected size. Therefore, None is not a legal value for this attribute.
block_size
An integer value or NotImplemented; the internal block size of the hash algorithm in bytes. The block size is used by the HMAC module to pad the secret key to block_size or to hash the secret key if it is longer than block_size. If no HMAC algorithm is standardized for the hash algorithm, return NotImplemented instead.
name
A text string value; the canonical, lowercase name of the hashing algorithm. The name should be a suitable parameter for hashlib.new.
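To show why an HMAC implementation needs block_size, here is a minimal sketch of the HMAC construction from RFC 2104, built only on the attributes and methods of a hashing object. The function name hmac_sketch is hypothetical, and the result is compared against the standard library's hmac module; this is an illustration, not a replacement for that module.

```python
import hashlib
import hmac


def hmac_sketch(key: bytes, message: bytes, name: str = "sha256") -> bytes:
    """Compute HMAC (RFC 2104) using a hashing object's public interface."""
    inner = hashlib.new(name)
    block_size = inner.block_size
    if block_size is NotImplemented:
        raise ValueError(f"{name} has no block size usable for HMAC")

    # Keys longer than one block are hashed first, then zero-padded to a block.
    if len(key) > block_size:
        key = hashlib.new(name, key).digest()
    key = key.ljust(block_size, b"\x00")

    # Inner hash: (key XOR ipad) followed by the message.
    inner.update(bytes(b ^ 0x36 for b in key))
    inner.update(message)

    # Outer hash: (key XOR opad) followed by the inner digest.
    outer = hashlib.new(name)
    outer.update(bytes(b ^ 0x5C for b in key))
    outer.update(inner.digest())
    return outer.digest()


tag = hmac_sketch(b"secret key", b"message")
reference = hmac.new(b"secret key", b"message", "sha256").digest()
print(hmac.compare_digest(tag, reference))  # True
```

The name attribute is what lets such generic code recreate a fresh object of the same algorithm through hashlib.new.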

Hashing objects require the following methods:

copy()
Return a separate copy of this hashing object. An update to this copy won’t affect the original object.
digest()
Return the hash value of this hashing object as a bytes containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function.
hexdigest()
Return the hash value of this hashing object as a string containing hexadecimal digits. Lowercase letters should be used for the digits ‘a’ through ‘f’. Like the .digest() method, this method mustn’t alter the object.
update(string)
Hash bytes-like ‘string’ into the current state of the hashing object. update() can be called any number of times during a hashing object’s lifetime.
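The four methods can be exercised together as in the sketch below, which assumes hashlib.sha256 as the hashing object and uses copy() to share a common prefix between two messages.

```python
import hashlib

prefix = hashlib.sha256()
prefix.update(b"common header\n")

# copy() forks the state, so the shared prefix is hashed only once.
a = prefix.copy()
a.update(b"body A")
b = prefix.copy()
b.update(b"body B")

# digest() and hexdigest() leave the object untouched, so repeated calls agree.
first = a.digest()
second = a.digest()
hex_form = a.hexdigest()
```

Since reading the digest does not finalize the object, a caller can record intermediate hash values while continuing to feed data.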

Hashing modules can define additional module-level functions or object methods and still be compliant with this specification.

Here’s an example, using a module named ‘MD5’:

    >>> import hashlib
    >>> from Crypto.Hash import MD5
    >>> m = MD5.new()
    >>> isinstance(m, hashlib.CryptoHash)
    True
    >>> m.name
    'md5'
    >>> m.digest_size
    16
    >>> m.block_size
    64
    >>> m.update(b'abc')
    >>> m.digest()
    b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
    >>> m.hexdigest()
    '900150983cd24fb0d6963f7d28e17f72'
    >>> MD5.new(b'abc').digest()
    b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'

Rationale

The digest size is measured in bytes, not bits, even though hash algorithm sizes are usually quoted in bits; MD5 is a 128-bit algorithm and not a 16-byte one, for example. This is because, in the sample code I looked at, the length in bytes is often needed (to seek ahead or behind in a file; to compute the length of an output string) while the length in bits is rarely used. Therefore, the burden will fall on the few people actually needing the size in bits, who will have to multiply digest_size by 8.
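The conversion is a single multiplication. A small sketch, assuming three algorithms from the standard library's hashlib (MD5 would follow the same pattern, with 16 bytes giving 128 bits):

```python
import hashlib

sizes_in_bits = {}
for name in ("sha1", "sha256", "sha512"):
    h = hashlib.new(name)
    # digest_size is in bytes; multiply by 8 for the size usually quoted.
    sizes_in_bits[name] = h.digest_size * 8

print(sizes_in_bits)
```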

It’s been suggested that the update() method would be better named append(). However, that method is really causing the current state of the hashing object to be updated, and update() is already used by the md5 and sha modules included with Python, so it seems simplest to leave the name update() alone.

The order of the constructor’s arguments for keyed hashes was a sticky issue. It wasn’t clear whether the key should come first or second. It’s a required parameter, and the usual convention is to place required parameters first, but that also means that the ‘string’ parameter moves from the first position to the second. It would be possible to get confused and pass a single argument to a keyed hash, thinking that you’re passing an initial string to an unkeyed hash, but it doesn’t seem worth making the interface for keyed hashes more obscure to avoid this potential error.
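The argument order, and the confusion it can cause, can be sketched with the standard library's hmac module, whose new() function takes the key first. The key and message values below are made up for illustration.

```python
import hashlib
import hmac

key = b"shared secret"
message = b"payload"

# Key first, then the optional initial data, then the digest module.
keyed = hmac.new(key, message, hashlib.sha256)

# The pitfall: a lone positional argument is taken as the key, not the data,
# so this authenticates an empty message under key=message.
mistaken = hmac.new(message, digestmod=hashlib.sha256)

differs = keyed.digest() != mistaken.digest()
```

Passing the data by keyword where the implementation allows it is one way for callers to guard against this mix-up.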

Changes from Version 1.0 to Version 2.0

Version 2.0 of API for Cryptographic Hash Functions clarifies some aspects of the API and brings it up-to-date. It also formalizes aspects that were already de facto standards and provided by most implementations.

Version 2.0 introduces the following new attributes:

name
The name property was made mandatory by issue 18532.
block_size
The new version also specifies that the return value NotImplemented prevents HMAC support.

Version 2.0 takes the separation of binary and text data in Python 3.0 into account. The ‘string’ argument to new() and update() as well as the ‘key’ argument must be bytes-like objects. On Python 2.x a hashing object may also support ASCII-only unicode. The actual name of the argument is not changed as it is part of the public API. Code may depend on the fact that the argument is called ‘string’.

Recommended names for common hashing algorithms

algorithm     variant      recommended name
MD5                        md5
RIPEMD-160                 ripemd160
SHA-1                      sha1
SHA-2         SHA-224      sha224
              SHA-256      sha256
              SHA-384      sha384
              SHA-512      sha512
SHA-3         SHA-3-224    sha3_224
              SHA-3-256    sha3_256
              SHA-3-384    sha3_384
              SHA-3-512    sha3_512
WHIRLPOOL                  whirlpool
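Whether a given interpreter can construct each of these names depends on the Python build and the OpenSSL library it links against. A minimal sketch for checking, assuming the standard library's hashlib and its algorithms_available set:

```python
import hashlib

recommended = [
    "md5", "ripemd160", "sha1",
    "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
    "whirlpool",
]

# Split the recommended names by what this interpreter's hashlib reports.
available = [n for n in recommended if n in hashlib.algorithms_available]
missing = [n for n in recommended if n not in hashlib.algorithms_available]

print("available:", available)
print("missing:  ", missing)
```

No expected output is shown because the two lists vary between installations.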

Changes

  • 2001-09-17: Renamed clear() to reset(); added digest_size attribute to objects; added .hexdigest() method.
  • 2001-09-20: Removed reset() method completely.
  • 2001-09-28: Set digest_size to None for variable-size hashes.
  • 2013-08-15: Added block_size and name attributes; clarified that ‘string’ actually refers to bytes-like objects.

Acknowledgements

Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar Shtull-Trauring, and the readers of the python-crypto list for their comments on this PEP.

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0452.rst

Last modified: 2025-02-01 08:59:27 GMT

