Movatterモバイル変換

[0]ホーム

Jump to content

Merkle tree

Edit links

From Wikipedia, the free encyclopedia

Type of data structure

An example of a binary hash tree. Hashes 0-0 and 0-1 are the hash values of data blocks L1 and L2, respectively, and hash 0 is the hash of the concatenation of hashes 0-0 and 0-1.

Incryptography andcomputer science, ahash tree orMerkle tree is atree in which every "leaf"node is labelled with thecryptographic hash of a data block, and every node that is not a leaf (called abranch,inner node, orinode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a largedata structure. A hash tree is a generalization of ahash list and ahash chain.

Demonstrating that a leaf node is a part of a given binary hash tree requires computing a number of hashes proportional to thelogarithm of the number of leaf nodes in the tree.^[1] Conversely, in a hash list, the number is proportional to the number of leaf nodes itself. A Merkle tree is therefore an efficient example of acryptographic commitment scheme, in which the root of the tree is seen as a commitment and leaf nodes may be revealed and proven to be part of the original commitment.^[2]

The concept of a hash tree is named afterRalph Merkle, who patented it in 1979.^[3]^[4]

Uses

[edit]

Hash trees can be used to verify any kind of data stored, handled and transferred in and between computers. They can help ensure that datablocks received from other peers in apeer-to-peer network are received undamaged and unaltered, and even to check that the other peers do not lie and send fake blocks.

Hash trees are used in:

hash-based cryptography.
InterPlanetary File System (IPFS),
BitTorrent
ZFS file system^[5] (to counterdata degradation^[6]);
Dat protocol;
Apache Wave protocol;^[7]
Git andMercurial distributed revision control systems (although, strictly speaking, they usedirected acyclic graphs, not trees);
theTahoe-LAFS backup system;
Zeronet;
OpenZFS^[8]
theBitcoin andEthereum peer-to-peer networks;^[9]
theCertificate Transparency framework;
theNix package manager and descendants likeGNU Guix;^[10]
a number ofNoSQL systems such asApache Cassandra,Riak, andDynamo.^[11]

Suggestions have been made to use hash trees intrusted computing systems.^[12]

Overview

[edit]

A hash tree is atree ofhashes in which the leaves (i.e., leaf nodes, sometimes also called "leafs") are hashes of datablocks in, for instance, a file or set of files. Nodes farther up in the tree are the hashes of their respective children. For example, in the above picturehash 0 is the result of hashing theconcatenation ofhash 0-0 andhash 0-1. That is,hash 0 =hash(hash 0-0 +hash 0-1 ) where "+" denotes concatenation.

Most hash tree implementations are binary (two child nodes under each node) but they can just as well use many more child nodes under each node.

Usually, acryptographic hash function such asSHA-2 is used for the hashing. If the hash tree only needs to protect against unintentional damage, unsecuredchecksums such asCRCs can be used.

In the top of a hash tree there is atop hash (orroot hash ormaster hash). Before downloading a file on aP2P network, in most cases the top hash is acquired from a trusted source, for instance a friend or a web site that is known to have good recommendations of files to download. When the top hash is available, the hash tree can be received from any non-trusted source, like any peer in the P2P network. Then, the received hash tree is checked against the trusted top hash, and if the hash tree is damaged or fake, another hash tree from another source will be tried until the program finds one that matches the top hash.^[13]

The main difference from ahash list is that one branch of the hash tree can be downloaded at a time and the integrity of each branch can be checked immediately, even though the whole tree is not available yet. For example, in the picture, the integrity ofdata block L2 can be verified immediately if the tree already containshash 0-0 andhash 1 by hashing the data block and iteratively combining the result withhash 0-0 and thenhash 1 and finally comparing the result with thetop hash. Similarly, the integrity ofdata block L3 can be verified if the tree already hashash 1-1 andhash 0. This can be an advantage since it is efficient to split files up in very small data blocks so that only small blocks have to be re-downloaded if they get damaged. If the hashed file is big, such a hash list or hash chain becomes fairly big. But if it is a tree, one small branch can be downloaded quickly, the integrity of the branch can be checked, and then the downloading of data blocks can start.^{[citation needed]}

Second preimage attack

[edit]

The Merkle hash root does not indicate the tree depth, enabling asecond-preimage attack in which an attacker creates a document other than the original that has the same Merkle hash root. For the example above, an attacker can create a new document containing two data blocks, where the first ishash 0-0 +hash 0-1, and the second ishash 1-0 +hash 1-1.^[14]^[15]

One simple fix is defined inCertificate Transparency: when computing leaf node hashes, a 0x00 byte is prepended to the hash data, while 0x01 is prepended when computing internal node hashes.^[13] Limiting the hash tree size is a prerequisite of someformal security proofs, and helps in making some proofs tighter. Some implementations limit the tree depth using hash tree depth prefixes before hashes, so any extracted hash chain is defined to be valid only if the prefix decreases at each step and is still positive when the leaf is reached.

Tiger tree hash

[edit]

The Tiger tree hash is a widely used form of hash tree. It uses a binary hash tree (two child nodes under each node), usually has a data block size of 1024bytes and uses theTiger hash.^[16]

Tiger tree hashes are used inGnutella,^[17]Gnutella2, andDirect Connect P2P file sharing protocols^[18] and infile sharing applications such asPhex,^[19]BearShare,LimeWire,Shareaza,DC++^[20] andgtk-gnutella.^[21]

References

[edit]

^Becker, Georg (2008-07-18)."Merkle Signature Schemes, Merkle Trees and Their Cryptanalysis"(PDF). Ruhr-Universität Bochum. p. 16. Archived fromthe original(PDF) on 2014-12-22. Retrieved2013-11-20.
^"Handbook of Applied Cryptography".cacr.uwaterloo.ca. Section 13.4.1. Retrieved2024-03-07.
^Merkle, R. C. (1988). "A Digital Signature Based on a Conventional Encryption Function".Advances in Cryptology – CRYPTO '87. Lecture Notes in Computer Science. Vol. 293. pp. 369–378.doi:10.1007/3-540-48184-2_32.ISBN 978-3-540-18796-7.
^US patent 4309569, Ralph Merkle, "Method of providing digital signatures", published Jan 5, 1982, assigned to The Board of Trustees of the Leland Stanford Junior University
^Bonwick, Jeff (2005-12-08)."ZFS End-to-End Data Integrity".blogs.oracle.com.Archived from the original on April 3, 2012. Retrieved2013-09-19.
^Likai Liu."Bitrot Resistance on a Single Drive".likai.org.
^"General Verifiable Federation".Google Wave Protocol. Archived fromthe original on 2018-04-08. Retrieved2017-03-09.
^"Introduction to ZFS — openzfs latest documentation".openzfs.readthedocs.io. Retrieved2025-05-27.
^Koblitz, Neal; Menezes, Alfred J. (January 2016). "Cryptocash, cryptocurrencies, and cryptocontracts".Designs, Codes and Cryptography.78 (1):87–102.CiteSeerX 10.1.1.701.8721.doi:10.1007/s10623-015-0148-5.S2CID 16594958.
^Dolstra, E.The Purely Functional Software Deployment Model. PhD thesis, Faculty of Science, Utrecht, The Netherlands. January 2006. p.21ISBN 90-393-4130-3.
^Adam Marcus."The NoSQL Ecosystem".aosabook.org.When a replica is down for an extended period of time, or the machine storing hinted handoffs for an unavailable replica goes down as well, replicas must synchronize from one another. In this case, Cassandra and Riak implement a Dynamo-inspired process called anti-entropy. In anti-entropy, replicas exchange Merkle trees to identify parts of their replicated key ranges which are out of sync. A Merkle tree is a hierarchical hash verification: if the hash over the entire keyspace is not the same between two replicas, they will exchange hashes of smaller and smaller portions of the replicated keyspace until the out-of-sync keys are identified. This approach reduces unnecessary data transfer between replicas which contain mostly similar data.
^Kilian, J. (1995)."Improved Efficient Arguments"(PDF).Advances in Cryptology — CRYPT0' 95. Lecture Notes in Computer Science. Vol. 963. pp. 311–324.doi:10.1007/3-540-44750-4_25.ISBN 978-3-540-60221-7.
^^a ^bLaurie, B.; Langley, A.; Kasper, E. (June 2013)."Certificate Transparency".IETF RFC6962.doi:10.17487/rfc6962.
^Elena Andreeva; Charles Bouillaguet; Orr Dunkelman; John Kelsey (January 2009). "Herding, Second Preimage and Trojan Message Attacks beyond Merkle-Damgård".Selected Areas in Cryptography. Lecture Notes in Computer Science. Vol. 5867. SAC. pp. 393–414.doi:10.1007/978-3-642-05445-7_25.ISBN 978-3-642-05443-3.
^Elena Andreeva; Charles Bouillaguet; Pierre-Alain Fouque; Jonathan J. Hoch; John Kelsey; Adi Shamir; Sebastien Zimmer (2008). "Second Preimage Attacks on Dithered Hash Functions". In Smart, Nigel (ed.).Advances in Cryptology – EUROCRYPT 2008. Lecture Notes in Computer Science. Vol. 4965. Istanbul, Turkey. pp. 270–288.doi:10.1007/978-3-540-78967-3_16.ISBN 978-3-540-78966-6.S2CID 12844017.{{cite book}}: CS1 maint: location missing publisher (link)
^Chapweske, J.; Mohr, G. (March 4, 2003)."Tree Hash EXchange format (THEX)". Archived from the original on 2009-08-03.
^"tigertree.c File Reference".Gtk-Gnutella. Retrieved23 September 2018.
^"Audit: P2P DirectConnect Application".Symantec. Archived fromthe original on January 29, 2015. Retrieved23 September 2018.
^Arne Babenhauserheide (7 Jan 2007)."Phex 3.0.0 released".Phex. Retrieved23 September 2018.
^"DC++'s feature list".dcplusplus.sourceforge.net.
^"Development".GTK-Gnutella. Retrieved23 September 2018.

External links

[edit]

Listen to this article (5 minutes)

This audio file was created from a revision of this article dated 17 September 2013 (2013-09-17), and does not reflect subsequent edits.

(Audio help ·More spoken articles)

A C implementation of a dynamically re-sizeable binary SHA-256 hash tree (Merkle tree)
Merkle tree implementation in Java
Tiger Tree Hash (TTH) source code in C#, by Gil Schmidt
Tiger Tree Hash (TTH) implementations in C and Java
RHash, an open source command-line tool, which can calculate TTH and magnet links with TTH

v t e Cryptographic hash functions andmessage authentication codes
List Comparison Known attacks
Common functions	MD5 (compromised) SHA-1 (compromised) SHA-2 SHA-3 BLAKE2
SHA-3 finalists	BLAKE Grøstl JH Skein Keccak (winner)
Other functions	BLAKE3 CubeHash ECOH FSB Fugue GOST HAS-160 HAVAL Kupyna LSH Lane MASH-1 MASH-2 MD2 MD4 MD6 MDC-2 N-hash RIPEMD RadioGatún SIMD SM3 SWIFFT Shabal Snefru Streebog Tiger VSH Whirlpool
Password hashing/ key stretching functions	Argon2 Balloon bcrypt Catena crypt LM hash Lyra2 Makwa PBKDF2 scrypt yescrypt
General purpose key derivation functions	HKDF KDF1/KDF2
MAC functions	CBC-MAC DAA GMAC HMAC NMAC OMAC/CMAC PMAC Poly1305 SipHash UMAC VMAC
Authenticated encryption modes	CCM ChaCha20-Poly1305 CWC EAX GCM IAPM OCB
Attacks	Collision attack Preimage attack Birthday attack Brute-force attack Rainbow table Side-channel attack Length extension attack
Design	Avalanche effect Hash collision Merkle–Damgård construction Sponge function HAIFA construction
Standardization	CAESAR Competition CRYPTREC NESSIE NIST hash function competition Password Hashing Competition NSA Suite B CNSA
Utilization	Hash-based cryptography Merkle tree Message authentication Proof of work Salt Pepper

v t e Cryptography
General	History of cryptography Outline of cryptography Classical cipher Cryptographic protocol Authentication protocol Cryptographic primitive Cryptanalysis Cryptocurrency Cryptosystem Cryptographic nonce Cryptovirology Hash function Cryptographic hash function Key derivation function Secure Hash Algorithms Digital signature Kleptography Key (cryptography) Key exchange Key generator Key schedule Key stretching Keygen Machines Ransomware Random number generation Cryptographically secure pseudorandom number generator (CSPRNG) Pseudorandom noise (PRN) Secure channel Insecure channel Subliminal channel Encryption Decryption End-to-end encryption Harvest now, decrypt later Information-theoretic security Plaintext Codetext Ciphertext Shared secret Trapdoor function Trusted timestamping Key-based routing Onion routing Garlic routing Kademlia Mix network
Mathematics	Cryptographic hash function Block cipher Stream cipher Symmetric-key algorithm Authenticated encryption Public-key cryptography Quantum key distribution Quantum cryptography Post-quantum cryptography Message authentication code Random numbers Steganography
Category

v t e Tree data structures
Search trees (dynamic sets, associative arrays)	2–3 2–3–4 AA (a,b) AVL B K-Dimensional B+ B* B^x Binary search Optimal Self-balancing Dancing HTree Interval Order statistic Palindrome (Left-leaning) Red–black Scapegoat Splay T Treap UB Weight-balanced
Heaps	Binary Binomial Brodal d-ary Fibonacci Leftist Pairing Skew binomial Skew van Emde Boas Weak
Tries	Ctrie C-trie (compressed ADT) Hash Radix Suffix Ternary search X-fast Y-fast
Spatial data partitioning trees	Ball BK BSP Cartesian Hilbert R k-d (implicitk-d) M Metric MVP Octree PH Priority R Quad R R+ R* Segment VP X
Other trees	Cover Exponential Fenwick Finger Fractal index Fusion Hash calendar iDistance K-ary Left-child right-sibling Link/cut Log-structured merge Merkle PQ Range SPQR Top