Posted onMar 10

Accountable Privacy in Web3 (3/4)

Now that we've covered the key cryptographic concepts in Web3, let's try and implement them in code! We'll start the article by setting up anElliptic Curve, a key primitive in many modern encryption schemes. Then, we will use the curve to code a simple commitment scheme. Finally, we will implement a very simple Merkle tree since it is a crucial technique used by blockchains to record state.

Please remember thatnone of the implementations below are suitable for production. They are rife with over-simplifications and the code has certainly not been written for efficiency. You will also need to installSageMath, a computer algebra system, to run the following Python code.

Moreover, I would like to recognize the masterful explanations and coding implementations byEmil Bay. They werecrucial in helping me learn this content.This post is meant to assist beginners in their educational journey.

1 - Elliptic Curve

An elliptic curve is a type of mathematical curve defined by an equation of the form:

y^{2} = x^{3} + a x + b y^2 = x^3 + ax + b

Elliptic curves are fundamental to Elliptic Curve Cryptography (ECC), which is used by several blockchain systems for signing and securing transactions. For example, Bitcoin uses thesecp256k1 elliptic curve, defined by the equation:

y^{2} = x^{3} + 7 y^2 = x^3 + 7

Let's set this up as over the samefinite field used by Bitcoin, as follows -

# Using same params as Bitcoinp = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2FF = FiniteField(p)a = 0b = 7E = EllipticCurve(F, [a, b])# Creating fixed generator elementG_x = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798G = B = E.lift_x(G_x)

Essentially, a user's private key is a randomly chosen 256-bit integer. This number must be chosen from the domain set of the above curve. Once chosen, the number is multiplied by a publicly known generator elementG, which is an(x,y) point on the curve. The resulting(x,y) point is the user's public key.

# Numb of valid pts on the curve (order)N = G.order()N_field = FiniteField(N)def keypair (private = None):    if private is None:        private = randint(1, N-1)    public = private * G    return (public, private)

You might be wondering why someone can't just take the private key, divide it by the generator, and retrieve the private key? The answer - it's just not possible! Yay cryptography. The mathematical algorithm(s) required to calculate the private key, givenG and the public key, arecomputationally infeasible on modern computers.

To learn more, you can read about the mathematics involvedhere.

Now that we have extracted a public and private key pair from this curve, let's use it to sign an arbitrary messagem:

import hashlibdef sign (m: bytes, private: int):    m_hash = int(hashlib.sha256(m).hexdigest(), 16) % N    k = randint(1, N-1)    R = k * G     r = int(R.xy()[0]) % N    if r == 0:        return sign(m, private)  # Retry if r == 0 (rare case)    s = (m_hash + private*r) * pow(k, -1, N) % N    if s == 0:        return sign(m, private)  # Retry if s == 0 (rare case)    return (r,s)

Here, we are essentially hashing the message using the collision-resistantSHA256 algorithm, generating a random numberk, and then running these values through an algorithm to produce two encrypted outputs. These outputs are then checked against the message, and the signer's public key, to verify the authenticity of the signature.

def verify (sig, m: bytes, public):    r, s = sig    if not (0 < r < N and 0 < s < N):        return False    m_hash = int(hashlib.sha256(m).hexdigest(), 16) % N    u1 = m_hash * pow(s, -1, N) % N    u2 = r * pow(s, -1, N) % N    R = u1 * G + u2 * public    v = int(R.xy()[0]) % N    return v == r

You can also run the following code block to ensure the above implementation actually works as desired:

public_key, private_key = keypair()message = b"Hello, ECDSA!"signature = sign(message, private_key)print("Public Key:", public_key)print("Signature:", signature)print("Verification:", verify(signature, message, public_key))

2 - Pedersen Commitment

As covered earlier, the Pedersen commitment is a cryptographic scheme that allows a value to be committed to while keeping it hidden, but still allowing it to be verified later. It's widely used in privacy-focused blockchain applications.

Given an elliptic curve with a generator pointG, and a second randomly chosen generatorH, a commitment to a valuex is computed as:

C = x G + r H C = xG + rH

Here,r is also a randomly chosen integer.

First, let's implement thesetup algorithm we covered in the last article. In the setup, we will be creating the second random generator from oursecp256k1 curve:

def setup():    x = randint(1, N-1)    H = x * G    return H

Now, let's implement the commitment on an arbitrary messagem with a blinding factort:

def commit(m: bytes, H):    s = int(hashlib.sha256(m).hexdigest(), 16) % N    t = randint(1, N - 1)    C = s*G + t*H    return (C, t, s) def open(C, t, m: bytes, H):    s = int(hashlib.sha256(m).hexdigest(), 16) % N    Cprime = s*G + t*H    return C == Cprime

Moreover, you may remember that Pedersen commitments are additively homomorphic. Let's see what that looks like:

H = setup()x1 = b"Hello"x2 = b"World"C1, t1, s1 = commit(x1, H)C2, t2, s2 = commit(x2, H)s_combined = (s1 + s2) % Nt_combined = (t1 + t2) % NC_combined = s_combined * G + t_combined * Hprint("First secret value:", open(C1, t1, x1, H))print("First secret value:", open(C2, t2, x2, H))print("Additive homomorphism:", C_combined == C1 + C2)

Great! Now we have the basic primitives (digital signatures, commitments) to implement a very basicnon-interactive zero-knowledge proof (NIZK) scheme.

3 - Merkle Tree (Accumulator)

Fig 1: Source:https://en.wikipedia.org/wiki/Merkle_tree

A Merkle tree is a simple data structure where each leaf node represents the hash of a data value, and each node above it is a hash of two lower leaves. The final node at the top of the tree is called the "root", and is generated by recursively hashing all the lower nodes together in pairs. This root, which is essentially a constant-size hexadecimal , serves as a unique commitment to the entire dataset.

Let's start by implementing the hash and tree generator functions -

import hashlibdef hash_data(data):    return hashlib.sha256(data.encode()).hexdigest()def merkle_tree(raw_data_list):    if not raw_data_list:        raise ValueError("Data list cannot be empty.")    # Hash all the elements before tree construction    level = [hash_data(data) for data in raw_data_list]    tree_levels = [level]    while len(level) > 1:        new_level = []        for i in range(0, len(level), 2):            if i + 1 < len(level):                combined_hash = hash_data(level[i] + level[i + 1])            else:                combined_hash = hash_data(level[i] + level[i])  # Duplicate if odd            new_level.append(combined_hash)        tree_levels.append(new_level) # Store this layer        level = new_level  # Move to the next level    return tree_levels, level

Note that we are not only generating a root, but also creating a nested array to store information on every consecutive layer of the tree. You can see what this looks like for yourself by printing out the following -

# Example usagedata = ["D1", "D2", "D3", "D4"]tree, root = merkle_tree(data)print("Merkle Tree:", tree)print("Merkle Root:", root)

As we had covered in a previous article, data accumulators like Merkle trees are particularly useful because they allow user to efficiently verify that a certain item exists in a data structure, without knowing the entire structure. This is done using something called aMerkle Proof. Let's implement one.

def get_merkle_proof(data_list, target_data):    """Generates a Merkle proof for a given data item."""    tree_levels, merkle_root = merkle_tree(data_list)    hashed_data = hash_data(target_data)    if hashed_data not in tree_levels[0]:  # Check if data is in tree        raise ValueError("Target data not found in Merkle tree.")    index = tree_levels[0].index(hashed_data)  # Find the index of the leaf    proof_path = []    proof_order = []  # Store whether the target is on the left or right    for level in tree_levels[:-1]:  # Exclude root level        sibling_index = index + 1 if index % 2 == 0 else index - 1        if sibling_index < len(level):  # Check if sibling exists            proof_path.append(level[sibling_index])            proof_order.append(index % 2 == 0)  # True if left, False if right        index //= 2  # Move up the tree    return proof_path, proof_order, hashed_data, merkle_root[0]  # Return proof path

This function generates the list of hashes a verifier needs to know in order to verify any one item,target_data in this case. Now this output can be fed to the verifier as follows -

def verify_merkle_proof(leaf_hash, proof_path, proof_order, merkle_root):    """Verifies a Merkle proof by reconstructing the Merkle root."""    computed_hash = leaf_hash    for sibling, is_left in zip(proof_path, proof_order):        if is_left:            computed_hash = hash_data(computed_hash + sibling)  # Target was on the left        else:            computed_hash = hash_data(sibling + computed_hash)  # Target was on the right    return computed_hash == merkle_root

All we are doing here is simply retracing the hash path from the leaf to the root of the tree, using only those elements (proof path) that we need to do so. Knowing whether a certain element is on the 'left' or 'right' of the binary tree is crucial in order to correctly regenerate the tree from any given leaf.

target = "D3"# Generate Merkle proofproof_path, proof_order, leaf_hash, merkle_root = get_merkle_proof(data, target)# Verify the proofis_valid = verify_merkle_proof(leaf_hash, proof_path, proof_order, merkle_root)# Print resultsprint("\n--- Merkle Proof Verification ---")print("Target Leaf Hash:", leaf_hash)print("Merkle Proof Path:", proof_path)print("Proof Order:", proof_order)print("Merkle Root:", merkle_root)print("Merkle Proof Valid:", is_valid)

Hopefully, when you run this entire code block, you should see that the value"D3" indeed exists in this data structure and can be proven simply by knowing 3, not all 7 elements of the original Merkle tree!

Conclusion

By now, you should be familiar with foundational cryptographic primitives—Elliptic curves, Pedersen commitments, and Merkle trees. Together, these concepts are essential for securing blockchain networks and enabling privacy-preserving applications.

In the next article, we will tie these concepts back to the real-world design space for privacy preserving blockchain applications.