Merkle Trees

The Merkle Tree isn’t a new concept in computer science; it has been around for decades and originates from the field of cryptography.

Simply put, a Merkle Tree is a tree data structure in which the data is stored in the leaf nodes and the non-leaf nodes store hashes, with each non-leaf node holding the combined hash of the two nodes below it.

Mathematically, it can be expressed as
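
One way to write this rule, as a sketch: here $H$ stands for a cryptographic hash function and $\|$ for concatenation. Both are notation assumed for illustration, and the way the two child values are combined before hashing can vary between implementations.

$$
\text{value}(n) =
\begin{cases}
H(\text{data}(c)) & \text{if } n \text{ sits directly above a single leaf } c \\
H(\text{value}(l) \,\|\, \text{value}(r)) & \text{if } n \text{ has a left child } l \text{ and a right child } r
\end{cases}
$$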

Computing the value of each node in a Merkle Tree

For example, given a list of letters, create a Merkle Tree from it.

The bottommost layer of the tree contains all the letters as leaf nodes.

Lowest layer of the tree would contain the data in each node

The layer above contains their hash values.

The layer above the leaf nodes holds the hash values of the leaf node data

The nodes in the layers above the second layer contain the hash of their child nodes. Generally, we take two nodes from the layer below and combine them to form a parent node. We can take more than two nodes as well, but a binary Merkle Tree is the simplest form, and increasing the degree of the nodes only increases the computation and the complexity of the algorithms.

If a layer has an even number of nodes, we pair up consecutive nodes to form the parent layer. If it has an odd number of nodes, we pair up consecutive nodes until one is left, and then carry that remaining node up by copying its hash to the parent layer.

Layer 3 holds the hash of each pair of consecutive nodes in layer 2; when a layer has an odd number of nodes, the last node is repeated
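
The pairing rule can be sketched as a single step that turns one layer of hashes into its parent layer. This is a minimal illustration, assuming SHA-256 from Node's built-in crypto module and plain string concatenation of the two child hashes; the names sha256 and buildParentLayer are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-256 hex digest of a string.
function sha256(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Build the parent layer from one layer of hashes.
// Pairs are concatenated and hashed; a leftover node is copied up unchanged.
function buildParentLayer(layer) {
  const parent = [];
  for (let i = 0; i < layer.length; i += 2) {
    if (i + 1 < layer.length) {
      parent.push(sha256(layer[i] + layer[i + 1]));
    } else {
      parent.push(layer[i]); // odd count: repeat the last hash as-is
    }
  }
  return parent;
}

const layer2 = ['A', 'B', 'C', 'D', 'E'].map(sha256);
const layer3 = buildParentLayer(layer2);
console.log(layer3.length); // 3
```

With five hashes in layer 2, layer 3 ends up with three entries, the last of which is simply the hash of E carried up.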

Similarly, the fourth layer is formed using the values of the third layer.

The fourth layer is formed by hashing the values of each pair of consecutive nodes in layer 3

The final layer, the root of the Merkle Tree, is formed by hashing the last two nodes remaining in the topmost layer. Whether the number of leaf nodes is odd or even, we will always end up with two nodes in the layer directly below the root (provided the list has more than one item).

Merkle Tree formed by the five letters

Verification

The importance of Merkle Trees lies in their ability to verify data efficiently. Given any item from the list, we can check whether it is valid in O(h) time, where h is the height of the tree. Moreover, we do not need the entire list for verification.

A much simpler relative of the Merkle Tree is the hash chain, or simply a blockchain, in which each node stores the hash of the previous node’s value. If a node in the middle is tampered with, we can detect it in O(n) time. To verify a node in a hash chain, we recompute the hashes of all the nodes from the node in question through to the end. When there are multiple nodes to verify, we start from the earliest of the suspected nodes and recompute the hashes from there up to the last node. Once we have the recomputed hash of the last node, we compare it with the known one and check whether they match. A hash chain seems simple, but it is not an efficient choice for a large number of data objects. Because the entire chain must be physically present to verify the data, hash chains are space-inefficient as well.
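
As a rough illustration of that procedure, the sketch below builds a small hash chain and re-verifies it from a given position. It assumes SHA-256 and that each link hashes its own data together with the previous link's hash; buildChain and verifyFrom are hypothetical names, not part of the project code.

```javascript
const crypto = require('crypto');

function sha256(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Each link's hash covers its own data plus the previous link's hash.
function buildChain(items) {
  const chain = [];
  let prevHash = '';
  for (const data of items) {
    const hash = sha256(prevHash + data);
    chain.push({ data, prevHash, hash });
    prevHash = hash;
  }
  return chain;
}

// Recompute hashes from `start` to the end and compare the result
// with the trusted hash of the last link.
function verifyFrom(chain, start, trustedLastHash) {
  let prevHash = start === 0 ? '' : chain[start - 1].hash;
  for (let i = start; i < chain.length; i++) {
    prevHash = sha256(prevHash + chain[i].data);
  }
  return prevHash === trustedLastHash;
}

const chain = buildChain(['A', 'B', 'C', 'D', 'E']);
const trusted = chain[chain.length - 1].hash;
console.log(verifyFrom(chain, 2, trusted)); // true
chain[2].data = 'X'; // tamper with the third link
console.log(verifyFrom(chain, 2, trusted)); // false
```

The check has to walk every link from the starting position to the end, and it needs the data of each of those links, which is where the O(n) time and space cost comes from.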

This is not the case with verification in a Merkle Tree. To illustrate the verification process, consider the example below.

Suppose I received a piece of data C from another server; let's call the received copy C’. We want to verify that C’ has not been tampered with. We have in our possession a Merkle Tree of all the data in the list.

In the case of a hash chain, we would need the entire list of data to verify that C’ is correct. With a Merkle Tree, we only need the hashes. The following diagram illustrates how we can verify C’ without having the other data objects available.

Verifying C’ by hashing all the nodes that lead us to the root

1. Find the position of C’ in the list, for example by searching by id.
2. Calculate the hash of C’.
3. Calculate the value of the parent node by hashing the current node with its neighbor (counting positions from 1: the next node if the position is odd, the previous node if the position is even), and set the parent as the current node.
4. Repeat step 3 until we reach the root.
5. Compare the new root with the existing root. If they match, then C’ is essentially C and has not been tampered with.

Verifying an item in a hash chain takes O(n) time, since we calculate n hashes in the worst case, whereas in a Merkle Tree the same item can be verified in O(log n) time, since we only calculate log n hashes.
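
The steps above can be sketched as a function that recomputes the root from one item and the neighbor hash at each level. This is an illustrative sketch, assuming SHA-256, string concatenation of child hashes, and 0-based positions (so an even index is the left node of a pair); computeRoot and the hand-built hashes are hypothetical.

```javascript
const crypto = require('crypto');

function sha256(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Recompute the root from a piece of data, its 0-based position, and the
// neighbor hash at each level (null where the node had no neighbor and
// was copied up unchanged).
function computeRoot(data, index, siblings) {
  let current = sha256(data);
  for (const sibling of siblings) {
    if (sibling !== null) {
      // 0-based: an even index is the left node, an odd index the right node.
      current = index % 2 === 0
        ? sha256(current + sibling)
        : sha256(sibling + current);
    }
    index = Math.floor(index / 2);
  }
  return current;
}

// Hashes for the five-letter example, built by hand.
const [hA, hB, hC, hD, hE] = ['A', 'B', 'C', 'D', 'E'].map(sha256);
const hAB = sha256(hA + hB);
const hCD = sha256(hC + hD);
const hABCD = sha256(hAB + hCD);
const root = sha256(hABCD + hE);

// To check C we only need three hashes, not the other letters themselves.
const proofForC = [hD, hAB, hE];
console.log(computeRoot('C', 2, proofForC) === root);  // true
console.log(computeRoot('C*', 2, proofForC) === root); // false
```

Only one hash per level is needed, so the number of hash computations grows with the height of the tree rather than with the length of the list.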

Algorithms

This section describes the algorithms for creation and verification in Merkle Trees in mathematical form.

Creation

As mentioned before, a Merkle Tree is created by taking two nodes at a time from each layer and hashing them to create the parent node. By representing the tree in matrix form, we can mathematically write it as:
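
One possible formulation, as a sketch: assume the matrix has rows $0$ to $d$ counted from the top, row $d$ holds the data, $H$ is the hash function, and $\|$ denotes concatenation. The symbols $d$, $H$ and $\|$ are notation assumed here for illustration.

$$
\text{tree}[d-1][j] = H(\text{tree}[d][j])
$$

$$
\text{tree}[i][j] =
\begin{cases}
H(\text{tree}[i+1][2j] \,\|\, \text{tree}[i+1][2j+1]) & \text{if } 2j+1 < |\text{tree}[i+1]| \\
\text{tree}[i+1][2j] & \text{otherwise}
\end{cases}
\qquad 0 \le i < d-1
$$

Here $|\text{tree}[i+1]|$ is the number of entries in row $i+1$, and $j$ runs from $0$ to $\lceil |\text{tree}[i+1]| / 2 \rceil - 1$. Rows are added on top until a row with a single entry is reached.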

This makes the root of the tree available at tree[0][0]

Verification

Verification is a bottom-up approach where we start from the data, find its hash and calculate the parent and continue this until we find the root. Mathematically, we can express it as follows:

Implementation

We will be implementing a Merkle Tree in Node.js.

Prerequisites

Node.js, VS Code, Coffee

Code

Create your project directory and cd into it.

mkdir merkel-and-patricia && cd merkel-and-patricia

Open VS Code in this directory

code .

Before we implement our tree, we need a function for hashing data. Create a file named helper.js and add the following code to it.
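
A minimal sketch of such a helper, assuming SHA-256 from Node's built-in crypto module (the algorithm, the exported name hash, and the JSON serialization of non-string input are all assumptions here):

```javascript
const crypto = require('crypto');

// Return the SHA-256 digest of `data` as a hex string.
// Non-string input is serialized with JSON.stringify first (an assumption
// made so that objects can be hashed as well).
function hash(data) {
  const text = typeof data === 'string' ? data : JSON.stringify(data);
  return crypto.createHash('sha256').update(text).digest('hex');
}

module.exports = { hash };
```

Other files could then load it with require('./helper').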

We will be using this file for hashing our data in the rest of the project. Next, we’ll create our Transaction class.

The Transaction class will contain the following properties:

to, from, amount, id, hash

Create a file Transaction.js and add the following code in it.

TODO: Transaction Class

The functions getCount and incrementCount are used to give each transaction an id. You may use UUIDs instead.
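
One possible shape for this class, as a sketch only. It assumes getCount and incrementCount are module-level functions around a simple counter, that the hash covers the four other properties, and it inlines a SHA-256 stand-in for the helper.js function so the snippet runs on its own.

```javascript
const crypto = require('crypto');

// Stand-in for the hashing function from helper.js (assumed to be SHA-256).
function hash(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Module-level counter used to hand out sequential ids (assumed design).
let count = 0;
function getCount() {
  return count;
}
function incrementCount() {
  count += 1;
}

class Transaction {
  constructor(to, from, amount) {
    this.to = to;
    this.from = from;
    this.amount = amount;
    this.id = getCount();
    incrementCount();
    // Hash over the fields that identify the transaction.
    this.hash = hash(JSON.stringify({ to, from, amount, id: this.id }));
  }
}

module.exports = Transaction;
```

Because the id is part of the hashed fields, two transactions with the same to, from, and amount still get different hashes.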

In order to store all the transactions, we will create a transaction list class that contains an array of transactions.

Create a file TransactionList.js and add the following code in it.
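
A minimal sketch of what this file might contain; the property name list and the method name add are assumptions made for illustration.

```javascript
class TransactionList {
  constructor() {
    this.list = []; // transactions in insertion order
  }

  // Append a transaction to the end of the list.
  add(transaction) {
    this.list.push(transaction);
  }
}

module.exports = TransactionList;
```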

We have our hashing function and our data. Let's implement the Merkle Tree.

Create a file named MerkelTree.js and create a MerkelTree class with only one property, root, which is the matrix that will hold our entire tree.

In this class, create a method named createTree, which takes only one parameter, a TransactionList instance, and creates a Merkle Tree from it.

The createTree method first takes the transaction list and adds it as the bottommost layer, with the transaction hashes in the layer right above it. Next, it takes two items at a time from the topmost layer, hashes them together, and saves the result in a temporary list until all the items are covered; if a single item remains, it is pushed into the temporary list directly. The temporary list temp is then added to the beginning of root. This process is repeated until the first item of root has a length of one, which indicates that we have found the root hash.

Now that we have a tree, let's write a method to verify a transaction.

The verification uses the same algorithm described above: take the node to verify and its neighboring node, hash them together, and move to the parent layer. Then do the same again, but this time the node to verify is the hash we just calculated.

Create a method named verify in the MerkelTree class, taking a single parameter, a transaction.

Our MerkelTree class is complete. Below is the entire code for the class.
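
A sketch of how such a class could look, following the createTree and verify descriptions above. It is not the original listing: the hashTransaction helper, the inlined SHA-256 hash function, the list property on the transaction list, and the decision to recompute a transaction's hash from its fields during verification are all assumptions.

```javascript
const crypto = require('crypto');

// Stand-in for helper.js (assumed SHA-256 over a string).
function hash(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Hypothetical: recompute a transaction's hash from its fields.
function hashTransaction(tx) {
  return hash(JSON.stringify({ to: tx.to, from: tx.from, amount: tx.amount, id: tx.id }));
}

class MerkelTree {
  constructor() {
    this.root = []; // matrix of layers; this.root[0][0] ends up as the root hash
  }

  createTree(transactionList) {
    const transactions = transactionList.list;
    // Bottom layer: the transactions. Layer above: their hashes.
    this.root = [transactions.map(hashTransaction), transactions];
    while (this.root[0].length > 1) {
      const top = this.root[0];
      const temp = [];
      for (let i = 0; i < top.length; i += 2) {
        // Hash a pair together, or copy a lone last item up unchanged.
        temp.push(i + 1 < top.length ? hash(top[i] + top[i + 1]) : top[i]);
      }
      this.root.unshift(temp);
    }
  }

  verify(transaction) {
    const leaves = this.root[this.root.length - 1] || [];
    let position = leaves.findIndex((tx) => tx.id === transaction.id);
    if (position === -1) return false;
    console.log(`Element found at: ${position}`);

    let current = hashTransaction(transaction);
    // Walk from the hash layer up to the layer just below the root.
    for (let level = this.root.length - 2; level > 0; level--) {
      const layer = this.root[level];
      const sibling = position % 2 === 0 ? position + 1 : position - 1;
      if (sibling < layer.length) {
        current = position % 2 === 0
          ? hash(current + layer[sibling])
          : hash(layer[sibling] + current);
      }
      position = Math.floor(position / 2);
    }
    return current === this.root[0][0];
  }
}

// Usage with plain objects standing in for Transaction and TransactionList.
const demoList = {
  list: ['A', 'B', 'C', 'D', 'E'].map((name, id) => ({ to: name, from: 'X', amount: 10, id })),
};
const demoTree = new MerkelTree();
demoTree.createTree(demoList);
console.log(demoTree.verify(demoList.list[2]) ? 'Valid' : 'Not Valid');
console.log(demoTree.verify({ ...demoList.list[2], amount: 999 }) ? 'Valid' : 'Not Valid');
```

In this sketch positions are 0-based, so a node at an even position is hashed with the next node and a node at an odd position with the previous one. The second verify call uses a copy of the third transaction with a changed amount, so its recomputed root no longer matches.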

To test the functionality, create a JS file in the root directory, name it test.js, and add the following code to it.

It should print the following output:

Element found at: 2

Valid

Element found at: 2

Not Valid

You can uncomment the console logs to print the entire root and tampered transaction.

This completes the section on Merkle Trees. Up next is the Patricia Trie.