The blue keys are not sparse as they diverge only by their last bit, so they keep the same position at the bottom of the tree. This shows the importance of using random keys in a sparse Merkle tree.

Merkle proofs

Using a binary tree gives us very simple and easy-to-use Merkle proofs, unlike a Patricia tree, where a Merkle proof is composed of all the nodes in the path of a key. Since each node in a Patricia tree has 16 children, the quantity of data is larger and not as straightforward to verify as it would be in binary.

In the above example, the Merkle proof of the red key is composed of the node in red: [h3]. Notice that this proof is much shorter than the one in the previous example ([D0, D1, D2, h3]) because values are not stored at height 0. The verifier of the proof will compute Hash(LeafNode, h3) and check that the Root is as expected.
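The verification step can be sketched in a few lines of Go. This is a minimal illustration, not the Aergo implementation: SHA-256, the single-byte height encoding, the most-significant-bit-first key ordering, and the names hash, bitIsSet and rootFromProof are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hash concatenates its inputs and returns their SHA-256 digest (assumed hash function).
func hash(data ...[]byte) []byte {
	h := sha256.New()
	for _, d := range data {
		h.Write(d)
	}
	return h.Sum(nil)
}

// bitIsSet reports whether bit i of key is 1, counting from the most significant bit.
func bitIsSet(key []byte, i int) bool {
	return key[i/8]&(1<<uint(7-i%8)) != 0
}

// rootFromProof recomputes the root from a leaf hash and the sibling hashes
// in proof, which are ordered from the root down to the leaf.
func rootFromProof(key, leafHash []byte, proof [][]byte) []byte {
	node := leafHash
	for i := len(proof) - 1; i >= 0; i-- {
		if bitIsSet(key, i) {
			node = hash(proof[i], node) // the key branches right, so the sibling is on the left
		} else {
			node = hash(node, proof[i]) // the key branches left, so the sibling is on the right
		}
	}
	return node
}

func main() {
	key := []byte{0x00}                             // hypothetical key whose first bit is 0
	leaf := hash(key, []byte("value"), []byte{255}) // Hash(key, value, height), height byte assumed
	h3 := hash([]byte("other subtree"))             // stands in for the sibling h3
	root := hash(leaf, h3)

	// The proof [h3] is enough to rebuild the root from the leaf.
	fmt.Println(bytes.Equal(rootFromProof(key, leaf, [][]byte{h3}), root))
}
```

The verifier only needs the leaf, the proof and the expected Root; it never touches the rest of the tree.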

Compressed Merkle proofs: Like in the standard sparse Merkle tree, Merkle proofs can also be compressed. We can use a bitmap and set a bit for every index that is not default in the proof. The proof that the blue LeafNode1 is included in the tree is: [LeafNode2, D1, D2, LeafNode]. This proof can be compressed to 1001[LeafNode2, LeafNode]. The verifier of the compressed Merkle proof should know to use D1 to compute h2 because the second index of the bitmap is 0, and D2 for the third proof element, etc.
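As a rough sketch of how a verifier could expand a compressed proof, the Go snippet below walks the bitmap and fills in the default node wherever a bit is 0. Strings stand in for hashes to keep the example readable, and expandProof and the defaults table (D0, D1, ...) are hypothetical names, not part of the library.

```go
package main

import (
	"errors"
	"fmt"
)

// expandProof rebuilds a full proof from a bitmap and the non-default nodes.
// A '1' at position i takes the next entry of nodes; a '0' takes defaults[i].
func expandProof(bitmap string, nodes, defaults []string) ([]string, error) {
	if len(bitmap) > len(defaults) {
		return nil, errors.New("bitmap is longer than the defaults table")
	}
	full := make([]string, 0, len(bitmap))
	next := 0
	for i, b := range bitmap {
		switch b {
		case '1':
			if next >= len(nodes) {
				return nil, errors.New("bitmap has more set bits than proof nodes")
			}
			full = append(full, nodes[next])
			next++
		case '0':
			full = append(full, defaults[i])
		default:
			return nil, fmt.Errorf("invalid bitmap character %q", b)
		}
	}
	if next != len(nodes) {
		return nil, errors.New("proof contains unused nodes")
	}
	return full, nil
}

func main() {
	defaults := []string{"D0", "D1", "D2", "D3"}
	full, err := expandProof("1001", []string{"LeafNode2", "LeafNode"}, defaults)
	if err != nil {
		panic(err)
	}
	fmt.Println(full) // [LeafNode2 D1 D2 LeafNode]
}
```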

Proofs of non-inclusion:

There are two ways to prove that a key is not included in the tree:

Prove that the leaf node of another key is included in the tree and is on the path of the non-included key.

Or prove that a default node (byte(0)) is included in the tree and is on the path of the non-included key.

For example, a proof that key=0000 is not included in the tree is a proof that LeafNode is on the path of key and is included in the tree. A proof that key=1111 is not included in the tree is a proof that D2 is on the path of the key and is included in the tree.
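The "is on the path" condition boils down to a prefix comparison: a leaf stored at a given depth sits on the path of every key that shares its first depth bits. The sketch below illustrates this under assumed conventions (4-bit keys stored in the upper nibble of one byte, a hypothetical leaf key 0100 at depth 1); sharesPrefix is an illustrative helper, not a library function.

```go
package main

import "fmt"

// sharesPrefix reports whether a and b agree on their first n bits,
// counting from the most significant bit of the first byte.
func sharesPrefix(a, b []byte, n int) bool {
	for i := 0; i < n; i++ {
		mask := byte(1) << uint(7-i%8)
		if a[i/8]&mask != b[i/8]&mask {
			return false
		}
	}
	return true
}

func main() {
	leafKey := []byte{0x40} // hypothetical key 0100 of a leaf stored at depth 1
	absent := []byte{0x00}  // key 0000
	other := []byte{0xF0}   // key 1111

	// The leaf is on the path of 0000 (same first bit) but not of 1111.
	fmt.Println(sharesPrefix(absent, leafKey, 1), sharesPrefix(other, leafKey, 1))
}
```

A verifier would combine this check with an ordinary inclusion proof for the leaf (or default node) found on the path.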

Deleting from the tree

When a leaf is removed from the tree, special care is taken by the Update() function to keep leaf nodes at the highest subtree containing only 1 key. Otherwise, if a node has a different position in the tree, the resulting trie root would be different even though keys and values are the same.

So, when a key is deleted, Update() checks if its sibling is also a leaf node and moves it up to the root of the highest subtree containing only that non-default key.

Figure 3. Example of what the tree would look like if one of the blue nodes is modified

Node batching

When storing each node as a root with 2 children, the number of nodes to store grows very quickly, and a bottleneck occurs when multiple threads load nodes from memory. A hex Merkle tree would solve this problem, as each node has 16 children and the tree has a smaller height of 64 (256/4), though as we said earlier, we need the tree to be binary. We can get the same benefits as a hex tree by using node batching.

Instead of storing 2 children for one node, we store the subtree of height 4 for that node. A tree of height 4 has 16 leaves at height 0 (like hex). So, the value of a node is an array containing all the nodes of the 4-bit tree. The children of a node at index i in the tree can be found at index 2*i+1 and 2*i+2.

A node is encoded as follows:

{ Root : [ [ byte(0/1) to flag a leaf node ], 3–0, 3–1, 2–0, 2–1, 2–2, 2–3, 1–0, 1–1, 1–2, 1–3, 1–4, 1–5, 1–6, 1–7, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f ] }

For example, to get the children of node 3–0 at index id=1 in the array, we can access the left child 2–0 at index (2 * id + 1) = index 3 and the right child 2–1 at index (2 * id + 2) = index 4.
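The index arithmetic is easy to check with a short Go sketch. batchLabels and children are hypothetical helpers written for this illustration; they simply rebuild the slot names from the encoding above and apply the 2*i+1 and 2*i+2 rule.

```go
package main

import "fmt"

// batchLabels lists the 31 slots of a batch in storage order: the leaf flag,
// then the nodes at heights 3, 2 and 1, and finally the 16 nodes at height 0.
func batchLabels() []string {
	labels := []string{"flag"}
	for height, width := 3, 2; height >= 1; height, width = height-1, width*2 {
		for i := 0; i < width; i++ {
			labels = append(labels, fmt.Sprintf("%d-%d", height, i))
		}
	}
	for i := 0; i < 16; i++ {
		labels = append(labels, fmt.Sprintf("%x", i))
	}
	return labels
}

// children returns the array indexes of the left and right child of the
// node stored at index i.
func children(i int) (left, right int) {
	return 2*i + 1, 2*i + 2
}

func main() {
	labels := batchLabels()
	l, r := children(1) // node 3-0
	fmt.Println(labels[1], "->", labels[l], labels[r]) // 3-0 -> 2-0 2-1
}
```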

To each node, we append a byte flag to recognize the leaf nodes. Since the nature of Root is not known ahead of time, the byte flag is stored at the first index of the nodes array.

Figure 4. A visual representation of node batching. The first batch is blue, and all 16 leaves of a batch are roots to other batches (green). A batch contains 30 nodes.

The example from Figure 2 will be encoded as follows:

{Root : [ [byte(0)], LeafNodeHash, h3, LeafNodeKey, LeafNodeValue, h2, D2=nil, nil, nil, nil, nil, h1, D1=nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, LeafNode1Hash, LeafNode2Hash, nil, nil, nil, nil, nil, nil ]}

Where LeafNodeHash = Hash(key, value, height)

To store the batch in the database, it is serialised with a bitmap, which allows us to store only the non-default nodes. The bitmap is 4 bytes = 32 bits long. The first 30 bits are for the batch nodes, the 31st bit is the flag that marks a shortcut batch (a batch that only contains a key and a value at 3–0 and 3–1), and the 32nd bit is not used.

The example from Figure 2 will be serialised as follows:

11111000001000000000001100000010 [LeafNodeHash][h3][LeafNodeKey][LeafNodeValue][h2][h1][LeafNode1Hash][LeafNode2Hash]
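To make the node part of the bitmap concrete, here is a small Go sketch that derives the first 30 bits and the list of stored nodes from a batch. It deliberately leaves out the last two bits (the shortcut flag and the unused bit), uses strings in place of hashes, and nodeBitmap is a hypothetical name rather than the library's serialisation routine.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeBitmap returns the 30 node bits of a batch bitmap ('1' for a stored
// node, '0' for a default one) together with the stored nodes in order.
// An empty string stands in for a nil (default) node.
func nodeBitmap(nodes [30]string) (string, []string) {
	var bits strings.Builder
	var stored []string
	for _, n := range nodes {
		if n == "" {
			bits.WriteByte('0')
			continue
		}
		bits.WriteByte('1')
		stored = append(stored, n)
	}
	return bits.String(), stored
}

func main() {
	// Positions follow the encoded batch above, without the leading flag.
	var nodes [30]string
	nodes[0], nodes[1] = "LeafNodeHash", "h3"
	nodes[2], nodes[3] = "LeafNodeKey", "LeafNodeValue"
	nodes[4] = "h2"
	nodes[10] = "h1"
	nodes[22], nodes[23] = "LeafNode1Hash", "LeafNode2Hash"

	bits, stored := nodeBitmap(nodes)
	fmt.Println(bits)   // 111110000010000000000011000000
	fmt.Println(stored) // the 8 nodes that are actually written
}
```

Only 8 of the 30 slots are written to the database here; the other 22 are recovered from the bitmap when the batch is loaded.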

Node batching has two benefits:

Reduced number of database reads

Concurrent update of the height 4 subtree without the need for a lock.

Simultaneous update of multiple keys with goroutines

Instead of updating keys one by one, it is possible to update any number of keys with one update call that will concurrently update different parts of the tree. The keys should be sorted in an array, and the corresponding values should be at the same index in a separate array (see Usage).

To delete a key from the tree, simply set its value to the default value byte(0).
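Preparing the two arrays can be sketched as follows. This is only an illustration: it assumes ascending byte order is the expected sort order, and kv and sortedBatch are hypothetical names. A deletion is expressed by pairing the key with the default value.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// kv pairs a key with the value to write for it.
type kv struct{ key, value []byte }

// sortedBatch returns the keys in ascending byte order (assumed ordering)
// with each value at the same index as its key.
func sortedBatch(pairs []kv) (keys, values [][]byte) {
	sorted := append([]kv(nil), pairs...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].key, sorted[j].key) < 0
	})
	for _, p := range sorted {
		keys = append(keys, p.key)
		values = append(values, p.value)
	}
	return keys, values
}

func main() {
	defaultValue := []byte{0} // setting a key to byte(0) deletes it
	keys, values := sortedBatch([]kv{
		{key: []byte{0xC0}, value: []byte("new")},
		{key: []byte{0x10}, value: defaultValue},
	})
	fmt.Printf("%x %q\n", keys, values)
}
```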

Usage

NewTrie

func NewTrie(root []byte, hash func(data ...[]byte) []byte, store db.DB) *Trie {

When creating an empty tree, set root to nil. A nil root means that it is equal to the default value of its height. Use a custom hash function or use the Hasher in utils and specify a database if you plan to commit nodes.

Update

func (s *Trie) Update(keys, values [][]byte) ([]byte, error) {

‘keys [][]byte’ is a sorted array of keys, ‘values [][]byte’ contains the matching values of keys.

Update will recursively go down the tree and split the keys and values according to the side of the tree they belong to: multiple parts of the tree can be simultaneously updated.
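The splitting step can be sketched like this. Because the keys are sorted and, at a given depth, all share the same prefix, the keys that go left (bit 0) always come before the keys that go right (bit 1), so a binary search finds the boundary. splitKeys and bitIsSet are illustrative names, and the most-significant-bit-first ordering is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// bitIsSet reports whether bit i of key is 1, counting from the most significant bit.
func bitIsSet(key []byte, i int) bool {
	return key[i/8]&(1<<uint(7-i%8)) != 0
}

// splitKeys divides sorted keys that share their first depth bits into the
// keys of the left subtree (bit 0 at depth) and of the right subtree (bit 1).
func splitKeys(keys [][]byte, depth int) (left, right [][]byte) {
	i := sort.Search(len(keys), func(i int) bool {
		return bitIsSet(keys[i], depth)
	})
	return keys[:i], keys[i:]
}

func main() {
	keys := [][]byte{{0x10}, {0x30}, {0x90}, {0xC0}} // already sorted
	left, right := splitKeys(keys, 0)
	fmt.Printf("left=%x right=%x\n", left, right)

	// Each half can then be handed to its own goroutine and split again
	// one level further down.
	ll, lr := splitKeys(left, 2)
	fmt.Printf("left-left=%x left-right=%x\n", ll, lr)
}
```

The values array would be cut at the same index, which is why keys and values must stay aligned.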

If Update is called several times before Commit, only the last state is committed.

AtomicUpdate

func (s *Trie) AtomicUpdate(keys, values [][]byte) ([]byte, error) {

AtomicUpdate updates the tree with sorted keys and values just like Update. Unlike Update, however, if AtomicUpdate is called several times before Commit, all the intermediate states from the AtomicUpdate calls are recorded. This can be useful when authenticating the state of each block without committing to the database right away.

Get

func (s *Trie) Get(key []byte) ([]byte, error) {

Get returns the value of a key stored in the tree. If the key is default, i.e., not stored, it returns nil.

Commit

func (s *Trie) Commit() error {

Commit stores the updated nodes in the database. When Update is called, the new nodes are held in smt.db.updatedNodes; Commit then writes them to disk.

StageUpdates

func (s *Trie) StageUpdates(txn *db.Transaction) {

StageUpdates loads the updated nodes into the given database transaction. It enables the commit of the trie with an external database transaction.

Stash

func (s *Trie) Stash(rollbackCache bool) error {

Use the Stash function to revert the update without committing.

Revert

func (s *Trie) Revert(toOldRoot []byte) error {

When Revert is called, the trees to roll back (between the current tree and toOldRoot) are deleted from the database.

MerkleProof

func (s *Trie) MerkleProof(key []byte) ([][]byte, bool, []byte, []byte, error) {

MerkleProof creates a Merkle proof of inclusion/non-inclusion of the key. The Merkle proof is an array of hashes.

If the key is not included, MerkleProof will return false along with the proof leaf on the path of the key.

MerkleProofPast

func (s *Trie) MerkleProofPast(key []byte, root []byte) ([][]byte, bool, []byte, []byte, error) {

MerkleProofPast creates a Merkle proof of inclusion/non-inclusion of the key at a given trie root. This is used to query state at a different block than the latest one.

MerkleProofCompressed

func (s *Trie) MerkleProofCompressed(key []byte) ([]byte, [][]byte, uint64, bool, []byte, []byte, error) {

MerkleProofCompressed creates the same Merkle proof as MerkleProof, but compressed using a bitmap.

VerifyInclusion

func (s *Trie) VerifyInclusion(ap [][]byte, key, value []byte) bool {

Verifies that the key-value pair is included in the tree at the current Root.

VerifyNonInclusion

func (s *Trie) VerifyNonInclusion(ap [][]byte, key, value, proofKey []byte) bool {

Verify a proof of non-inclusion. Verifies that a leaf(proofKey, proofValue, height) or an empty subtree is on the path of the non-included key.

VerifyInclusionC

func (s *Trie) VerifyInclusionC(bitmap, key, value []byte, ap [][]byte, length int) bool {

Verifies a compressed proof of inclusion. ‘length’ is the height of the leaf key-value being verified.

VerifyNonInclusionC

func (s *Trie) VerifyNonInclusionC(ap [][]byte, length int, bitmap, key, value, proofKey []byte) bool {

Verify a compressed proof of non-inclusion. Verifies that a leaf (proofKey, proofValue, height) or an empty subtree is on the path of the non-included key.

Conclusion

Communication across distinct blockchain systems is a capability that the future of distributed networking and enterprise blockchain will need. Facilitating chain-to-chain transfers of value or data in business requires an efficient means of authenticating state. The standard sparse Merkle tree didn’t quite meet the performance standards we had in mind, so we’ve modified it and open-sourced our implementation. Aergo StateTrie will be used in the Aergo platform for fast and efficient authentication of state in asset bridging and in other applications that rely on secure state verification, like wallets and light clients.

If you know of other interesting implementations or would like to engage with us regarding any technical aspects of Aergo, please join us on any of our channels:

This post is available in Korean here.

Related Articles and Other Implementations