There’s quite a feud within the Bitcoin Cash community over whether we need the so-called Canonical Transaction Ordering (CTOR), which would replace the current Topological Transaction Ordering (TTOR). Many aren’t convinced of the advantages, which is very understandable. One promise made by CTOR proponents is that it simplifies “sharding”, i.e. distributing the work to be done across multiple machines. Let’s have a look at whether that claim actually holds water. I’ll explain how sharding and Bitcoin nodes work first – feel free to skip those sections if you’re already familiar with them.

How Do We Do Sharding?

Let’s have a look at a simple example to better understand how sharding works in practice. Suppose we run a website that serves images – billions of them. A user requests an image by name, e.g. “cat.png” or “house.jpg”.

However, there are so many images that they can’t reasonably be handled by an individual computer.

So we set up multiple computers.

One naive way to distribute the images would be by initial letter, such that we’d have one server for each letter of the alphabet: Server A, Server B, ..., Server Z. If a user requests “cat.png”, we’d know that this image is stored on Server C (if the image exists at all), and serve it from that server to the user.

This approach, of course, has its share of disadvantages. Initial letters in English aren’t equally distributed: T has a frequency of about 16% while Z has just 0.05%. That means Server T would be 320 times busier than Server Z! That is bad sharding.

To solve this, we could use the hash of the image’s name instead, which should be uniformly distributed.
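A minimal sketch of this hash-based scheme (the function name and server count are made up for illustration):

```python
import hashlib

NUM_SERVERS = 26  # hypothetical number of image servers

def server_for(name: str) -> int:
    """Pick a server by hashing the image name; SHA-256 output is
    uniformly distributed, so the servers are evenly loaded."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, reduce modulo the server count.
    return int.from_bytes(digest[:8], "big") % NUM_SERVERS
```

Any request for “cat.png” hashes to the same server, no matter which frontend handles it – and no server is systematically busier than another.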

What Do These Bitcoin Nodes Actually Do?

Now let’s get back to Bitcoin. In the whitepaper, it says that a node does the following:

1) New transactions are broadcast to all nodes.
2) Each node collects new transactions into a block.
3) Each node works on finding a difficult proof-of-work for its block.
4) When a node finds a proof-of-work, it broadcasts the block to all nodes.
5) Nodes accept the block only if all transactions in it are valid and not already spent.
6) Nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash.

This list doesn’t contain the important step of verifying that a transaction is valid, however.

We verify a transaction by some trivial rules (valid format, cannot spend more than the inputs contain, etc.) and the following (simplified) rule:

For each input of the transaction, we check if we’re actually allowed to spend the output referenced by the input. We look up the referenced output, check that it’s unspent, and check that feeding the input script into the output’s script yields a “yep, you can spend this!” result.

In other words: we go through all inputs of the transaction, find the transaction referenced by each input, pick the output of that referenced transaction that the input specifies, and verify that it is unspent and that our input actually contains the signature necessary to spend it. If not, we drop the transaction, as it is invalid.
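As a toy sketch of that verification rule – all structures and the “script” check here are simplified stand-ins I made up, not real Script evaluation:

```python
from typing import NamedTuple

class Output(NamedTuple):
    value: int    # amount in satoshis
    owner: str    # stand-in for the output script (scriptPubKey)

class Input(NamedTuple):
    prev_txid: str   # ID of the referenced transaction
    prev_index: int  # which output of that transaction we spend
    signature: str   # stand-in for the input script (scriptSig)

# Simplified UTXO set: (txid, output index) -> Output, for all unspent outputs.
utxo_set = {("aabbcc", 0): Output(value=5000, owner="alice")}

def verify_input(inp: Input) -> bool:
    """The referenced output must exist, be unspent, and the input
    'script' must satisfy it (here: a toy signature check)."""
    out = utxo_set.get((inp.prev_txid, inp.prev_index))
    if out is None:
        return False  # unknown or already spent -> drop the transaction
    return inp.signature == "signed-by-" + out.owner
```

If any input fails this check, the whole transaction is dropped.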

After verifying, we do 1) (broadcast the transaction), 2) (collect transactions into a block) and 3) (do the mining). Note that we don’t mine on the actual transactions in the block (which can be megabytes in size); instead, we mine on the “merkle root” (plus some other data – 80 bytes in total). But how do we get the merkle root?

To get the merkle root, we list all transaction IDs in a row and hash them pairwise, which results in a list half the size of the original. We then do the same thing with the new list: hash the hashes pairwise. We continue until only one hash is left – this is our merkle root:
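The folding step can be sketched like this – using Bitcoin’s double SHA-256 and its rule of duplicating the last hash on odd-sized levels, but ignoring byte-order details:

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Bitcoin hashes everything twice with SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txids: list[bytes]) -> bytes:
    """Repeatedly hash the list pairwise until one hash remains."""
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:          # odd count: duplicate the last hash
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

With a single transaction, the merkle root is simply that transaction’s ID; swapping two transactions changes the root – which is exactly why the ordering matters.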

For brevity’s sake, let’s omit 4) (broadcast the block), 5) (verify the block) and 6) (mine on top of the new block) in this article, although CTOR influences them a lot. The implications for sharding aren’t as significant there, though.

Sharding A Node

I asked myself how I’d implement sharding once we have CTOR. An “order” in this sense means how we should order the transaction IDs before we build a merkle root out of them. A different order yields a different merkle root, even if the transactions are the same – so order matters!

Let’s have a quick look at how CTOR and TTOR differ.

TTOR (topological transaction order) sorts the transactions topologically: if a transaction depends on another transaction, the dependent transaction must come later in the list. Other than that restriction, transactions can be ordered arbitrarily. CTOR (canonical transaction order; if we wanted to be exact, it would be lexicographical transaction order) sorts the transactions by their ID (the same ID that is hashed into the merkle root). As transaction IDs are unique, this always results in a unique order.

Sharding in CTOR

How would we shard a Bitcoin node when we have CTOR? As with the image service described above, we could shard by the first letter of the transaction ID. There are 16 different hexadecimal digits, so we’d have 16 shards: Shard 0, Shard 1, ..., Shard 9, Shard A, ..., Shard F. The genesis block contains the transaction with ID 4a5e1e4b...a33b, which starts with a 4, so this one would go to Shard 4.
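The shard assignment itself is a one-liner (the function name is made up):

```python
def shard_for_txid(txid: str) -> str:
    """Assign a transaction to one of 16 shards by the first hex digit of its ID."""
    first = txid[0].lower()
    assert first in "0123456789abcdef", "transaction IDs are hex strings"
    return "Shard " + first.upper()
```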

How do we verify a transaction in a sharded system? As described above, when we receive a transaction, we check for each input whether the referenced output is unspent and whether the input contains a valid script (i.e. a valid signature). Each input references a transaction ID, so if the inputs reference transactions 06d3...18 and 8d67...26, we’d ask Shard 0 and Shard 8 to check whether the outputs are unspent and the inputs contain valid signatures, as they know the respective transactions. Simple!

As a bonus, since transaction IDs are uniformly random, we can be pretty certain that each shard should be equally busy.

What about the merkle root?

As you can see in the picture above, each shard can sort its transactions by ID and then build as many hashes as it’s capable of (shown in blue).

The mining server then collects all of those hashes, finishes the work the shards started to obtain the final merkle root, and does the mining, as in the picture above.

Out of curiosity, I ran a small simulation to find out how many hashes would have to be sent to the mining server. As expected, this grows logarithmically with the number of transactions and linearly with the number of shards.

Each hash is 32 bytes, plus some position information (say 8 bytes), so for 10M transactions on 512 shards, we’d send just under 300kB of data to the mining server – for transactions totalling 3GB in size. This means very little communication is required and the work is equally distributed – sharding done right!
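A back-of-the-envelope version of that estimate, assuming each shard holds a contiguous run of the sorted transaction IDs, which decomposes into roughly log2(transactions per shard) merkle subtree roots:

```python
import math

def estimated_bytes(num_txs: int, num_shards: int,
                    bytes_per_hash: int = 32 + 8) -> float:
    """Total data sent to the mining server: per shard, about
    log2(txs per shard) subtree hashes, each 32 bytes of hash
    plus 8 bytes of position information."""
    hashes_per_shard = math.log2(num_txs / num_shards)
    return num_shards * hashes_per_shard * bytes_per_hash
```

For 10M transactions on 512 shards this comes out just under 300kB.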

Sharding in TTOR

If we use topological transaction ordering, things get more difficult. We could shard by the first letter of the transaction ID like we’ve done with CTOR, and this would work for checking if the transaction inputs are valid.

Unfortunately, for building the merkle root, we’re out of luck. A transaction in Shard 6 might depend on transactions in Shard 0 and Shard 9, and we’d have to send all kinds of data to all kinds of places to restore a topological order. The advantage of sharding is pretty much gone at that point.

However, we don’t give up that easily. We can shard in a different, although much more complicated, manner, which I’ll explain below:

1) Shards are numbered both by initial transaction ID letter (0..F) and by an ascending number (1..16).
2) Confirmed transactions are sharded by their ID, as with CTOR.
3) We have a Bloom Filter (a sort of compressed but probabilistic set) for each shard, which tells us which transactions are in the mempool of that shard.
4) When we receive a transaction, we check each input of the transaction against the shards’ Bloom Filters:
   a) If a filter matches, we ask the matching shard to verify the input. The match might be a false positive, so we might need to look into a different shard.
   b) If none of the Bloom Filters match, we check the input as with CTOR.
5) We check if there’s any shard that has any of the inputs of the transaction in its mempool:
   a) If yes, we find the shard with the highest number (1..16) which has such an input. We place the transaction in a random shard with a number greater than or equal to that shard’s, and add the transaction ID to that shard’s Bloom Filter.
   b) If no, we place the transaction in a random shard.
6) Each shard sorts its transactions topologically and builds as many hashes for the merkle root as it can – just as in the CTOR version.
7) These hashes are sent to the mining server, which combines them into the merkle root and mines on it.
8) Once a block is mined, all transactions in the mempools are moved to their shard according to their transaction ID (as with CTOR).
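The interesting part is the shard-selection rule for incoming transactions. Here’s a sketch using plain Python sets in place of the Bloom filters, with names I made up:

```python
import random

NUM_SHARDS = 16

# Per shard number (1..16): the txids currently in that shard's mempool.
# The real scheme would use a Bloom filter per shard to keep this compact.
mempools = {n: set() for n in range(1, NUM_SHARDS + 1)}

def place_transaction(txid: str, input_txids: list[str]) -> int:
    """Pick a mempool shard such that the transaction never lands on a
    lower-numbered shard than any unconfirmed transaction it spends -
    so each shard can still sort its own mempool topologically."""
    lowest = 1
    for number, pool in mempools.items():
        if any(t in pool for t in input_txids):
            lowest = max(lowest, number)
    shard = random.randint(lowest, NUM_SHARDS)  # note the bias towards high numbers
    mempools[shard].add(txid)
    return shard
```

A child transaction always ends up on a shard with a number at least as high as its parent’s – which is also the source of the bias discussed below.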

That sounds quite complicated.

Note that the work isn’t equally distributed between the shards anymore: there’s a bias towards sending more transactions to higher-numbered shards, since a transaction that depends on one already in some shard’s mempool must go to a shard with a greater or equal number. Attackers might be able to exploit this bias for a Denial-of-Service attack.

We could reduce that bias by adjusting the function choosing the random shard to prefer lower-numbered shards, or we could move a chunk of transactions to a different shard once the shards become imbalanced.

Wait – what are we doing? We are clearly over-engineering this thing. Surely we’d agree that the CTOR version is much simpler, leaner and more efficient. There’s also no obvious attack vector.

Conclusion

We’ve seen how to shard a system efficiently, and how to do it for Bitcoin with both CTOR and TTOR. It should be clear that the CTOR version is much simpler than the TTOR version.

Yet, is it worth it? The downsides of CTOR are, for the most part, just the problems involved with any change. Also, many argue that the current ordering was put into place by Satoshi, so it shouldn’t change. But that’s just an appeal to authority – and Satoshi made a lot of mistakes.

Do we need CTOR right now, this November? No. But did we need the raise to 32MB blocks? No, neither. We did it because once the need to raise arises, doing so would be more difficult, as the network will have grown. The same argument applies to CTOR: the earlier we implement it, the more time we have for the system to mature, for quality assurance, stress tests, etc.

We should always be ten steps ahead of problems before they arise.

Especially when we want to build the best cash the world has ever seen.