Sharding to the Rescue: Solving the Scalability Trilemma

Blockchain sharding seems to be a promising avenue for solving the scalability trilemma. In sharding, the execution of transactions is not fully replicated across all nodes. In theory, this can provide a throughput increase proportional to the number of shards. In theory, of course, because there are a number of caveats! The analysis that follows is mostly targeted at Eth 2.0 due to my familiarity with its lexicon, but should generally apply to all sharded blockchains.

How Sharding Provides Scalability

In this section I’ll provide an intuitive overview of the required features of a secure and scalable sharded blockchain that are formally proven in the paper “Divide and Scale: Formalization of Distributed Ledger Sharding Protocols.” The ensuing Twitter discussion between Buterin and the author of the paper is recommended reading.

I said above that sharding involves not fully replicating transaction execution. But how exactly is this accomplished? If we had every node validating every shard chain, then that wouldn’t be sharding — it would have the same scalability profile as a blockchain with big blocks the size of all the shards put together. If, instead, we allowed every node to select which shard it is responsible for validating at every block, then even a weak adversary could easily corrupt a single shard. This would allow a state safety violation (such as the invalid printing of coins) to happen on one shard and later spread to all other shards.
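To make the danger of self-selection concrete, here is a toy calculation (all numbers hypothetical) showing how an adversary controlling only a small fraction of the global validator set can gain a majority on a single shard by concentrating there while honest validators spread themselves evenly:

```python
# Toy numbers (hypothetical): free shard self-selection lets an adversary
# concentrate on one shard while honest validators spread evenly.
TOTAL_VALIDATORS = 64_000
NUM_SHARDS = 64

# The adversary controls only 2% of the global validator set...
adversary = int(TOTAL_VALIDATORS * 0.02)  # 1,280 validators

# ...and sends all of them to a single target shard, while honest
# validators distribute themselves evenly across all shards.
honest_on_target = (TOTAL_VALIDATORS - adversary) // NUM_SHARDS  # 980

adversary_share = adversary / (adversary + honest_on_target)
print(f"adversary share on target shard: {adversary_share:.1%}")  # ~56.6%
```

With random (rather than chosen) assignment, the adversary's expected share on every shard would instead match its 2% global share, which is why shuffling is the fix described next.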

The solution is that validators must be shuffled, or rotated, into committees, each committee comprising a subset of the total validators. This shuffling, and the responsibilities given to each validator, must be known by the system so as to be able to assign blame and levy penalties in the event of provable misbehavior (which unfortunately precludes the use of VRFs). Leaving implementation details aside, validators in a single committee produce blocks on a single shard and attest to their validity for some period of time before being shuffled to a new committee on a (potentially) different shard.
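As a rough sketch of the shuffling step (function names and structure here are illustrative, not Eth 2.0's actual construction, which uses a swap-or-not shuffle over a RANDAO-derived seed), committees can be derived from a public per-epoch seed so that every node can recompute exactly who was assigned where:

```python
import hashlib
import random

def assign_committees(validators, num_shards, epoch_seed):
    """Deterministically shuffle validators into per-shard committees.

    Seeding the shuffle with public randomness means every node can
    recompute the full assignment, which is what makes blame attribution
    and slashing of provable misbehavior possible. Sketch only.
    """
    rng = random.Random(hashlib.sha256(epoch_seed).digest())
    shuffled = list(validators)
    rng.shuffle(shuffled)
    size = len(shuffled) // num_shards
    # Split the permuted list into num_shards equal-size committees.
    return [shuffled[i * size:(i + 1) * size] for i in range(num_shards)]

committees = assign_committees(range(128), 4, b"epoch-42")
```

Because the seed is public, any node can rerun `assign_committees` with the same inputs and get the same committees, so a misbehaving validator cannot deny which shard it was responsible for.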

Potential pitfall 1: if validators need to catch up to the tip of the shard they’re assigned to by downloading and executing all shard blocks since the last time they were assigned to that shard, then sharding would provide no scalability and would essentially be big blocks. This is solved with two features. First, the stateless client concept is used: validators need only a state root to execute transactions, and each transaction provides the necessary witnesses into the state database. This removes the need to store a large state, but ensuring that the state root committed to at the tip of the shard chain is correct would still require processing all blocks. Second, an any-trust assumption (i.e., that there exists at least one honest party) is placed on the validators previously assigned to the shard. So long as one of them is honest, they can produce a fraud proof showing that the committed state root is invalid.
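A minimal sketch of the fraud-proof idea, using a hash over the full state as a toy stand-in for a real Merkle root (all names and the transfer format here are illustrative): a checker re-executes the block and compares the recomputed root against the one the committee committed to.

```python
import hashlib

def state_root(state):
    """Toy stand-in for a Merkle state root: a hash of the sorted items.
    A real stateless client uses Merkle (or Verkle) witnesses so that
    only the touched keys need to be supplied, not the full state."""
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()

def apply_transfer(state, sender, receiver, amount):
    # Execute a single toy transaction against a copy of the state.
    state = dict(state)
    state[sender] -= amount
    state[receiver] = state.get(receiver, 0) + amount
    return state

def check_block(pre_state, tx, claimed_post_root):
    """Re-execute against the (witnessed) pre-state. A mismatch between
    the recomputed root and the committed one is exactly what a fraud
    proof demonstrates to the rest of the network."""
    return state_root(apply_transfer(pre_state, *tx)) == claimed_post_root

pre = {"alice": 100, "bob": 0}
tx = ("alice", "bob", 30)
honest_root = state_root(apply_transfer(pre, *tx))
forged_root = state_root({"alice": 100, "bob": 1_000_000})  # invalid minting

print(check_block(pre, tx, honest_root))  # True
print(check_block(pre, tx, forged_root))  # False
```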

Potential pitfall 2: validators can’t create a fraud proof on a shard block if that shard block is withheld. Since a majority of a shard’s committee can sign off on its validity optimistically, a colluding majority could create an invalid block and withhold its data. The honest validator(s) would then have to ask every other validator globally to download the shard block. Due to speaker-listener fault equivalence, validators can’t be punished for raising a false alarm on data withholding, so this would mean all validators would have to download all shard blocks all the time in adversarial conditions — again, this is essentially big blocks. Alternatively, the system could be made scalable at the cost of security by using an honest majority assumption for all shard committees, but that assumption is unrealistic. The solution to this is data availability proofs, and indeed data availability proofs are the fundamental mathematical primitive without which sharded blockchains could not be simultaneously secure and scalable.
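The sampling logic behind data availability proofs can be sketched with a simple probability calculation (approximating sampling with replacement): erasure coding forces an attacker to withhold a large fraction of the chunks to make the data unrecoverable, so each random chunk a light client samples has a high chance of exposing the withholding.

```python
def detection_probability(withheld_fraction, samples):
    """Chance that a client sampling `samples` random chunks hits at
    least one withheld chunk (approximating sampling with replacement).
    Erasure coding is what makes `withheld_fraction` large: with a 2x
    Reed-Solomon extension, an attacker must withhold at least half of
    the chunks to make the block unrecoverable."""
    return 1 - (1 - withheld_fraction) ** samples

# An attacker hiding 50% of the chunks is caught almost surely after
# only 20 samples per client:
print(detection_probability(0.5, 20))  # 0.9999990463256836
```

Without erasure coding, an attacker could withhold a single chunk and the detection probability per sample would be tiny, which is why the coding step is essential rather than an optimization.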

With these potential pitfalls solved, it indeed seems as though “all the research breakthroughs we need for a full implementation of eth2” are behind us and only implementation details need to be ironed out for the deployment of a scalable and secure sharded blockchain that allows full nodes to be run on Raspberry Pis. This is wrong.