Cypherium | The stunner of Facebook’s Libra: Exclusive interview with the first author of HotStuff’s paper, Maofan “Ted” Yin

Cypherium, Jun 23, 2019

The Libra whitepaper recently released by Facebook has attracted continuous attention from experts and technologists in a wide array of fields. The document mentions that the Libra blockchain will use the LibraBFT consensus algorithm based on the Byzantine-fault-tolerant consensus, and that LibraBFT is a variant of HotStuff. Cypherium has also been implementing permissionless HotStuff for several months now. So, what is this algorithm called HotStuff?

The HotStuff research paper was published by the research team of the virtualization company VMware, and its safety and liveness have been fully mathematically proven. It has five authors: Maofan Yin, Dahlia Malkhi, Michael K. Reiter, Guy Golan Gueta, and Ittai Abraham. The first author of the paper, Ted Yin, recently sat down for an interview with us. In it, he explains the details and value of HotStuff.

Ted Yin graduated from Shanghai Jiao Tong University, and he is currently studying for a Ph.D. at Cornell University in the U.S. His primary interest is basic research in distributed systems. His advisors are the famous computer scientists Emin Gün Sirer and Robbert van Renesse. Ted is also the co-founder and chief systems architect of Ava Labs.

Those who enter the field of distributed consensus algorithms for the first time are easily stunned by the sheer number of names and acronyms. In-depth study reveals how they were derived. For example, DLS is an abbreviation of its three authors’ names: Dwork, Lynch, and Stockmeyer. PBFT stands for “Practical Byzantine Fault Tolerance.” So where does the name of the newcomer, HotStuff, come from?

Ted explained that HotStuff has three distinct meanings in English: first, it refers to someone who is socially attractive; second, to a matter that piques interest; and third, it is the name of a little demon in a popular animated series. Everyone knows that the name of Ethereum’s next-generation consensus algorithm, Casper, comes from an animated character. So HotStuff, Ted explains, can be compared to that.

During the interview, Ted took the opportunity to translate HotStuff into the Chinese word “尤物,” removing some of its more sensational valences. Ted said that his translation (“the Stunner”) has two meanings: one is a peerless beauty, the other a rare treasure. According to reports, HotStuff has been deployed on a network with more than 100 replicas, where it exceeded the throughput of BFT-SMaRt while maintaining similar latency, and it outperformed BFT-SMaRt in the more practical test environment.

The following is a Q&A between Cypherium’s CEO Sky Guo and Ted, where the two discuss the new algorithm, its innovations, and its place in the current DLT landscape:

Sky Guo: Consensus protocols for distributed systems can be roughly divided into two categories: the first is the blockchain algorithm represented by Bitcoin (Nakamoto consensus), and the other is the classic BFT algorithm (such as DLS or PBFT). What are the main differences between the two, and their advantages and disadvantages, in terms of application conditions and performance?

Ted Yin: The differences between the two can be roughly divided into five aspects: 1) member information; 2) performance, including throughput and latency; 3) Sybil attacks: Nakamoto consensus comes with built-in Sybil resistance, while classic BFT requires an additional PoS or PoW mechanism; 4) scalability; 5) security, i.e. probabilistic vs. deterministic.

Nakamoto consensus does not need to know all consensus participants in advance, and therefore, does not require accurate member information. Because part of the consensus uses PoW (Proof-of-Work), it is inherently immune to Sybil attacks.

The algorithm of Nakamoto consensus is very simple, and ordinary people can understand it with a little mathematical background.

Because the PoW difficulty and the number of confirmations one waits for on the chain are both tied to safety, performance is fundamentally limited, and the significant delay of transaction confirmation cannot be eliminated. All the existing heavily modified “scalable” variants of Nakamoto consensus can, in fact, only increase throughput. Talking about throughput without considering latency doesn’t really make much sense. For example, I can load a truck with hard drives and drive it somewhere to move data. The throughput is ultra-high, but so is the latency. Relatively speaking, Nakamoto consensus is easy to scale, with little performance difference between 10 nodes and 1,000 nodes (on one hand because there is no need for broadcast voting, and on the other hand because it is inherently slow).

Another essential difference between Nakamoto consensus and classic Byzantine consensus is that the former provides only a probabilistic safety guarantee, while the latter is 100% safe. Safety here, also called consistency, means whether the protocol can prevent double-spending.

In fact, the probability of a double-spend in Bitcoin after six blocks is not as low as everyone thinks: consensus can fail with a probability of up to 13% (i.e., when 30% of the mining power is Byzantine). From this point of view, in a fair comparison, the efficiency of Nakamoto consensus is very low (six blocks take about an hour).
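The 13% figure can be sanity-checked with the attacker-success formula from the Bitcoin whitepaper. The following is a minimal sketch under the assumption that “30% Byzantine” means the attacker controls 30% of the hash power and that “six blocks” means six confirmations; the function name is illustrative, not taken from any library.

```python
import math

def attacker_success_probability(q, z):
    """Probability that an attacker with hash-power share q eventually
    overtakes the honest chain from z blocks behind (Bitcoin whitepaper, section 11)."""
    p = 1.0 - q                      # honest share of hash power
    if q >= p:
        return 1.0                   # a majority attacker always catches up
    lam = z * (q / p)                # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total

# Assumed scenario: 30% attacker, six confirmations -> roughly 0.13
print(f"{attacker_success_probability(0.30, 6):.3f}")
```

Under these assumptions the result is about 13%, which is consistent with the number quoted in the interview.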

The premise of the classic Byzantine consensus is that all participants need to know 100% accurate member information.

Because the consensus does not use meaningless PoW, the speed of the classic Byzantine consensus protocol is related to the speed at which the network sends a large number of short messages, without the additional energy consumption and waiting time of the Nakamoto consensus.

Transaction delays are very small. If network latency is not considered, transactions are confirmed on the order of tens to hundreds of milliseconds. If network latency is considered, the confirmation time is of the same order of magnitude as the network latency.

However, classic Byzantine consensus is relatively difficult to scale. Before the advent of HotStuff, most classic BFT algorithms (including the algorithm described in the Tendermint paper) required every node to broadcast to all other nodes, which brings quadratic (n²) message complexity. Adding a large number of nodes can cause network congestion. In addition, the leader node bears the load of the entire network (the load is extremely uneven), which makes it difficult to expand to thousands of nodes without much performance loss.

SG: Your paper concludes that HotStuff is based on a new framework that builds a bridge between the classic BFT foundation and the blockchain. Help us understand this concept.

TY: Our paper is called “HotStuff: BFT Consensus in the Lens of Blockchain”.

The reason for this description is that its algorithm framework (which can produce multiple derived algorithms) uses a tree/chain structure, much like a blockchain. In addition, as in a traditional blockchain, each node has a branch that it currently considers the “main chain”, and it will only vote for new blocks that extend that main chain. Like a blockchain, if a side branch is “good” enough, it becomes the new main chain. In a blockchain, this is determined by the length of the chain (the longer one wins), and in HotStuff, it is determined by the block that most recently succeeded in getting the majority of the votes.
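The contrast between the two branch-selection rules can be sketched as follows. This is a simplified illustration, not HotStuff's actual data structures: each branch is assumed to be a list of hypothetical `(view, certified)` pairs, where `certified` means the block collected a majority of votes.

```python
def longest_branch(branches):
    """Blockchain-style rule: the branch with the most blocks wins."""
    return max(branches, key=len)

def most_recently_certified_branch(branches):
    """HotStuff-style rule (simplified): the branch whose newest
    vote-certified block has the highest view wins."""
    def latest_certified_view(branch):
        return max((view for view, certified in branch if certified), default=-1)
    return max(branches, key=latest_certified_view)

# Hypothetical example: branch_a is longer, but branch_b holds the newest certified block.
branch_a = [(1, True), (2, True), (3, False), (4, False)]
branch_b = [(1, True), (5, True)]

print(longest_branch([branch_a, branch_b]) is branch_a)                  # True
print(most_recently_certified_branch([branch_a, branch_b]) is branch_b)  # True
```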

On the other hand, HotStuff is a member of the classic Byzantine family. Its algorithm framework can explain PBFT, DLS, Tendermint, Casper and other protocols well, achieving a certain degree of generalization and unification. In addition, its biggest difference from other algorithms of the same type, and its biggest contribution, is that HotStuff’s core has no special case for committee change. Unlike PBFT, which has a “normal” execution flow and a “special” committee change process, HotStuff unifies both. That is, there is no explicit special treatment, and each step can be considered an implicit committee change. This makes it easy to write the basic safety part of a HotStuff-based consensus system. Compared to PBFT’s thousands of lines of code, HotStuff only needs tens to hundreds of lines.

Another feature that sets it above other algorithms of the same type is that it is very friendly to engineers. At the algorithmic level, it decouples the logic that guarantees correctness from the logic that guarantees performance. Once the dozens of lines that guarantee safety are written, any further optimization for a specific application scenario (including the committee change mechanism and policy) never touches that part again, so the system always remains safe.

SG: The classic PBFT algorithm can run in an asynchronous environment such as the Internet, and some optimizations make it faster than previous consensus algorithms. But it also has some problems. For example, detecting bad primary nodes and selecting new ones (view change) is very inefficient. Also, to reach consensus, PBFT requires on the order of n² message exchanges, which means that each computer must communicate with all other computers on the network. In short, the scalability of PBFT is clearly not enough. What solutions does HotStuff have for these problems?

TY: First, HotStuff reduced the cost of the committee change from quadratic to linear for the first time, which means it has the same level of complexity as Paxos/Raft, the non-Byzantine fault-tolerant protocols widely used in the IT industry. In addition, although in theory protocols such as Tendermint can be reduced to the same complexity by combining digital signatures, these protocols essentially require waiting for the maximum possible network delay between consecutive blocks, so that the system as actually implemented becomes a synchronous system. HotStuff breaks out of the original framework and proposes a minimalist algorithmic system that makes it easy to break this curse of classic BFT. In testing, it beat the fastest existing traditional BFT implementation with simpler code and lower theoretical complexity, and it has unlimited potential in commercial systems.
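To get a feel for the gap between quadratic and linear communication, here is a minimal counting sketch. It assumes a deliberately simplified model of a single voting round: in the all-to-all pattern every node sends its vote to every other node, while in the leader-relayed pattern the leader sends one message to each node and collects one reply from each. The function names are illustrative only.

```python
def all_to_all_messages(n):
    """Each of n nodes sends one message to every other node: n * (n - 1)."""
    return n * (n - 1)

def leader_relayed_messages(n):
    """Leader sends to n - 1 nodes and receives n - 1 replies: 2 * (n - 1)."""
    return 2 * (n - 1)

for n in (4, 100, 1000):
    print(n, all_to_all_messages(n), leader_relayed_messages(n))
```

With 100 nodes, this model gives 9,900 messages for the all-to-all round versus 198 for the leader-relayed round, and the gap widens as the committee grows.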

SG: Facebook’s Libra white paper suggests that the Libra blockchain starts as a “permissioned blockchain” and that the future goal is to become a permissionless network. From permissioned to permissionless, is there a viable technology path? Does the difficulty lie in the number of nodes (from 100 nodes to tens of thousands), or in nodes becoming malicious (Sybil attacks) rather than merely crashing or omitting messages?

TY: In theory, any permissioned protocol can be converted into a permissionless protocol, because classic consensus protocols (whether BFT or non-BFT) can be reconfigured through the consensus itself to add or remove nodes. But because of potential Sybil attacks, a protocol based on accurate member information requires an additional PoS or PoW admission mechanism to open up the system.

*End of the interview*

In addition to Facebook, Cypherium also uses the HotStuff consensus. Interestingly, we have provided an alternate solution to the transformation roadmap proposed by Facebook.

Unlike Libra’s future plans to transform into PoS, Cypherium’s main network will be designed as a hybrid consensus mechanism of PoW+HotStuff.

In general, blockchain consensus can be divided into two processes: electing a leader, and packaging and validating blocks. In classic projects, these two processes are implemented by the same consensus mechanism. For the first process, Cypherium chose PoW consensus to select leader nodes. Any computing device can be elected as a verification node for Cypherium through mining, without relying on trusted third parties. Whenever a miner successfully mines the PoW, the oldest node in the verification committee leaves the committee, and the new miner becomes a verification node, achieving a permanent dynamic rotation. For the second process, the more efficient HotStuff consensus was chosen to package and validate the blocks. Accordingly, Cypherium designed a double-chain architecture consisting of an election chain and a transaction chain. Cypherium’s consensus CypherBFT can achieve complete decentralization, transaction confirmation, and support for billions of users.
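The committee rotation described above behaves like a fixed-size first-in, first-out queue. The sketch below illustrates only that rotation rule; the node names and the function are hypothetical, and PoW verification is assumed to have happened elsewhere.

```python
from collections import deque

def rotate_committee(committee, new_miner):
    """Remove the longest-serving member and admit the miner that just solved the PoW.
    Returns the member that left the committee."""
    oldest = committee.popleft()   # oldest member sits at the left end
    committee.append(new_miner)    # newest member joins at the right end
    return oldest

# Hypothetical four-node committee, ordered from oldest to newest.
committee = deque(["node-a", "node-b", "node-c", "node-d"])
departed = rotate_committee(committee, "node-e")

print(departed)          # node-a
print(list(committee))   # ['node-b', 'node-c', 'node-d', 'node-e']
```

The committee size stays constant, and every successful PoW solution shifts membership by exactly one node.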

Appendix: HotStuff explained

In Cypherium, a view change, and as a consequence a leader change, occurs after each QC (quorum certificate: a certificate formed when the leader sends a cmd and at least (n-f) other validator nodes sign it).
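The (n-f) threshold can be illustrated with a small check. This is a sketch only: it counts distinct signer identities and assumes signature verification is done elsewhere; the function name and validator names are hypothetical and do not come from Cypherium's code.

```python
def has_quorum(signers, validators, f):
    """Return True if at least n - f distinct known validators signed,
    where n is the committee size and f the tolerated number of faulty nodes."""
    n = len(validators)
    valid_signers = set(signers) & set(validators)  # ignore duplicates and outsiders
    return len(valid_signers) >= n - f

# Hypothetical committee with n = 4 and f = 1, so 3 signatures are required.
validators = {"v-a", "v-b", "v-c", "v-d"}

print(has_quorum(["v-a", "v-b", "v-c"], validators, f=1))   # True
print(has_quorum(["v-a", "v-b", "v-b"], validators, f=1))   # False
```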

The 4 phases of the basic HotStuff protocol are essentially spread across four different views in chained HotStuff. See the graph below. The “pre-commit” phase of the proposal from V1 is carried out along with the “prepare” phase of the proposal from the leader of V2. The “commit” phase of the proposal from V1 occurs simultaneously with the “prepare” phase of the proposal from the leader of V3 and the “pre-commit” phase of the proposal from V2.
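This pipelining can be written down as a simple offset rule: a proposal made in view p is in phase number (v - p) during view v. The sketch below assumes the four phase names from the HotStuff paper (prepare, pre-commit, commit, decide) and that no view fails; the function name is illustrative.

```python
PHASES = ["prepare", "pre-commit", "commit", "decide"]

def phases_in_view(view, first_view=1):
    """Map each still-active proposal (keyed by the view it was proposed in)
    to the phase it goes through during `view`, assuming no view fails."""
    oldest = max(first_view, view - len(PHASES) + 1)
    return {proposal: PHASES[view - proposal] for proposal in range(oldest, view + 1)}

print(phases_in_view(3))  # {1: 'commit', 2: 'pre-commit', 3: 'prepare'}
print(phases_in_view(4))  # {1: 'decide', 2: 'commit', 3: 'pre-commit', 4: 'prepare'}
```

In this model, the proposal from view 1 reaches its last phase in view 4, which matches the pipeline described above.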

In other words, the cmd proposed by the leader of V1 is made final at V4 (the 4th view change along with the new committees).

Each view has two links: its parent, which is denoted by the black arrow in the image above, and its predecessor, which carries a valid QC. For v4, v5 and v6, the parent is the same as the predecessor, whereas the parent of v8 is v7 but its predecessor is v6.

Therefore, if a view, say v6, has parent == predecessor, the pair v5, v6 constitutes a one-chain. Since v5 also has parent == predecessor, v4, v5, v6 is a two-chain. Likewise, v4 has parent == predecessor, hence v3, v4, v5, v6 constitutes a three-chain.
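The chain rule can be sketched as a walk backwards through the views, counting consecutive links where parent == predecessor. The dictionaries below encode only the links stated in the text (v7's own links are not given, so it appears only as v8's parent); the function name is hypothetical.

```python
def chain_length(view, parent, predecessor, limit=3):
    """Count consecutive links with parent == predecessor, walking back from `view`.
    1 means a one-chain, 2 a two-chain, 3 a three-chain; 0 means no direct link."""
    length = 0
    current = view
    while (length < limit
           and current in parent
           and parent[current] == predecessor.get(current)):
        length += 1
        current = parent[current]
    return length

# Links taken from the description above.
parent      = {"v4": "v3", "v5": "v4", "v6": "v5", "v8": "v7"}
predecessor = {"v4": "v3", "v5": "v4", "v6": "v5", "v8": "v6"}

print(chain_length("v6", parent, predecessor))  # 3: v3, v4, v5, v6 form a three-chain
print(chain_length("v8", parent, predecessor))  # 0: parent v7 differs from predecessor v6
```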