An Analytic Model of the Performance of a Forked Bitcoin Blockchain with Two Block Size Limits

May 22, 2017

by tl121

May 25, 2017 / version 2

Problem Statement

If a majority of hash power wishes to extend the Bitcoin blockchain with a fork starting with a large block that a minority of hash power won’t accept, there is a risk that a run of bad luck may cause the large block fork to be swallowed up by a longer chain of small blocks created by the minority. This can happen because large block nodes consider small blocks valid and will switch to a longer small block chain. If this happens and the large block nodes persist, the situation will repeat until, eventually, a large block fork is created that is not superseded. Until then, the large block chain will be unreliable: users whose software accepts blocks on the longest chain will experience unreliable transaction confirmation, including the possibility that transactions with many confirmations are reversed. In addition, mining nodes attempting to create or extend the large block fork will incur large orphan costs.

It is clear that the risk of unreliable operation depends on the strength of the majority hash power behind the large block fork. This paper models this situation and provides simple analytic formulas based on the relative hash rate of both forks, yielding the probability that an attempt by the majority fork succeeds by permanently outrunning the minority fork. In addition, the paper gives formulas for the expected number of unsuccessful attempts, and the expected orphan cost incurred by nodes mining on the large block fork.

Assumptions

The following assumptions are made to create a simple analytical model.

1. Network propagation delay and verification delay are assumed to be very low and, accordingly, no orphan blocks are considered other than those occurring as the result of a lucky small block fork overtaking the large block fork.

2. There are only two types of nodes: small block nodes enforcing the present single small block size limit, and large block nodes enforcing a single new larger block size limit.

3. The fraction of total hash power assigned to each type of node remains constant during the period being modeled. The large block hash power exceeds the small block hash power.

4. All nodes build on the tip of the longest fork that they consider to be valid. Validity is determined according to the individual block size limit a node is enforcing. Thus small block nodes build only on the small block fork, while large block nodes build on the tip of whichever fork is longest. In the case where both forks are of equal length, large block nodes build on whichever tip block they have received first.

5. The difficulty remains constant during the period being modeled. This implies that a chain length in integer blocks suffices to determine which fork is longer. In particular, the generation of a single block by one node cannot transition a shorter fork to a longer fork. The most that it can accomplish is to transition a shorter fork to an equal length fork.

6. Any time the blockchain is not forked, there is a continual supply of transactions permitting any large block node to emit a large block, and all large block nodes will do this if they have the opportunity.

Model of Network Operation

With these assumptions the network can be modeled by the following state diagram.

In this diagram, State 0 represents the unforked starting state of the network as well as the renewal points that recur after each unsuccessful attempt to create a large block fork. States k, k > 0, represent the cases where the blockchain is forked, with the large block fork containing k - 1 more blocks than the small block fork. State 1 is special: it indicates that the chain is forked but that both forks are of equal length. The large block nodes are still mining on the large block fork because they saw a block on that fork earlier, when it was longer. State 1 is entered when a block is mined that causes the small block fork to catch up. By assumptions 1 and 4, the small block fork has to become longer than the large block fork to destroy it. This is indicated by the arc from state 1 to state 0, which means that the entire large block fork has been orphaned and the blockchain is no longer forked.
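The transitions described above can be written down as a small function. This is a minimal sketch based on my reading of the state description; the name `next_state` is illustrative, and the rule that a small block mined in state 0 leaves the chain unforked is an assumption inferred from the text rather than something stated explicitly.

```python
def next_state(state: int, large_block_mined: bool) -> int:
    """Return the state that follows `state` after one block is mined.

    State 0: the chain is not forked.
    State k > 0: the chain is forked and the large block fork is
    k - 1 blocks ahead of the small block fork (state 1 is a tie).
    """
    if state < 0:
        raise ValueError("state must be non-negative")
    if state == 0:
        # Assumption: a small block simply extends the common chain,
        # while a large block opens a fork that is one block ahead.
        return 2 if large_block_mined else 0
    # Forked: a large block widens the lead, a small block narrows it.
    # From the tie in state 1, a small block makes the small block fork
    # longer, which orphans the large block fork (1 -> 0).
    return state + 1 if large_block_mined else state - 1


# Example: fork opens, small fork catches up, then overtakes.
state = 0
for large in (True, False, False):
    state = next_state(state, large)
print(state)  # 0
```

Each block is a large block with probability equal to the large block share of hash power, so repeated calls with random inputs trace one path through the diagram.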

p is the fraction of total hash power held by large block nodes.

q is the fraction of total hash power held by small block nodes.

p + q = 1

0 < q < p < 1

Analysis

We view the history as a sequence of epochs. Each epoch is an attempt, starting at state 0, to fork a single chain. We call the epoch successful if it does not end in a return to state 0, i.e. it continues indefinitely. The probability of success follows from the well-known formula for the gambler’s ruin given by equation 2.4 in section XIV.2 of Feller [1], taking the limit as the variable a goes to infinity. In the referenced formula, the initial stake of the player is 2, which corresponds to the fact that once a large block fork is created the small block fork has to create one block to catch up and a second block to cause the fork attempt to fail.

q2 = (q/p)², which is the probability of failure.

p2 = 1 - (q/p)² = (p² - q²)/p² = (p - q)/p², which is the probability of success. The last step uses p + q = 1.
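As a quick check of the two forms of the success probability, the sketch below evaluates both with exact rational arithmetic. The function names are illustrative, and p = 3/4 is just an example value.

```python
from fractions import Fraction


def failure_probability(p: Fraction) -> Fraction:
    """q2 = (q/p)^2: probability that a fork attempt is eventually orphaned."""
    q = 1 - p
    if not 0 < q < p < 1:
        raise ValueError("the model requires 0 < q < p < 1")
    return (q / p) ** 2


def success_probability(p: Fraction) -> Fraction:
    """p2 = 1 - (q/p)^2: probability that a fork attempt is never overtaken."""
    return 1 - failure_probability(p)


p = Fraction(3, 4)
q = 1 - p
# The simplified form (p - q)/p^2 relies on p + q = 1.
assert success_probability(p) == (p - q) / p**2
print(failure_probability(p), success_probability(p))  # 1/9 8/9
```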

Next we calculate the expected number of unsuccessful attempts, based on p2 and q2.

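P(X = k) = (q2)^k (p2), for k = 0, 1, 2, ..., where X is the number of unsuccessful attempts that precede the first successful one.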

According to Feller, section IX.3 c), Negative binomial distribution, after substituting p2 and q2 for p and q,

E(X) = (q/p)² / ( 1 - (q/p)² ) = 1 / ( (p/q)² - 1 )
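The closed form for E(X) can be compared with a direct summation of k · P(X = k). This is only a sketch: the function names and the number of terms are my own choices, and the truncated sum converges slowly when p is close to 0.5.

```python
def expected_failed_attempts(p: float) -> float:
    """Closed form E(X) = 1 / ((p/q)^2 - 1)."""
    q = 1.0 - p
    return 1.0 / ((p / q) ** 2 - 1.0)


def expected_failed_attempts_by_sum(p: float, terms: int = 2000) -> float:
    """Truncated sum of k * q2^k * p2 over k = 0 .. terms - 1."""
    q = 1.0 - p
    q2 = (q / p) ** 2   # probability that one attempt fails
    p2 = 1.0 - q2       # probability that one attempt succeeds
    return sum(k * q2**k * p2 for k in range(terms))


p = 0.6
print(expected_failed_attempts(p), expected_failed_attempts_by_sum(p))
```

With p = 0.6 both values are 0.8 up to floating-point rounding, since (q/p)² = 4/9 and (4/9)/(5/9) = 0.8.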

Next, we calculate the expected number of blocks (both large and small) that are found during an unsuccessful attempt to build a stable large block chain. Here we use a form of the gambler’s ruin problem where the gambler is the small block player who wins when gaining two blocks against the persistent large block player. Here the roles of p and q are reversed. Using equation 3.4 of Feller XIV.3 and taking the limit as the variable a goes to infinity, we get the following formula for the duration of an unsuccessful epoch:

D2 = 2 / (p - q)

Next we calculate the expected number of orphan blocks on the large block chain at each unsuccessful attempt. First there is the starting block transitioning from state 0 to state 2, not counted in D2. Then there is a run of states from state 2 back to state 2 of length D2 - 2, followed by two transitions back to state 0. There are (D2 - 2)/2 large blocks mined in this run. This yields the expected number of large blocks orphaned in an unsuccessful run to be

E(Y) = 1 + (D2 - 2)/2 = D2/2 = 1 / (p - q)
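The duration formula can be checked numerically without relying on Feller's result. The sketch below propagates the probability mass of the walk step by step from state 2, records the mass that first reaches state 0 at each step, and averages the step count over failed epochs only. The function names and the 400-step cutoff are assumptions of this sketch; the cutoff is adequate for p = 0.75 but would need to grow as p approaches 0.5.

```python
def ruin_time_distribution(p: float, stake: int = 2, max_steps: int = 400) -> list:
    """ruin[n] = probability that the walk first reaches 0 at exactly step n.

    The walk starts at `stake`, moves +1 with probability p (large block)
    and -1 with probability 1 - p (small block).
    """
    q = 1.0 - p
    size = stake + max_steps + 2
    mass = [0.0] * size
    mass[stake] = 1.0
    ruin = [0.0] * (max_steps + 1)
    for n in range(1, max_steps + 1):
        new = [0.0] * size
        for pos in range(1, size - 1):
            m = mass[pos]
            if m:
                new[pos + 1] += m * p
                new[pos - 1] += m * q
        ruin[n] = new[0]   # mass absorbed at state 0 on this step
        new[0] = 0.0       # absorbed mass does not move again
        mass = new
    return ruin


def conditional_duration(p: float, max_steps: int = 400) -> float:
    """Expected number of blocks in an epoch, given that the epoch fails."""
    ruin = ruin_time_distribution(p, 2, max_steps)
    return sum(n * r for n, r in enumerate(ruin)) / sum(ruin)


p = 0.75
d2 = conditional_duration(p)
print(d2, 2 / (p - (1 - p)))   # numerical D2 next to 2 / (p - q)
print(d2 / 2)                  # E(Y), large blocks orphaned per failed attempt
```

As a side effect, the total absorbed mass `sum(ruin_time_distribution(p))` approximates the failure probability (q/p)².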

Finally, we put the pieces together to get the expected number of large blocks orphaned during the unsuccessful attempts. This is the product of the expected number of unsuccessful attempts and the expected number of large blocks orphaned during each unsuccessful attempt, because of the regeneration point at state 0 each time an attempt fails and the fact that the individual mining events constitute Bernoulli trials.

E(B) = E(X) E(Y) = ( 1 / ( (p/q)² - 1 ) ) ( 1 / (p - q) ) = q² / (p - q)²

Numerical Examples
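The formulas from the analysis can be tabulated with a short script. This is a minimal sketch: the function name `fork_metrics`, the dictionary keys, and the sample values of p are illustrative choices, not values taken from the original analysis.

```python
def fork_metrics(p: float) -> dict:
    """Evaluate the model's closed-form results for large block hash share p."""
    q = 1.0 - p
    if not 0.0 < q < p < 1.0:
        raise ValueError("the model requires 0 < q < p < 1")
    q2 = (q / p) ** 2            # probability that an attempt fails
    e_x = q2 / (1.0 - q2)        # expected number of failed attempts
    d2 = 2.0 / (p - q)           # expected blocks in a failed epoch
    e_y = d2 / 2.0               # large blocks orphaned per failed attempt
    return {
        "success": 1.0 - q2,
        "failed_attempts": e_x,
        "duration": d2,
        "orphans_per_attempt": e_y,
        "total_orphans": e_x * e_y,   # equals q^2 / (p - q)^2
    }


print(f"{'p':>6} {'p2':>8} {'E(X)':>8} {'D2':>8} {'E(Y)':>8} {'E(B)':>8}")
for p in (0.55, 0.60, 0.67, 0.75, 0.90):
    m = fork_metrics(p)
    print(f"{p:6.2f} {m['success']:8.4f} {m['failed_attempts']:8.4f} "
          f"{m['duration']:8.4f} {m['orphans_per_attempt']:8.4f} "
          f"{m['total_orphans']:8.4f}")
```

For instance, at p = 0.75 the formulas give p2 = 8/9, E(X) = 0.125, D2 = 4, E(Y) = 2 and E(B) = 0.25.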

Acknowledgments

I would like to thank Dr. Peter Rizun for reviewing an earlier draft of this paper.

Change History

Version 2. Changed references to Feller’s book to reflect the second edition, rather than the third edition, which is no longer available for free download on the web. References to page numbers were changed to section numbers for consistency with both editions.

References

[1] William Feller, An Introduction to Probability Theory and Its Applications, volume I, Second Edition, John Wiley & Sons, New York. Available on the web for free download at https://archive.org/details/AnIntroductionToProbabilityTheoryAndItsApplicationsVolume1

Appendix

A Python program was created to validate the model. It produced results consistent with the model for values of p >= 0.55.
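The validation program itself is not reproduced here. The following is an independent Monte Carlo sketch of how such a check might look, not the original program. It assumes that an attempt can be declared successful once the large block fork leads by `success_lead` blocks, since the chance of being overtaken from there, (q/p) raised to roughly that power, is negligible for the value of p used below. All names and parameter values are illustrative.

```python
import random


def simulate(p: float, attempts: int, seed: int = 1, success_lead: int = 30) -> dict:
    """Simulate fork attempts and estimate the model's quantities.

    Each attempt starts in state 2, right after the fork-opening large
    block. It fails on reaching state 0 and is treated as successful
    (an approximation) once the lead reaches `success_lead` blocks.
    """
    rng = random.Random(seed)
    failures = 0
    orphaned = 0
    for _ in range(attempts):
        state = 2
        large_blocks = 1                 # the fork-opening block
        while 0 < state <= success_lead:
            if rng.random() < p:         # large block mined
                state += 1
                large_blocks += 1
            else:                        # small block mined
                state -= 1
        if state == 0:
            failures += 1
            orphaned += large_blocks     # the whole large fork is orphaned
    successes = attempts - failures
    return {
        "failure_probability": failures / attempts,
        "orphans_per_failed_attempt": orphaned / failures if failures else 0.0,
        "orphans_per_success": orphaned / successes if successes else float("inf"),
    }


p = 0.75
q = 1.0 - p
estimate = simulate(p, attempts=20_000)
print("q2  ", estimate["failure_probability"], "model", (q / p) ** 2)
print("E(Y)", estimate["orphans_per_failed_attempt"], "model", 1 / (p - q))
print("E(B)", estimate["orphans_per_success"], "model", q**2 / (p - q) ** 2)
```

Dividing the total number of orphaned large blocks by the number of successful attempts estimates E(B) directly, because every success is preceded on average by E(X) failures. For p closer to 0.5, `success_lead` and `attempts` would both need to be larger.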