An attempt at a simple mathematical model for quantifying bitcoin mining centralization pressure due to a bandwidth bottleneck

peoplma, Dec 17, 2015

In the block size debate we frequently hear that large blocks will centralize mining even further. The reasoning is that the larger a block is, the longer it takes to propagate to other miners and the more likely it is to be orphaned. High bandwidth miners have an advantage because they can propagate blocks faster. That makes sense, but I haven’t seen anyone try to quantify the effect.

The formula below is derived from the idea that there must be a minimum fee a transaction has to pay before a miner will be willing to include it in a block. Whether it gets included depends on the size of the transaction in bytes, the fee it pays, and the upload bandwidth of the miner. If the expected cost of the extra orphan risk from including a transaction (called marginal risk) exceeds the fee gained, then the miner would be unwise to include it. I don’t want to get too far into the equation here because I want to get to the results, but read this thread if you want to see how it came about, and I’d be happy to answer questions (big thanks to /u/ajtowns for coming up with the framework).

[Image: the mining model formula]

Or in copyable form: f = r((s/b)(y/t)) / (1 - n((s/b)(y/t)))

f = minimum fee a transaction must pay to get included. This also equals the marginal orphan risk the miner takes on by including it. (BTC)

r = block reward (BTC)

s = size of transaction (Bytes)

b = upload bandwidth of miner (Bytes per second)

y = number of peers miner will propagate the block to

t = network average time to find a block (Seconds)

n = total number of transactions in the block.
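As a rough illustration, the formula and the variable definitions above can be written as a small Python function. This is only a sketch: the function name `min_fee` and the guard against a zero or negative denominator are my own additions, not part of the original spreadsheet. The example values are the ones used for the 72Mbps miner in this article.

```python
def min_fee(r, s, b, y, t, n):
    """Minimum fee in BTC: f = r*x / (1 - n*x), where x = (s/b)*(y/t)."""
    # x is the orphan risk added by a single transaction of s bytes.
    x = (s / b) * (y / t)
    remaining = 1.0 - n * x
    if remaining <= 0:
        # Assumption: treat a block whose orphan rate reaches 100% as invalid input.
        raise ValueError("n transactions push the orphan rate to 100% or more")
    return r * x / remaining


# Example: 25 BTC reward, 500-byte transactions, 9,000,000 bytes/s upload,
# 8 peers, 600-second blocks, 96,448 transactions in the block.
f = min_fee(r=25, s=500, b=9_000_000, y=8, t=600, n=96_448)
print(f"minimum fee: {f * 1e8:.0f} satoshis")
```

With these inputs the minimum fee comes out a little under 2,000 satoshis, which is why this miner can afford to include almost every transaction on offer.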

This formula, along with a hypothetical set of transactions for a miner to choose from, can be used to predict mining profitability under a diverse set of scenarios. I chose to test a condition where the max block size is 50MB, average transaction size is 500 bytes, block reward is 25BTC, miners propagate to 8 peers and blocks are found every 600 seconds on average. I tested three hypothetical miners that each controlled 33% of the hashrate; the only difference between them was their upload bandwidth. The first had 8Mbps (b = 1,000,000), the second 24Mbps (b = 3,000,000), and the third 72Mbps (b = 9,000,000). Then I calculated f for every block size up to 50MB (which in this model contains 100,000 transactions of 500 bytes each).

Then I calculated the orphan rate, in percent, as 100(ns/b)(y/t).
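The orphan rate expression is simple enough to check by hand, but a short Python sketch makes it easy to try other values. The function name `orphan_rate_percent` is hypothetical; the three transaction counts are the ones reported for each miner in the results further down.

```python
def orphan_rate_percent(n, s, b, y, t):
    """Expected orphan rate in percent: 100 * (n*s/b) * (y/t)."""
    return 100 * (n * s / b) * (y / t)


# (label, upload bandwidth in bytes/s, transactions included)
miners = [
    ("72Mbps", 9_000_000, 96_448),
    ("24Mbps", 3_000_000, 90_424),
    ("8Mbps", 1_000_000, 39_639),
]
for label, b, n in miners:
    rate = orphan_rate_percent(n, s=500, b=b, y=8, t=600)
    print(f"{label}: {rate:.3f}%")
```

These three cases reproduce the 7.144%, 20.094% and 26.426% orphan rates quoted in the results.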

Then I generated an artificial list of 100,000 transaction fees for the miners to choose from. I chose a normal distribution centered on a fee of 20,000 satoshis with a standard deviation of 10,000 satoshis. Any fee that came out negative I set to 0. I don’t know what real fee distributions look like, and for demonstration purposes generating a set was easier than scraping real blockchain data. That could absolutely be done as well for more realism (and would be super interesting).
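A fee list like this can be generated in a few lines of Python instead of a spreadsheet. This is a sketch under my own assumptions: the function name `generate_fees`, the fixed seed and the rounding to whole satoshis are not from the original work, so the resulting list will not be identical to the one used for the results here.

```python
import random


def generate_fees(count=100_000, mean=20_000, stdev=10_000, seed=0):
    """Return `count` fees in satoshis drawn from a normal distribution.

    Negative draws are set to 0, as described above. The seed only makes
    the example repeatable; it is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [max(0, round(rng.gauss(mean, stdev))) for _ in range(count)]


fees = generate_fees()
print(len(fees), "fees, lowest", min(fees), "satoshis")
print("total on offer:", sum(fees) / 1e8, "BTC")
```

Because a few percent of the draws are clamped to zero, the average fee ends up slightly above 20,000 satoshis.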

Then, through a series of IF statements in Excel and manual iterations, I chose the transactions that would be profitable for each miner to include in their block using the f values obtained in the first graph (transactions that pay a fee < f are not profitable, because the orphan risk taken on by including them is larger than the transaction fee gained). I calculated the sum total of fees for each miner and their annual revenue, assuming this was an average block, block rewards are 25 BTC, and each miner has 33.3% hashrate.
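The selection step can also be sketched in Python. This is my reading of the procedure, not a port of the spreadsheet: it sorts the fees from highest to lowest and keeps adding transactions until the next fee falls below f for the block size that would result. The name `select_transactions` and the seeded fee list are assumptions made for the example.

```python
import random

SATOSHIS_PER_BTC = 100_000_000


def select_transactions(fees_sat, r, s, b, y, t):
    """Return the fees (satoshis) a rational miner would include.

    Assumed greedy rule: take fees in descending order and stop at the
    first one that is below the minimum fee f for a block of that size.
    """
    x = (s / b) * (y / t)  # orphan risk added per transaction
    chosen = []
    for n, fee in enumerate(sorted(fees_sat, reverse=True), start=1):
        remaining = 1.0 - n * x
        if remaining <= 0:
            break  # orphan rate would reach 100%
        f_sat = r * x / remaining * SATOSHIS_PER_BTC
        if fee < f_sat:
            break  # this fee no longer covers the marginal risk
        chosen.append(fee)
    return chosen


# Artificial fee list as described above (seed is an assumption).
rng = random.Random(0)
fees = [max(0, round(rng.gauss(20_000, 10_000))) for _ in range(100_000)]

for label, b in [("72Mbps", 9_000_000), ("24Mbps", 3_000_000), ("8Mbps", 1_000_000)]:
    block = select_transactions(fees, r=25, s=500, b=b, y=8, t=600)
    total_btc = sum(block) / SATOSHIS_PER_BTC
    print(f"{label}: {len(block)} transactions, {total_btc:.3f} BTC in fees")
```

The counts should land in the same neighbourhood as the spreadsheet results but will not match them exactly, since the random fee list is different.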

The 72Mbps miner included 96,448 transactions for a total block reward of 25+20.074 BTC and an annual revenue of 752,009.4 BTC. Expected orphan rate is 7.144% (which is factored into the annual revenue).

The 24Mbps miner included 90,424 transactions for a total block reward of 25+19.784 BTC and an annual revenue of 675,020.1 BTC. Expected orphan rate is 20.094% (which is factored into the annual revenue). This is 14.51% less than the high bandwidth miner.

The 8Mbps miner included 39,639 transactions for a total block reward of 25+11.776 BTC and an annual revenue of 530,751.2 BTC. Expected orphan rate is 26.426% (which is factored into the annual revenue). This is 29.42% less than the high bandwidth miner.

Clearly there is a large advantage to high bandwidth mining in a 50MB max block size scenario with today’s range of home bandwidth. Miners would have to assess whether it is better to centralize (move their equipment) or simply upgrade their internet.

This model offers a useful way to estimate centralization pressure under basically endless mining scenarios. Fee markets could be assessed by setting f equal to expected average fees and calculating optimal b and n. More miners could be added with a range of bandwidths. Max block size can be set by limiting n and setting an average s (as I did here). Even block times other than 10 minutes can be explored. y is an important variable which scales everything linearly, and I assumed y = 8 for this exercise. It would be really interesting to scrape blocks by miner and see if their bandwidth can be estimated from the minimum transaction fee they include (f), presuming they are rational about which transactions are included.
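To make the last idea concrete: the formula can be rearranged to give b from an observed minimum fee. Writing x = (s/b)(y/t), the formula f = rx/(1 - nx) gives x = f/(r + nf), and so b = sy/(tx). The Python sketch below does that inversion and checks it with a round trip. The function names are hypothetical, and the estimate assumes y, s and t are known and that the miner follows this model exactly.

```python
def min_fee(r, s, b, y, t, n):
    """Minimum fee in BTC from the model: f = r*x / (1 - n*x)."""
    x = (s / b) * (y / t)
    return r * x / (1.0 - n * x)


def estimate_bandwidth(f, r, s, y, t, n):
    """Upload bandwidth (bytes/s) implied by a minimum included fee f (BTC).

    Derived by solving f = r*x / (1 - n*x) for x, then x = (s/b)*(y/t) for b.
    """
    if f <= 0:
        raise ValueError("f must be positive to imply a finite bandwidth")
    x = f / (r + n * f)
    return s * y / (t * x)


# Round trip: start from a known bandwidth, compute f, then recover b.
f = min_fee(r=25, s=500, b=9_000_000, y=8, t=600, n=96_448)
b = estimate_bandwidth(f, r=25, s=500, y=8, t=600, n=96_448)
print(f"recovered bandwidth: {b:,.0f} bytes per second")
```

On real blocks the lowest fee included would only give a rough estimate, since miners may apply other policies on top of orphan risk.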

This model does not take into account latency between miners, SPV mining, or optimizations that can be made with the fast relay network, weak blocks, thin blocks, or invertible bloom lookup tables (IBLTs), although those parameters could be added without too much trouble and a few assumptions (by someone smarter than me).

I hope this is a useful tool for the community as it tries to come to consensus on the block size issue. Thanks for reading :)

Edit: Here is the Excel spreadsheet I used for this work (hope it makes sense):

https://drive.google.com/file/d/0B6gibYZThF8helFCYnF4NERfQmM/view?usp=sharing