This article is only concerned with scaling on layer 1 of the network. Second layer scaling solutions are another topic entirely.

Scaling is one of the more popular words in the greater blockchain buzzword bingo. A lot of people demand it, but few understand the intricacies involved in successfully implementing a lasting solution. Scaling is not as simple as increasing block size and quickening block generation. Transactions have weight, require computation and need to be communicated to all nodes, and the blockchains that include them require storage for their ever-growing state. Scalability is therefore heavily dependent on how the specific blockchain handles transactions and node communication.

In general, one of the core concepts of cryptocurrency, trustless verification, causes the most trouble for scaling. In protocols such as Bitcoin and Ethereum, each full node stores the full state (account balances, contract code and the entire transaction history) in order to trustlessly verify the validity of every transaction in the network. This provides a large amount of security, but it means the network cannot process more transactions than any single node can. This is the biggest factor in setting the ceiling for transaction volume at ~3–7 transactions per second (TPS) for Bitcoin or 7–15 TPS for Ethereum. Visa processes ~2,000 transactions per second, and an IoT network with an estimated 50 billion devices could produce several times that. That’s a large gap.

So if we want these networks to grow into the real-world, usable networks we all envision, we have to find other ways to scale. Second layer solutions that batch many transactions into a single on-chain settlement, like Lightning, are a valid path and one that even Satoshi conceived for Bitcoin.* But more can be done on the first layer to raise its throughput, either pushing out the need for more extreme first or second layer solutions or making them a natural evolution within the network.

Defining the Problem

We are going to talk about the problem in terms of pre-SegWit Bitcoin, as SegWit complicates things and turns the discussion from the easy-to-understand block size to the less intuitive block weight.

What prevents a network from processing more transactions? There are two types of constraints: physical resource constraints and software constraints. Data communication is limited by the speed of light, bandwidth determines how much data can be sent in a given time, CPUs limit how much processing can be accomplished, and the blockchain keeps growing. On top of all of that, the network has to remain secure and resilient against attack.

At the very base of the problem we have network latency: how long it takes for data to travel. This depends on the size of the transaction, but in most global networks it is about 2–3 seconds. Currently, transactions are about 500 bytes for Bitcoin and 150 bytes for Ethereum, so if we want to compete with Visa at 2,000 transactions per second, the network must be able to communicate at about 8 megabits per second (500 bytes × 8 bits × 2,000). This is a relatively common Internet speed today.

At this theoretical rate, the nodes in the network must be able to process 500 bytes × 2,000 TPS = 1 MB of transactions per second. Processing a transaction involves hashing and ECDSA signature verification. RIPEMD-160 and SHA256 (both hash algorithms) run at about 100 megabytes per second, so the hashing for 2,000 transactions could be done in about 10 milliseconds, fast enough that we don’t need to worry about it.
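The bandwidth and hashing figures above are back-of-the-envelope arithmetic and can be reproduced in a few lines. This is a minimal sketch using only the numbers quoted in the text (500-byte transactions, 2,000 TPS, roughly 100 MB/s of hashing). The function names are hypothetical, and the cost of signature verification is not modeled.

```python
def data_rate_bytes(tx_bytes: int, tps: int) -> int:
    """Bytes of transaction data the network must move each second."""
    return tx_bytes * tps


def required_mbps(tx_bytes: int, tps: int) -> float:
    """Bandwidth needed, in megabits per second."""
    return data_rate_bytes(tx_bytes, tps) * 8 / 1_000_000


def hashing_time_ms(tx_bytes: int, tps: int, hash_mb_per_s: float) -> float:
    """Milliseconds needed to hash one second's worth of transactions."""
    megabytes = data_rate_bytes(tx_bytes, tps) / 1_000_000
    return megabytes / hash_mb_per_s * 1000


print(data_rate_bytes(500, 2_000))       # 1000000 bytes, i.e. 1 MB per second
print(required_mbps(500, 2_000))         # 8.0 megabits per second
print(hashing_time_ms(500, 2_000, 100))  # roughly 10 ms
```

Swapping in the 150-byte Ethereum transaction size shows how strongly the bandwidth requirement depends on transaction size.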

A Bitcoin node does other things beyond verifying transactions, but verification takes up the majority of the processing power required to run a node.

Ethereum is a little different in that it doesn’t have a traditional block size but a per-block gas limit. Gas is the unit used to pay for the computation required to process a transaction. The gas limit is a consistently reevaluated value based on the current processing, storage and bandwidth conditions of the network. This is important because not every transaction simply transfers an asset; some call a contract or carry data to it, which requires the network to do some level of extra computation.
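To make the gas limit concrete, here is a small sketch of how it caps the number of transactions in a block. The 21,000 gas figure is the base cost of a plain ether transfer. The block gas limit of 8,000,000 and the 100,000-gas contract call are assumed values chosen only for illustration, not figures from this article.

```python
SIMPLE_TRANSFER_GAS = 21_000      # base gas cost of a plain ether transfer
ASSUMED_BLOCK_GAS_LIMIT = 8_000_000  # assumption for illustration only
ASSUMED_CONTRACT_CALL_GAS = 100_000  # assumption: a moderately complex contract call


def max_txs_per_block(block_gas_limit: int, gas_per_tx: int) -> int:
    """How many transactions of a given gas cost fit under the block gas limit."""
    return block_gas_limit // gas_per_tx


print(max_txs_per_block(ASSUMED_BLOCK_GAS_LIMIT, SIMPLE_TRANSFER_GAS))        # 380
print(max_txs_per_block(ASSUMED_BLOCK_GAS_LIMIT, ASSUMED_CONTRACT_CALL_GAS))  # 80
```

The heavier the computation a transaction asks for, the fewer such transactions fit in a block, regardless of how many bytes they occupy.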

The only resource concern that an increased transaction rate brings, then, is increased node storage requirements, which may lead to centralization of nodes as equipment costs rise. If the rate of transactions increases by way of a larger block size or a faster block generation rate, the blockchain grows at a corresponding rate. Currently, the Bitcoin blockchain grows at a max rate of about 1 GB a week (about 4 GB a week for SegWit-enabled Bitcoin). If the block size is doubled, that rate would be 2 GB a week; tripled, it would be 3 GB a week, and so on. (Imagine if Bitcoin Cash, with its 32 MB blocks, were filled. The blockchain would grow at a rate of about 4.5 GB per day.) Increasing the block production rate has the same effect. This poses a node centralization risk if it is done too early, putting node storage requirements out of financial reach for common people. So while it is something to keep in mind when timing certain scaling techniques relative to technology cost, it is not a base concern.
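The growth rates above follow from block size and block interval alone. A minimal sketch, assuming every block is completely full and a 10-minute block interval unless stated otherwise (function names are hypothetical, and 1 GB is taken as 1,000 MB):

```python
MINUTES_PER_DAY = 24 * 60


def growth_mb_per_day(block_size_mb: float, block_interval_min: float = 10) -> float:
    """Chain growth per day in MB if every block is full."""
    blocks_per_day = MINUTES_PER_DAY / block_interval_min
    return block_size_mb * blocks_per_day


def growth_gb_per_week(block_size_mb: float, block_interval_min: float = 10) -> float:
    """Chain growth per week in GB if every block is full."""
    return growth_mb_per_day(block_size_mb, block_interval_min) * 7 / 1000


print(growth_gb_per_week(1))    # about 1 GB a week for full 1 MB blocks
print(growth_gb_per_week(2))    # about 2 GB a week
print(growth_mb_per_day(32))    # 4608.0 MB a day for full 32 MB blocks
print(growth_mb_per_day(1, 5))  # 288.0, the same as doubling the block size
```

The last line illustrates why halving the block interval and doubling the block size are equivalent from a storage point of view.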

As we saw above, none of the operations involved in processing a transaction, from propagation through the network to hashing and verifying, pose any real physical resource constraint on transaction throughput in our current networks. What currently stops us from going faster is the size of the transactions, the size of a block (how many transactions you can fit in it), how often a block gets added to the chain, and the mechanism by which the nodes collaborate to add transactions to the chain.
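The first three of those software limits combine into a simple throughput ceiling. The sketch below assumes a 1 MB block and a 10-minute interval, as in pre-SegWit Bitcoin, with the 500-byte transaction size used earlier. The 250-byte case is an assumed smaller transaction, included only to show the sensitivity.

```python
def max_tps(block_size_bytes: int, avg_tx_bytes: int, block_interval_s: int) -> float:
    """Upper bound on transactions per second for a given block size and interval."""
    txs_per_block = block_size_bytes // avg_tx_bytes
    return txs_per_block / block_interval_s


print(round(max_tps(1_000_000, 500, 600), 2))  # 3.33
print(round(max_tps(1_000_000, 250, 600), 2))  # 6.67
```

Both results fall inside the ~3–7 TPS range quoted for Bitcoin, which suggests the ceiling comes from these parameters rather than from hardware.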

Anatomy of a Transaction

Bitcoin and Ethereum transactions are fundamentally different, but the data within is relatively similar. In a UTXO framework like Bitcoin, the transaction has two sections: the input data and the output data. In an account-based network like Ethereum, a transaction does not spend specific previous outputs; instead, the sending account simply needs enough balance to cover it. This makes Ethereum transactions quite a bit smaller than Bitcoin transactions and fairly consistent in size, whereas Bitcoin transaction size is heavily dependent on the number of inputs and outputs included in the transaction.

In the most simplified Bitcoin transaction, pay-to-public-key-hash, where you are sending as an output the entirety of an input with no change address required, there are two main chunks of data: the input and the output. Between them we need the signature, the public key, the previous unspent output, the new output public key hash and the amount (with some little tidbits here and there). The transaction looks like this (though not this nicely laid out):

<Version>

01000000

<Input No.>

01

<Unspent TX Output>

be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396

<Output Index>

00000000

<scriptSig:scriptPubKey>

19 76 a9 14 dd6cce9f255a8cc17bda8ba0373df8e861cb866e 88 ac

<Sequence>

ffffffff

<Output No.>

01

<Tx Amount in Little Endian>

23ce010000000000

<Output Script>

19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac

<Lock Time>

00000000

<Hash Code Type>

01000000

and without the nice titles and spacing:

01000000 01 be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396 00000000 19 76 a9 14 dd6cce9f255a8cc17bda8ba0373df8e861cb866e 88 ac ffffffff 01 23ce010000000000 19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac 00000000 01000000
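The layout above can be walked field by field in code. The following is a minimal sketch of a parser for exactly this shape of data (one input, one output, with the hash code type appended). It is not a general Bitcoin transaction parser: it assumes every count and script length fits in a single byte, and the function and field names are hypothetical.

```python
UNSIGNED_HEX = (
    "01000000 01 "
    "be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396 "
    "00000000 "
    "19 76 a9 14 dd6cce9f255a8cc17bda8ba0373df8e861cb866e 88 ac "
    "ffffffff 01 23ce010000000000 "
    "19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac "
    "00000000 01000000"
)


def parse_fields(raw: bytes) -> dict:
    """Split a one-input, one-output transaction (plus hash code type) into fields."""
    pos = 0

    def take(n: int) -> bytes:
        # Return the next n bytes and advance the read position.
        nonlocal pos
        chunk = raw[pos:pos + n]
        pos += n
        return chunk

    fields = {}
    fields["version"] = int.from_bytes(take(4), "little")
    fields["input_count"] = take(1)[0]
    fields["unspent_tx_output"] = take(32).hex()
    fields["output_index"] = int.from_bytes(take(4), "little")
    fields["input_script"] = take(take(1)[0]).hex()   # length byte, then script
    fields["sequence"] = take(4).hex()
    fields["output_count"] = take(1)[0]
    fields["amount_satoshi"] = int.from_bytes(take(8), "little")
    fields["output_script"] = take(take(1)[0]).hex()  # length byte, then script
    fields["lock_time"] = int.from_bytes(take(4), "little")
    fields["hash_code_type"] = int.from_bytes(take(4), "little")
    if pos != len(raw):
        raise ValueError("unexpected trailing bytes")
    return fields


fields = parse_fields(bytes.fromhex(UNSIGNED_HEX))
for name, value in fields.items():
    print(f"{name}: {value}")

# The amount is stored little-endian in satoshi (1 BTC = 100,000,000 satoshi).
print(f"{fields['amount_satoshi'] / 100_000_000:.8f} BTC")  # 0.00118307 BTC
```

Reading the amount as a little-endian integer gives 118307 satoshi, which is why the bytes appear reversed compared to how the number would normally be written in hex.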

Once all of that is set, we have to sign the transaction to show that we own the address of the output being spent. The signature data consists of a DER encoded signature generated with the private key, followed by the corresponding public key. We use this to replace the <scriptSig:scriptPubKey> portion of the input. The four-byte hash code type is dropped from the end, and a one-byte hash type (01) is appended to the signature instead. This completes our transaction for 0.00118307 BTC:

01000000

01

be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396

00000000

8c

49 3046022100cf4d7571dd47a4d47f5cb767d54d6702530a3555726b27b6ac56117f5e7808fe0221008cbb42233bb04d7f28a715cf7c938e238afde90207e9d103dd9018e12cb7180e 01

41 042daa93315eebbe2cb9b5c3505df4c6fb6caca8b756786098567550d4820c09db988fe9997d049d687292f815ccd6e7fb5c1b1a91137999818d17c73d0f80aef9

ffffffff

01

23ce010000000000

19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac

00000000

Or:

01000000 01 be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396 00000000 8c 49 3046022100cf4d7571dd47a4d47f5cb767d54d6702530a3555726b27b6ac56117f5e7808fe0221008cbb42233bb04d7f28a715cf7c938e238afde90207e9d103dd9018e12cb7180e 01 41 042daa93315eebbe2cb9b5c3505df4c6fb6caca8b756786098567550d4820c09db988fe9997d049d687292f815ccd6e7fb5c1b1a91137999818d17c73d0f80aef9 ffffffff 01 23ce010000000000 19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac 00000000

You can see that once the transaction is signed, the signature script (the signature plus the public key) makes up a little over 60% of the transaction data: 140 of 225 bytes. This would be a good place to work in order to reduce transaction weight. This is in fact what SegWit does: it segregates the witness (the signature data) from the rest of the transaction and commits to it through a separate merkle tree whose root is stored in the coinbase transaction. Witness bytes count for only a quarter of the weight of other bytes, and because blocks are now limited by weight rather than by size, the maximum size of a block effectively quadrupled.
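The signature share can be checked directly against the signed transaction above. This sketch simply reads the script length byte of the single input and compares it with the total size. The offset is specific to this one-input layout, and the variable names are hypothetical.

```python
SIGNED_TX_HEX = (
    "01000000 01 "
    "be66e10da854e7aea9338c1f91cd489768d1d6d7189f586d7a3613f2a24d5396 "
    "00000000 8c 49 "
    "3046022100cf4d7571dd47a4d47f5cb767d54d6702530a3555726b27b6ac56117f5e7808fe"
    "0221008cbb42233bb04d7f28a715cf7c938e238afde90207e9d103dd9018e12cb7180e "
    "01 41 "
    "042daa93315eebbe2cb9b5c3505df4c6fb6caca8b756786098567550d4820c09db"
    "988fe9997d049d687292f815ccd6e7fb5c1b1a91137999818d17c73d0f80aef9 "
    "ffffffff 01 23ce010000000000 "
    "19 76 a9 14 a2fd2e039a86dbcf0e1a664729e09e8007f89510 88 ac "
    "00000000"
)

raw = bytes.fromhex(SIGNED_TX_HEX)

# Version (4) + input count (1) + previous txid (32) + output index (4)
# precede the one-byte length of the signature script.
SCRIPT_LEN_OFFSET = 4 + 1 + 32 + 4
script_sig_len = raw[SCRIPT_LEN_OFFSET]

share = script_sig_len / len(raw)
print(len(raw), script_sig_len, f"{share:.1%}")  # 225 140 62.2%
```

The result, roughly 62%, is in line with the rough share quoted above for the signature data.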

This transaction format also grows rapidly with the number of inputs and outputs. If the transaction requires two inputs (to cover the amount of BTC sent) and two outputs (one for a change address), as is most common, the transaction size roughly doubles. Transactions for larger sums that combine a large number of small inputs therefore become quite heavy. In fact, this is an effective way to attack the network and clog blocks. This was, arguably, what occurred during the scaling debate in which Bitcoin Cash tried to overthrow Bitcoin for the top spot. Transactions were sent for relatively small amounts but contained many inputs, making them so heavy that the number of transactions that could be included in a block was significantly reduced. In several instances, this took the average number of transactions in a block from over 2,000 to under 100. This is still an issue today, but one that has a proposed solution.
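The growth with inputs and outputs can be estimated from the field sizes of the example transaction. This is a rough sketch under stated assumptions: every input carries a 140-byte signature script like the one in the example, every output is a 25-byte pay-to-public-key-hash script, counts fit in one byte (fewer than 253 inputs or outputs), and blocks are limited to 1 MB. Real transactions vary, and the names are hypothetical.

```python
OVERHEAD_BYTES = 4 + 1 + 1 + 4      # version, input count, output count, lock time
INPUT_BYTES = 32 + 4 + 1 + 140 + 4  # txid, index, script length, scriptSig, sequence
OUTPUT_BYTES = 8 + 1 + 25           # amount, script length, output script


def estimate_tx_bytes(num_inputs: int, num_outputs: int) -> int:
    """Approximate serialized size of a transaction shaped like the example."""
    return OVERHEAD_BYTES + num_inputs * INPUT_BYTES + num_outputs * OUTPUT_BYTES


def txs_per_block(num_inputs: int, num_outputs: int, block_bytes: int = 1_000_000) -> int:
    """How many such transactions fit in a block of the given size."""
    return block_bytes // estimate_tx_bytes(num_inputs, num_outputs)


print(estimate_tx_bytes(1, 1))  # 225, matching the example transaction
print(estimate_tx_bytes(2, 2))  # 440, roughly double
print(txs_per_block(1, 1))      # 4444
print(txs_per_block(100, 1))    # 55
```

Under these assumptions, a block filled with 100-input transactions holds only a few dozen of them, which is the clogging effect described above.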

Schnorr signatures are a proposed replacement for Bitcoin’s current Elliptic Curve Digital Signature Algorithm that essentially allows keys to be combined to produce one aggregate signature for many inputs. Instead of providing a signature for every input, a transaction would carry a single combined signature for all inputs, thereby reducing the weight of the transaction and nullifying the attack vector.
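The potential saving can be illustrated with simple arithmetic. This is not an implementation of Schnorr signatures. It is a size estimate only, under the loudly stated assumption that one aggregate signature would take about as much space as one of today's 73-byte signatures plus its push byte. The actual encoding of an aggregate signature may differ, and the names are hypothetical.

```python
# Assumption: an aggregate signature occupies about the same space as one
# signature in the example transaction (1 push byte + 73 signature bytes).
SIGNATURE_PUSH_BYTES = 1 + 73


def signature_bytes(num_inputs: int, aggregated: bool) -> int:
    """Total signature bytes with one signature per input, or one for all inputs."""
    count = 1 if aggregated else num_inputs
    return count * SIGNATURE_PUSH_BYTES


def bytes_saved(num_inputs: int) -> int:
    """Bytes saved by replacing per-input signatures with one aggregate."""
    return signature_bytes(num_inputs, False) - signature_bytes(num_inputs, True)


print(bytes_saved(1))    # 0
print(bytes_saved(2))    # 74
print(bytes_saved(100))  # 7326
```

The saving grows linearly with the number of inputs, so it matters most for exactly the many-input transactions used to clog blocks.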