Scalability

One common concern about Ethereum is the issue of scalability. Like Bitcoin, Ethereum suffers from the flaw

that every transaction needs to be processed by every node in the network. With Bitcoin, the size of the current

blockchain rests at about 20 GB, growing by about 1 MB per hour. If the Bitcoin network were to process Visa's

2000 transactions per second, it would grow by 1 MB per three seconds (about 1.2 GB per hour, 10 TB per year).

Ethereum is likely to suffer a similar growth pattern, worsened by the fact that there will be many applications

on top of the Ethereum blockchain instead of just a currency as is the case with Bitcoin, but ameliorated by the

fact that Ethereum full nodes need to store just the state instead of the entire blockchain history.
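
The growth figures above are a unit conversion, which can be checked with a short calculation. The sketch below assumes decimal units (1 GB = 1000 MB) and a 365-day year; the function name `growth` is illustrative and not part of any Ethereum or Bitcoin software.

```python
# Back-of-the-envelope blockchain growth, in decimal units (1 GB = 1000 MB).
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year


def growth(mb_per_interval, interval_seconds):
    """Return (GB per hour, TB per year) for a chain that grows by
    mb_per_interval megabytes every interval_seconds seconds."""
    mb_per_hour = mb_per_interval * SECONDS_PER_HOUR / interval_seconds
    gb_per_hour = mb_per_hour / 1000
    tb_per_year = gb_per_hour * HOURS_PER_YEAR / 1000
    return gb_per_hour, tb_per_year


# Bitcoin at the rate quoted in the text: about 1 MB per hour.
bitcoin_now = growth(1, 3600)

# Hypothetical Visa-scale load: 1 MB every three seconds.
visa_scale = growth(1, 3)
```

Under these assumptions, 1 MB every three seconds works out to about 1.2 GB per hour and about 10.5 TB per year, roughly a thousand times the current Bitcoin rate.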

The problem with such a large blockchain size is centralization risk. If the blockchain size increases to, say,

100 TB, then the likely scenario would be that only a very small number of large businesses would run full

nodes, with all regular users using light SPV nodes. In such a situation, there arises the potential concern that

the full nodes could band together and all agree to cheat in some profitable fashion (e.g. change the block

reward, give themselves BTC). Light nodes would have no way of detecting this immediately. Of course, at

least one honest full node would likely exist, and after a few hours information about the fraud would trickle out

through channels like Reddit, but at that point it would be too late: it would be up to the ordinary users to

organize an effort to blacklist the given blocks, a massive and likely infeasible coordination problem on a

similar scale to that of pulling off a successful 51% attack. In the case of Bitcoin, this is currently a problem,

but there exists a blockchain modification suggested by Peter Todd which will alleviate this issue.

In the near term, Ethereum will use two additional strategies to cope with this problem. First, because of the

blockchain-based mining algorithms, at least every miner will be forced to be a full node, creating a lower

bound on the number of full nodes. Second and more importantly, however, we will include an intermediate

state tree root in the blockchain after processing each transaction. Even if block validation is centralized, as

long as one honest verifying node exists, the centralization problem can be circumvented via a verification

protocol. If a miner publishes an invalid block, that block must either be badly formatted, or the state S[n] is

incorrect. Since S[0] is known to be correct, there must be some first state S[i] that is incorrect where S[i-1] is

correct. The verifying node would provide the index i, along with a "proof of invalidity" consisting of the subset

of Patricia tree nodes needed to process APPLY(S[i-1],TX[i]) -> S[i]. Nodes would be able to use those nodes to

run that part of the computation, and see that the S[i] generated does not match the S[i] provided.
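
The verification protocol described above can be sketched in a few lines. This is a toy model under loud assumptions, not the actual Ethereum implementation: the state is a plain dictionary of balances, `state_root` is a SHA-256 hash of the sorted state standing in for a Patricia tree root, and the "proof of invalidity" hands over the whole of S[i-1] where the real protocol would send only the Patricia nodes touched by TX[i]. All function names are hypothetical. The code uses 0-based lists, so `txs[k]` takes the state after `k` transactions to the state after `k + 1`.

```python
import hashlib
import json


def state_root(state):
    """Toy stand-in for a Patricia tree root: a hash of the sorted state."""
    encoded = json.dumps(sorted(state.items())).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_tx(state, tx):
    """APPLY(S, TX) -> S': move tx['value'] from sender to recipient."""
    sender, recipient, value = tx["from"], tx["to"], tx["value"]
    if state.get(sender, 0) < value:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - value
    new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state


def find_first_invalid(genesis, txs, claimed_roots):
    """Verifying-node side: return (i, pre_state) for the first transaction
    whose published intermediate root is wrong, or None if all roots match.

    claimed_roots[k] is the root the miner published after txs[k].
    """
    state = genesis
    for i, (tx, claimed) in enumerate(zip(txs, claimed_roots)):
        try:
            next_state = apply_tx(state, tx)
        except ValueError:
            return i, state  # the transaction cannot be applied at all
        if state_root(next_state) != claimed:
            return i, state
        state = next_state
    return None


def check_invalidity_proof(prev_root, pre_state, tx, claimed_root):
    """Light-node side: True if the proof shows that claimed_root is wrong."""
    if state_root(pre_state) != prev_root:
        return False  # proof does not start from the agreed previous state
    try:
        return state_root(apply_tx(pre_state, tx)) != claimed_root
    except ValueError:
        return True  # the transaction is invalid against the previous state
```

A light node only needs the previous root, the transaction, and the claimed root from the block itself; everything else comes from the proof. Because the check begins by matching the supplied pre-state against the already accepted root, a dishonest verifier cannot fabricate a proof against a valid block.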

Another, more sophisticated, attack would involve the malicious miners publishing incomplete blocks, so the

full information does not even exist to determine whether or not blocks are valid. The solution to this is a

challenge-response protocol: verification nodes issue "challenges" in the form of target transaction indices,

and upon receiving a challenge a light node treats the block as untrusted until another node, whether the miner or

another verifier, provides a subset of Patricia nodes as a proof of validity.
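
The bookkeeping a light node would need for this challenge-response protocol can be sketched as follows. This is an illustrative model only: the class `LightNodeBlockView`, its method names, and the `verify` callback are assumptions made for the example, and the actual proof check (validating a subset of Patricia nodes) is left to whatever function is passed in as `verify`.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class LightNodeBlockView:
    """One light node's view of a single block (hypothetical structure)."""
    block_hash: str
    open_challenges: Set[int] = field(default_factory=set)

    def on_challenge(self, tx_index: int) -> None:
        """A verification node challenges the transaction at tx_index."""
        self.open_challenges.add(tx_index)

    def on_response(self, tx_index: int, proof: object,
                    verify: Callable[[int, object], bool]) -> bool:
        """Any node (the miner or another verifier) may answer a challenge.
        The challenge is closed only if the supplied proof checks out."""
        if tx_index in self.open_challenges and verify(tx_index, proof):
            self.open_challenges.discard(tx_index)
            return True
        return False

    @property
    def trusted(self) -> bool:
        """The block is untrusted while any challenge remains unanswered."""
        return not self.open_challenges
```

In this sketch a miner who withholds part of a block cannot answer a challenge on the missing transactions, so the block stays untrusted for every light node that received the challenge.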
