Ethereum 1 dot X: a half-baked roadmap for mainnet improvements

I’m posting this without any prior review or draft feedback from other core devs. This is my personal perspective on what 1.x is about. All mistakes and misrepresentations are my fault.

Summary

Ethereum 1.x is a codename for a comprehensive set of upgrades to the Ethereum mainnet intended for near-term adoption. The 1.x set of improvements will introduce major, breaking changes to the mainnet, while 2.0 (aka Serenity) undergoes prototyping and development in parallel. The plan for 1.x encompasses three primary goals: (1) boost mainnet scalability by increasing the tx/s throughput, achieved with client optimizations that will enable raising the block gas limit substantially; (2) ensure that operating a full node will be sustainable by reducing and capping the disk space requirements with “storage rent”; (3) improve the developer experience with VM upgrades including EVM 1.5 and Ewasm.

Introduction

The “Ethereum 1.x” idea was born out of discussions among core devs during Devcon4. Prior to the discussions about 1.x, the roadmap for Ethereum 1.0 was minimal, with only relatively conservative changes having been proposed for mainnet hard forks (e.g. Byzantium and Constantinople). Before and after Byzantium (October 2017), Casper-FFG was being developed as a drastic mainnet change which would introduce hybrid PoW-PoS block rewards. By June 2018, Casper-FFG was deprecated, and PoS research efforts pivoted to development of a “beacon chain” which would be launched as a new chain separate from the Ethereum 1.0 mainnet. This pivot left 1.0 client developers disoriented. As the longer timeline for 2.0 became apparent, we began to ask: what do we do in the meantime with the mainnet?

One option for 1.0 client maintainers is to coast along with conservative, easy changes to the mainnet, and not to consider any major changes (leaving them as features slated for 2.0). An alternative option is to consider introducing drastic, breaking changes on the 1.0 mainnet, while separate teams focus on 2.0 R&D. This latter option is the 1.x plan.

Formulating a plan for 1.x

Before announcing the 1.x plan, some core devs wanted time to flesh out detailed proposals, and to gather concrete data to answer pertinent questions (such as, what is the immediate scalability boost we can expect after some easy client optimizations? 2x, 5x, or more?). But the desire for working groups to have an opportunity to coordinate draft EIPs in private before announcing the plan conflicts with the desire to openly discuss changes under consideration at the earliest possible stage with the broader community. So, although it would have been nice for core devs to announce a solid step-by-step plan for 1.x, it would also be nice to formulate a plan with open working groups in an inclusive and transparent process from the beginning.

The downside of an inclusive and transparent process from the beginning is that the initial presentation is only a half-baked plan. Because a half-baked plan cannot answer all the questions, this risks stirring up a confused narrative, with controversy and pushback from other devs and the community. As the 1.x plan will introduce breaking changes on the mainnet, it is expected to be controversial, so there is reluctance to broadcast a half-baked plan.

The ability of core devs to pursue a 1.x plan on an aggressive timeline is also uncertain. The improvements are both technically and politically ambitious and will take great effort to execute; the motivation to press forward could be sapped by early controversy and resistance. The easier option is to avoid controversy and reserve ambitious ideas for 2.0. Getting drastic changes adopted on the mainnet will be challenging.

The rest of this post will outline the three main goals of the 1.x plan. The first and second (scalability and sustainability) are arguably interrelated, while the third (VM upgrades) is independent.

1. Client optimizations for a scalability boost

The first goal is to boost transaction throughput on the mainnet. Transaction throughput is determined by the block gas limit, which is currently around 8 million. Miners vote with each block to either raise or reduce the block gas limit. If the gas limit is raised too high, then the network uncle rate increases as an unintended side effect. A high uncle rate is bad because it results in mining pool centralization (it is mainly small pools that suffer from high uncle rates, leaving them with lower revenues and unable to compete against larger mining pools with lower uncle rates). Thus, miners cannot naively raise the gas limit without sacrificing a diverse set of multiple competing mining pools.
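To make the voting mechanism concrete, here is a minimal sketch of how far a single block can move the gas limit. The bound divisor of 1024 and the floor of 5000 are my recollection of the protocol rule (the limit must stay strictly within parent/1024 of the parent's limit); they are not stated in this post, and the helper names are made up for illustration.

```python
# Assumption: constants as I recall them from the protocol spec, not from this post.
GAS_LIMIT_BOUND_DIVISOR = 1024
MIN_GAS_LIMIT = 5000


def next_gas_limit(parent_limit: int, target: int) -> int:
    """Move the gas limit as far toward `target` as a single block allows."""
    # The change must be strictly smaller than parent_limit // 1024.
    max_step = max(parent_limit // GAS_LIMIT_BOUND_DIVISOR - 1, 0)
    if target > parent_limit:
        new_limit = min(target, parent_limit + max_step)
    else:
        new_limit = max(target, parent_limit - max_step)
    return max(new_limit, MIN_GAS_LIMIT)


def blocks_to_reach(start: int, target: int) -> int:
    """Count blocks needed if every miner votes toward `target`."""
    limit, blocks = start, 0
    while limit != target:
        new_limit = next_gas_limit(limit, target)
        if new_limit == limit:
            raise ValueError("target is unreachable from this starting limit")
        limit = new_limit
        blocks += 1
    return blocks


if __name__ == "__main__":
    # Hypothetical scenario: all miners agree to double an 8 million limit.
    print(blocks_to_reach(8_000_000, 16_000_000))
```

The point of the sketch is that the vote itself is cheap and fast: if miners were unanimous, a doubling would take only several hundred blocks. What holds them back is the uncle rate, not the voting rule.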

The good news is that a client optimization has been recently discovered which is likely to enable a substantial increase to the block gas limit while maintaining a low uncle rate. The optimization is a fix to the way Parity relays blocks (discovered by Alexey Akhunov of turbo-geth fame). Currently Parity does full verification of block PoW and transaction processing, before relaying a block. The optimization is to only verify the PoW and then start relaying the block, while processing the transactions. This optimization might greatly reduce network uncle rates and could enable miners to raise the block gas limit substantially (note an alternative idea: rather than raising the block gas limit by 2x, computational opcodes could be repriced to 1/2).
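A rough way to see why this matters is a toy latency model, sketched below. It assumes a block travels along a simple chain of hops and that every hop has the same network delay, PoW check time, and transaction execution time. All of the numbers are invented for illustration; none of them are measurements.

```python
def propagation_delay(hops: int, net_delay: float, pow_check: float,
                      tx_exec: float, relay_after_pow_only: bool) -> float:
    """Seconds until a node `hops` away holds a fully verified block."""
    if relay_after_pow_only:
        # Each hop forwards right after the PoW check and executes the
        # transactions in parallel, so execution is only paid once on the path.
        return hops * (net_delay + pow_check) + tx_exec
    # Each hop fully verifies (PoW + transactions) before forwarding.
    return hops * (net_delay + pow_check + tx_exec)


if __name__ == "__main__":
    # Hypothetical values: 6 hops, 100 ms per hop, 10 ms PoW check.
    for tx_exec in (0.2, 0.4):  # doubling the gas limit roughly doubles this
        slow = propagation_delay(6, 0.1, 0.01, tx_exec, False)
        fast = propagation_delay(6, 0.1, 0.01, tx_exec, True)
        print(f"tx_exec={tx_exec}s  full-verify={slow:.2f}s  pow-only={fast:.2f}s")
```

In this model, the execution time is multiplied by the hop count when every node verifies fully before relaying, but it is added only once when nodes relay after the PoW check. That is why heavier blocks would hurt propagation much less after the fix, at least under these simplified assumptions.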

How much can we raise the block gas limit with this optimization? We don’t know yet, and we don’t want to get excited prematurely. Core devs are hoping to study this question with network simulations and more data collection, but the answer depends on complex factors which have been understudied (network topology and propagation delays between full nodes). Aside from this one fix, there are further “low-hanging” optimizations to block relaying that could also be done.

Beyond low-hanging optimizations, more drastic changes for mainnet throughput increases are also being studied. One approach is parallel transaction processing, picking up where an old EIP left off. Another approach to achieving a big scalability boost on the mainnet, mentioned long ago in the Sharding FAQ, is a change to the PoW protocol: “Bitcoin-NG’s design can … increase the scalability of transaction capacity by a constant factor of perhaps 5-50x… [the approach] is not mutually exclusive with sharding, and the two can certainly be implemented at the same time.”

So there are easy optimizations that might yield an immediate (totally wild guess, 2x-5x) throughput boost on the mainnet. And with more comprehensive protocol changes, maybe a 50x boost on the mainnet (not my number! it’s in the sharding FAQ) could be achieved.

But, a 2x-5x boost in throughput would make the current problems with mainnet 2x-5x worse. The biggest problem is growth in disk space, and if we’re going to boost the mainnet throughput then the disk space problem must be solved first.

2. Reducing the disk space for a sustainable network

A long-term solution for reducing disk space, i.e. storage rent, is the most controversial part of the 1.x plan. There is much debate and differing opinions on how necessary this is. On one end of the opinion spectrum, some 1.0 client maintainers believe that the state size is already growing too fast, and that even without any boost in throughput, a drastic change needs to be proposed and adopted. These core devs argue that at best, the current Ethereum mainnet can sustain growth for three more years. If some drastic breaking changes are not made before then to reduce the disk space burden, then Ethereum as we know it will not survive.

At the other end of the spectrum are researchers whose efforts are focused on scaling Ethereum by launching 2.0 as soon as possible. They argue that new hard drives can accommodate the current rate of state growth on the 1.0 mainnet, until 2.0 is launched and users migrate from 1.0 contracts to new contracts on 2.0. They also argue that introducing breaking changes on the mainnet would violate the behavioral expectations that users have about contracts deployed on 1.0, and that the 1.0 network would work just fine with a state size of 70 gigs in three years (the current state size is around 7 gigs, last I checked). Furthermore, introducing a rent mechanism on Ethereum 1.0 could be confusing to users, as it will likely be different from the rent mechanism introduced on 2.0.
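To put the 7-to-70-gig figure in perspective, the back-of-the-envelope arithmetic below works out what it implies. Two things here are my own assumptions rather than anything claimed by either camp: that growth is linear over the three years, and that a throughput boost scales state growth by the same factor (which is the "2x-5x worse" intuition from the previous section).

```python
def projected_state_gb(current_gb: float, target_gb: float, years: float,
                       boost: float = 1.0) -> float:
    """State size after `years`, assuming linear growth scaled by `boost`."""
    yearly_growth = (target_gb - current_gb) / years
    return current_gb + yearly_growth * boost * years


def implied_annual_factor(current_gb: float, target_gb: float, years: float) -> float:
    """Yearly multiplier if the same growth were compounding instead of linear."""
    return (target_gb / current_gb) ** (1 / years)


if __name__ == "__main__":
    for boost in (1, 2, 5):
        size = projected_state_gb(7, 70, 3, boost)
        print(f"{boost}x throughput -> about {size:.0f} gigs after three years")
    print(f"compounding view: x{implied_annual_factor(7, 70, 3):.2f} per year")
```

Under these assumptions the baseline is 21 gigs of new state per year, and a 2x or 5x boost would land at roughly 133 or 322 gigs after three years. These are illustrative projections, not forecasts.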

An alternative to storage rent is stateless clients, but for stateless clients to be practical the state trie format would need to be changed to a format optimized for the stateless paradigm (i.e., clients would need to switch from the current hexary trie to a binary or sparse trie). Discussions among core devs lean toward the opinion that switching from stateful to stateless would be a huge change to 1.0 clients and much more complicated to implement. The simpler path, which can be achieved on a more aggressive timeline, is to keep the current stateful hexary patricia trie and add on storage rent.

The good news is that there are some easy, non-controversial changes that can be adopted immediately to reduce required disk space. These changes were proposed by Péter Szilágyi (of go-ethereum fame) as the first two of a three-point plan to reduce disk space (in brief: 1. delete past blocks 2. delete past logs 3. delete state, i.e. storage rent). Currently a geth node sync’d to the chain downloads over 100 GB of data, but most of that data is past blocks and past logs. The actual account state is only a fraction of that total data. To be clear, past blocks and past logs would of course continue to be stored somewhere and be widely available, but they would not be stored by common full nodes which dominate the network. Full nodes would instead only store some recent history of blocks and logs, perhaps several months or so of data.

The two easy changes (delete past blocks and delete past logs) would only break some dapps that expect a full node to index and query all past log events. These dapps would stop working with mere full nodes (instead they would require the user to run a more space-intensive archive style node, or to query a log indexing service). Sync’ing for the majority of users would become fast and painless (like in the early days, when the Ethereum mainnet was young and lightweight). But it would be a temporary fix, and sync’ing would gradually become slow and heavy again, as the account state grows and grows.
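Here is a small sketch of what that behavior change could look like from a dapp's point of view. The class, the exception, and the retention window are all hypothetical; no client exposes this interface, and the real retention period is still an open question.

```python
class HistoryPruned(Exception):
    """Raised when a query reaches back past the retained history."""


class PrunedNode:
    """Toy full node that keeps logs only for the most recent blocks."""

    def __init__(self, retention: int):
        self.retention = retention  # number of recent blocks to keep (hypothetical)
        self.head = -1              # no blocks imported yet
        self.logs = {}              # block number -> list of log entries

    @property
    def oldest_retained(self) -> int:
        return max(0, self.head - self.retention + 1)

    def add_block(self, number: int, logs: list) -> None:
        self.head = number
        self.logs[number] = logs
        # Drop everything that fell out of the retention window.
        for old in [n for n in self.logs if n < self.oldest_retained]:
            del self.logs[old]

    def get_logs(self, from_block: int, to_block: int) -> list:
        if from_block < self.oldest_retained:
            raise HistoryPruned(
                f"logs before block {self.oldest_retained} are not stored here; "
                "use an archive node or a log indexing service"
            )
        return [entry
                for n in range(from_block, to_block + 1)
                for entry in self.logs.get(n, [])]


if __name__ == "__main__":
    node = PrunedNode(retention=3)
    for n in range(10):
        node.add_block(n, [f"event@{n}"])
    print(node.get_logs(7, 9))  # recent range still works
    try:
        node.get_logs(0, 9)     # a dapp scanning from genesis fails
    except HistoryPruned as err:
        print("query failed:", err)
```

A dapp that only looks at recent events keeps working unchanged; a dapp that scans from genesis has to fall back to some other data source.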

The solution to a growing account state is storage rent.

Potential storage rent proposals differ in their friendliness to users and in their implementation complexity (i.e. friendliness to core devs). The simplest implementations are not friendly to users. For instance, it is much simpler to implement a rent mechanism that simply deletes accounts which do not pay rent, and does not offer users any way to un-delete or “resurrect” their accounts. In contrast, a rent mechanism where users can later resurrect accounts that didn’t pay rent is friendlier to users, but more complex to implement.
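The difference between the two styles can be sketched with a toy model. To be clear, this is not any actual rent proposal: the flat per-slot fee, the data structures, and the idea of keeping a hash stub so that an evicted account can be restored later are all assumptions I'm making for illustration.

```python
import hashlib
from dataclasses import dataclass, field

RENT_PER_SLOT = 1  # hypothetical flat fee per storage slot per rent period


@dataclass
class Account:
    balance: int
    storage: dict  # slot -> value


@dataclass
class State:
    accounts: dict = field(default_factory=dict)  # address -> Account
    stubs: dict = field(default_factory=dict)     # address -> hash of evicted account


def commitment(account: Account) -> str:
    """Hash that stands in for an evicted account's full contents."""
    data = repr((account.balance, sorted(account.storage.items()))).encode()
    return hashlib.sha256(data).hexdigest()


def charge_rent(state: State, allow_resurrection: bool) -> None:
    """Charge one rent period; evict accounts that cannot pay."""
    for addr in list(state.accounts):
        acct = state.accounts[addr]
        fee = RENT_PER_SLOT * len(acct.storage)
        if acct.balance >= fee:
            acct.balance -= fee
            continue
        if allow_resurrection:
            # Friendlier variant: keep a small stub so the account can come back.
            state.stubs[addr] = commitment(acct)
        # Simple variant: the account is just gone.
        del state.accounts[addr]


def resurrect(state: State, addr: str, claimed: Account, top_up: int) -> bool:
    """Restore an evicted account if the supplied contents match the stub."""
    if state.stubs.get(addr) != commitment(claimed):
        return False
    del state.stubs[addr]
    claimed.balance += top_up
    state.accounts[addr] = claimed
    return True


if __name__ == "__main__":
    state = State()
    state.accounts["a"] = Account(balance=5, storage={0: 1, 1: 2})
    state.accounts["b"] = Account(balance=0, storage={0: 9})
    charge_rent(state, allow_resurrection=True)
    print(sorted(state.accounts), sorted(state.stubs))
    print(resurrect(state, "b", Account(balance=0, storage={0: 9}), top_up=10))
```

Even in this toy, the resurrection path needs extra state (the stubs), a proof check, and someone outside the node to hold on to the evicted data, which hints at why the friendlier variants are harder to implement.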

Another issue with rent is the incentive problem around contracts with multiple users. For instance, a token contract has many users who hold tokens, but a simple rent mechanism would require the contract to pay a rent fee. No single user is incentivized to pay rent for the contract; rather, each token holder’s incentive is to let some other user pay the contract’s rent fee. Solving this incentive problem would require a major change to the ownership model around contract storage.

The storage rent proposal is the hardest part of the 1.x plan. It is politically controversial, as it will introduce breaking changes to the mainnet with new modes for user and developer experience. And it is technically complex to implement, especially to provide a user-friendly mechanism. The goal is to flesh out a detailed proposal that as many people as possible will be satisfied with (from core developers, to dapp developers, to dapp users).

3. Improved developer experience with VM upgrades

The third goal, upgrading the VM, is fairly independent of the first two. One proposal for upgrading the EVM is EIP 615. This EIP is also known as “EVM 1.5” because it was proposed as a near-term improvement to the EVM, an interim step ahead of the longer-term move toward Ewasm (aka “EVM 2.0”).

Ewasm was originally designed to be backwards-compatible with the EVM (i.e. so that Ewasm contracts could interoperate with EVM contracts), for adoption on the mainnet. Later, Casper-FFG was deprecated and the PoS roadmap pivoted to Ethereum 2.0 phases, with a beacon chain in Phase 0/1, and an execution engine based on Ewasm was proposed for Phase 2. But as the execution engine on 2.0 would be on a separate chain rather than 1.0 main chain, there is no need for the 2.0 Ewasm to be backwards-compatible with EVM. This means the “Ewasm 2.0” design is an open question, and could differ substantially from “Ewasm 1.0” (i.e. the current Ewasm design which is backwards-compatible with EVM).

The 1.x plan for Ewasm means pursuing the original goal: mainnet adoption of the backwards-compatible Ewasm version alongside EVM. A multi-step roadmap for introducing Ewasm on the mainnet will be detailed in proposals to come.

Conclusion: No 1.x roadmap yet

The above plan for 1.x is a half-baked outline. At this time, the pertinent questions cannot be answered. Studies need to be performed to gather data on the potential degree of mainnet scalability improvements in the near-term. And the breaking changes required to make operation of full nodes sustainable in the mid and long-term need to be written up and published as detailed proposals for community consideration.