Looking back at TREZOR’s Bitcoin Cash Integration

What went wrong and what went right?

When we first promised to develop BCH integration in TREZOR Wallet, we never imagined that it would be such a difficult, stressful and lengthy process. After all, Bitcoin Cash was supposed to be just a fork of Bitcoin with several changes. We followed these modifications and implemented all documented changes: we released a new version of TREZOR firmware, added the new currency to TREZOR Wallet and deployed a version of the Bitcore server, our backend, tailored for BCH. All of this was done in a very short time, since we had less than a week to work on this rushed project. We hoped that everything would go fine; after all, we followed the documentation. It did not.

In this blog post, we would like to explain what we did not manage to say during the past few days. We would like to investigate what went wrong, why things did not work and what we could do better in the future.

Decision to develop BCH integration

The decision to support BCH in the first place was straightforward, though perhaps it came about a bit too quickly. We promised that if BCH implemented the necessary two-way replay protection, we would add support for the currency to TREZOR. Our motivation was to convince BCH developers that replay protection was in the interest of both BTC and BCH, and that if they implemented it, they would gain another wallet in their ecosystem. In the end, they did make replay protection obligatory, so we had to honor our word.

Firmware development

The first part, new firmware development, went through without a glitch. BCH support was added, along with other features that were in the pipeline. (They deserve a separate article and will get their coverage when the firmware is released in TREZOR Wallet.) Firmware was well tested and cleared for release. In short, there were no problems with firmware version 1.5.1.

We would be remiss if we did not mention that this would not have happened so quickly without the help of Jochen and Saleem.

BCH Bitcore development

The Bitcore development is a completely different story, though at first it did not look like it would become a nightmare. But first, let’s get to know what we are dealing with here:

Bitcore (do not confuse with Bitcoin Core) is a Bitcoin server with additional services. Insight is a blockchain explorer, usually installed together with Bitcore. We use Bitcore as a backend for all cryptocurrencies supported in TREZOR Wallet.

BitcoinABC is a full node implementation of the BCH standard.

A part of the Bitcore architecture is a patchset to Bitcoin Core code. In other words, in order to run Bitcore on top of any coin, we need to change the source code of the program, which implements the desired coin, to add transaction search by address. In this case we needed to modify BitcoinABC.

As BitcoinABC itself is a fork of Bitcoin Core, this should have been easy. However, BitcoinABC made many changes to their code, including style changes. This makes comparing code and implementing our support a very time-intensive task, as code is changed on nearly every line.

BitcoinABC does not have a public testnet.

The last release of BitcoinABC came out only a few days before the fork date, leaving us no opportunity for extensive testing.

While there were issues with the code at first, as it was completely reformatted, with hanchon’s help we managed to get Bitcore running with BitcoinABC. The BCH Bitcore ran and synchronized without issues. Git tests passed, the node connected to the network, everything testable worked.

The node answered normally before the fork, which some of you saw, as you noticed our BCH Claim Tool going online hours before the fork. It seemed like there would be no problem on August 1st.

On the day of the fork, however, we started noticing odd things happening. During the maintenance of the BCH Bitcore server, we witnessed how the server started losing addresses and transactions. Addresses had negative balances, which should not be technically possible. This caused the initial delay in deployment, as we could not release a Wallet that would use corrupted data.

During this time we decided to release only the BCH Claim Tool, allowing you to claim your coins to an external wallet. The Claim Tool also depends on the Bitcore server, but it worked well for sending transactions because, as a workaround, it derived the BCH balance from the current BTC balance.

While we were debugging the problem, we noticed that our Bitcore server was not the only Insight instance where transaction details started disappearing. It happened to all servers running on Bitcore, like Blockdozer.

The bug

Why were transactions disappearing from our Bitcore? We found the bug after a few hours of debugging.

When the Bitcore server restarted, it deleted the address index for some seemingly random blocks. However, when a specific affected block was manually invalidated and reindexed, its index was correct, which made the bug look non-deterministic.

Investigating the bug

Every bitcoin node does a “sanity check” of its block database on every start. During this process, several blocks are disconnected from the blockchain in memory and then reconnected, in order to check the state and compare it. As this happens only in memory, it has no side-effect even if the blocks are not reconnected. Bitcore patches change these functions, adding another process: reading and saving addresses to a separate index.

However, because BitcoinABC back-ported some changes from Bitcoin Core’s master branch, the logic of a method used in this process changed subtly. The method no longer took into account whether it was called during the “initial check”, which should run in memory, or during a “standard run”, which writes to disk. Instead, it ran the on-disk variant on every restart and removed the address information for the last six blocks. This left the database inconsistent: some transactions disappeared and some addresses showed negative balances.
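To make the failure mode easier to picture, here is a toy Python model of it. It is only a sketch under our own simplifying assumptions, not the actual Bitcore or BitcoinABC code: the names `ToyNode`, `check_only` and `simulate` are hypothetical, and the address index is reduced to a list of (address, height, delta) rows. It shows one plausible way in which deleting index rows during a startup check that was meant to be in-memory only can later produce a negative balance.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    height: int
    deltas: dict  # address -> net balance change in this block


@dataclass
class ToyNode:
    buggy: bool
    chain: list = field(default_factory=list)
    # Persistent "on-disk" address index: (address, height, delta) rows.
    index: list = field(default_factory=list)

    def connect(self, block, check_only=False):
        if check_only:
            return  # startup check: nothing is persisted
        self.index += [(a, block.height, d) for a, d in block.deltas.items()]

    def disconnect(self, block, check_only=False):
        if check_only and not self.buggy:
            return  # intended behaviour: the check stays in memory
        # Buggy variant: the flag is ignored and index rows are deleted.
        self.index = [row for row in self.index if row[1] != block.height]

    def add_block(self, block):
        self.chain.append(block)
        self.connect(block)

    def restart(self, depth=6):
        # Sanity check: disconnect the last `depth` blocks, then reconnect.
        tail = self.chain[-depth:]
        for block in reversed(tail):
            self.disconnect(block, check_only=True)
        for block in tail:
            self.connect(block, check_only=True)

    def balance(self, address):
        return sum(d for a, _, d in self.index if a == address)


def simulate(buggy):
    node = ToyNode(buggy=buggy)
    for height in range(1, 11):
        # "alice" receives 10 units in block 8; other blocks are empty.
        node.add_block(Block(height, {"alice": 10} if height == 8 else {}))
    node.restart()  # the check covers blocks 5..10
    node.add_block(Block(11, {"alice": -10}))  # alice spends the 10 units
    return node.balance("alice")


print(simulate(buggy=False))  # 0
print(simulate(buggy=True))   # -10
```

In the buggy run, the restart silently drops the row that recorded the incoming payment, while the later spend is still indexed, so the sum of the remaining rows goes below zero.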

Fixing the bug

Apart from compulsory code rewriting and compiling, we had to invalidate all known bad blocks in our blockchain. During the fixing time, we had to take down the Claim Tool as well, as the backend was taken offline. (This was on August 2nd.)

Unfortunately, this process corrupted the entire database for BCH. As Bitcoin’s chain was already too far ahead to use, and a reorganization would be rejected by the server, we had to copy a database and reindex everything from the beginning.

However, the reindexing process did not manage to finish either, as it slowed to a crawl. While the initial ETA was 5–7 hours for the entire operation, after 12 hours of reindexing the updated estimate predicted that another 12 hours would be needed. We couldn’t wait another 12 hours. In the end, we copied indexes from our bitcoin node, invalidated all blocks back to the fork block, and started BitcoinABC.

Moreover, to make sure we have a backup of the database in case of another corruption, we copied 300 GB of data across servers. This way, if the backend hits a bug again, we will be able to restore it more quickly. After making a backup, we fired up the backend and tested a few transactions. As they went through, we published the news.

After running into problem after problem, we finally got BCH working in TREZOR Wallet on August 3rd at 20:30 UTC.

Lessons Learned

It is worth noting that we did not write this up to point fingers at anyone. Nobody is at fault for backporting code from a newer version of Bitcoin Core. The reason things went wrong was not faulty code but the lack of time to test applications and to discover and fix mistakes. Had more time been available, we would have caught this bug during testing.

There are many lessons to take from this hard fork. We will name the most important ones here: