CPUs, GPUs, ASICs?

In January of 2018, the whitepaper for X16R was published with the suggestion that it was an ASIC resistant algorithm. It has been about a year and a half, and although the hash rate and profitability do not suggest there are ASICs on the Ravencoin network, there is some evidence that there might be an ASIC being built, or possibly deployed. I’ve been told privately that FusionSilicon tried to create an ASIC and gave up. There’s also a paper suggesting that one mining pool has a high hash rate.

So, now what?

Well, here are some of my thoughts… Yes, I know you didn’t ask for my thoughts.

An ASIC isn’t evil but as I stated in the original whitepaper, “The unfortunate side-effect of this transition to ASIC hardware is the centralization of mining.” This is made even truer by the fact that we’ve suggested we don’t want ASICs on the Ravencoin network, so they will likely be running on a hidden pool, and not be sold to everyone.

An ASIC is just a chip that is optimized for one specific purpose. This is right in the ASIC name as it stands for Application Specific Integrated Circuit. The specific application is mining with a particular algorithm like x16r, and an Integrated Circuit is a microchip. ASICs can be faster because they do one thing well, and can ignore everything else. An analogy is an Olympic marathon runner whose body composition is built for speed and distance only vs. a decathlon athlete that might be built differently to be able to do pretty well on ten different athletic skills. The marathon runner might not be able to shot put very well because the light, lithe, lean frame needed to win a marathon isn’t well suited for shot put. Those types of tradeoffs exist when making a fast mining chip. Therefore a mining algorithm that forces a chip to do many varied tasks is considered ASIC-resistant, and this runs on a continuum. So what if you combined marathon running with weight lifting with Sudoku solving? It gets harder to optimize for one without sacrificing something for the other tasks. The muscle-bound weight lifter has trouble quickly running 26.22 miles, while the light, lithe marathon frame has trouble lifting hundreds of lbs. Throw in the additional uncorrelated task of solving Sudoku, and a custom athlete is trickier to find/build.

That is the premise of ASIC-hard algos. There are several of these:

RandomX — A leading candidate for the next Monero hard fork.

CNv4 (CryptoNight R) aka CN-V9 — The current Monero algo. Favors CPUs. GPUs possible. ASICs difficult.

ProgPOW — An algorithm designed to favor GPUs over ASICs

Is centralization bad? I would argue that yes, in the context of a crypto-currency, you do not want centralization. If a government or small collection of governments can shut down the mining, or request that certain transactions not be allowed, then a lot of the benefits of a crypto-currency, or crypto-asset platform have been lost. If, on the other hand, CPUs everywhere are mining, then there is no way to make such a request.

I do not think ASICs are dominating the Ravencoin network. I believe this because I’ve done back-of-a-napkin math to evaluate the profitability of a 1080 Ti mining rig we have at the office. At the time of the calculation a few weeks ago, it was making about $300 and costing about $65 in electricity. We do not know how much faster an ASIC can be than a GPU for x16r. But if there were a lot of ASICs that run thousands of times faster, then GPUs would be wildly unprofitable, and that isn’t the case. Our rig is still turning a profit.

The Ravencoin Whitepaper said, “Create a platform like Bitcoin with a new mining algorithm, x16r, intended to prevent immediate dominance by mining pools, and future dominance by ASIC mining equipment.” x16r has worked well so far for both goals, but the second goal of preventing future dominance by ASIC mining equipment is in question.

So, it seems like we should at least address the issue of potential ASICs.

Let me state up front that there are some opposing an algorithm change. I do not believe that these individuals have an ASIC, and therefore a hidden agenda. And they make a valid point that changing mining algorithms can be risky. During the transition, the mining hash rate can drop. If the existing miners don’t follow, or additional miners aren’t added to make up the difference, then the security of Ravencoin is reduced. There is also the valid argument that a contentious hard fork may lead to chain splits. Ironically, the folks leading the charge to keep the algorithm the same are the ones most likely to make any hard fork a contentious one. This older article about the Monero hard fork describes the outcome of at least one attempt to prevent ASICs: https://medium.com/@ecurrencyhodler/was-moneros-pow-change-a-success-81cfeaa08aae The final outcome was not terrible but led to several new coins like Monero Original and Monero Classic. Monero, however, does not have the additional consideration of assets on its chain, and therefore has a lower bar to clear for a successful algorithm change.

So let’s explore the options.

Nothing

One option is to do nothing. The x16r algorithm is working. There isn’t clear and convincing evidence that ASICs are on the network yet.

This is not a proactive approach, and once there are ASICs on the network, it will require more of a non-ASIC community effort to change the algorithm because simple BIP9 voting will not work if the network is already dominated by those that want to keep the status quo.

Minimal

One option is to make a minor change that would be easy for GPU mining software to adopt. It would likely be one line. Perhaps an XOR with a specific 512-bit value after one of the algorithms. This would obsolete any custom ASIC chips that are already manufactured, whether on-network or not, but would not have much effect beyond that. ASICs that aren’t already manufactured could easily adapt to the change.
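To make the idea concrete, here is a minimal sketch of XORing a fixed value into the output of one hashing stage. It assumes the stage produces a 512-bit (64-byte) digest, as the stages chained in x16r do. SHA-512 from the Python standard library is only a stand-in for "one of the algorithms", and XOR_MASK is a made-up illustrative constant, not a value anyone has proposed for the network.

```python
import hashlib

# Hypothetical constant: any fixed, publicly agreed 64-byte (512-bit) value would do.
XOR_MASK = hashlib.sha512(b"illustrative mask, not a real network constant").digest()


def stage(data: bytes) -> bytes:
    """Stand-in for one 512-bit hashing stage of the chain (SHA-512 here)."""
    return hashlib.sha512(data).digest()


def stage_with_xor(data: bytes) -> bytes:
    """Same stage, followed by an XOR of the digest with the fixed mask."""
    digest = stage(data)
    return bytes(a ^ b for a, b in zip(digest, XOR_MASK))


header = b"example block header"
plain = stage(header)
tweaked = stage_with_xor(header)
```

For software miners this is a one-line change after the chosen stage. A chip with a hard-wired data path between stages has no place to apply the mask, which is the whole point of the tweak.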

Medium

Another option is to switch from x16r to x21s. x21s was created by one of the forks of Ravencoin. It has two differences from x16r. First, it uses every algorithm every time. As a refresher, x16r uses random algorithms based on the nibbles (half-a-byte) from the previous block hash. The nibbles, which are just random numbers between 0 and 15, determine the order of the algorithms. If a nibble (number) occurs twice, then the algo will be used twice. The ‘s’ part of x21s prevents an algo from being used twice. Second, the 21 part of x21s adds an additional 5 algos to x16r. These algos are slotted evenly in between the 16 algos, whose order is set by the x16r nibble selection after duplicates are removed and all 16 have been used. The advantage of ‘s’ over ‘r’ is that it is easier to tune GPUs because there is less possible variance in load, which has been known to cause power supply issues. It has one other big advantage which we’ll cover later. It may be a bit harder to create an ASIC for x21s, but only because it has 5 more algos to worry about. But it may be easier to make an ASIC because of the power stability that the ‘s’ (all algos) brings. One person contacted me privately to suggest that ‘s’ makes ASICs easier, and another independently suggested that it makes them harder. I can’t see how ‘s’ would make it harder; only varying power fluctuations might make it harder under ‘r’. Neither provided evidence or a strong argument for their assertion.
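The difference between ‘r’ and ‘s’ ordering can be sketched in a few lines. This is only an illustration of the description above, not the actual x16r or x21s code: the algorithm list is the commonly cited x16r order, reading the last 16 hex digits of the hash is an assumption, and the de-duplication rule in order_s (first occurrence wins, unused algos appended in index order) is my own simplification. The 5 extra algos of x21s are left out.

```python
from typing import List

# Commonly cited x16r algorithm order, indexed by nibble value 0..15 (assumption).
ALGOS = [
    "blake", "bmw", "groestl", "jh", "keccak", "skein", "luffa", "cubehash",
    "shavite", "simd", "echo", "hamsi", "fugue", "shabal", "whirlpool", "sha512",
]


def nibbles_from_hash(prev_hash_hex: str, count: int = 16) -> List[int]:
    """Return the last `count` hex digits of the hash as integers 0..15."""
    return [int(ch, 16) for ch in prev_hash_hex[-count:]]


def order_r(nibbles: List[int]) -> List[str]:
    """'r' ordering: each nibble picks an algo, so repeats are allowed."""
    return [ALGOS[n] for n in nibbles]


def order_s(nibbles: List[int]) -> List[str]:
    """'s' ordering sketch: drop repeated nibbles, then append unused algos."""
    seen: List[int] = []
    for n in nibbles:
        if n not in seen:
            seen.append(n)
    remaining = [i for i in range(len(ALGOS)) if i not in seen]
    return [ALGOS[i] for i in seen + remaining]


prev_hash = "0" * 48 + "44f0a1b2c3d4e5f4"  # made-up previous block hash
nibbles = nibbles_from_hash(prev_hash)
r_chain = order_r(nibbles)  # keccak appears several times, some algos never
s_chain = order_s(nibbles)  # every algo appears exactly once
```

With ‘r’, the amount of work per hash varies from block to block depending on which algos repeat. With ‘s’, the same 16 algos always run and only their order changes.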

Full

Another option is to make x17r or an x16rV2 with CNv4 as one of the algorithms. This is a non-starter. If you look at the x16r paper, you’ll find a graph of the relative speeds of the sixteen algorithms. One of the reasons we measured the relative speeds was to ensure that none were so out-of-line that it would make financial sense to wait for blocks that didn’t include certain algos. If you included CNv4 in the rotation of x17r, or even a modified x16rV2 where an algo was swapped out, it would open up other attack vectors to wait until there were 0 instances of CNv4 in the random selection. The reason for that is that CNv4 is up to 100,000 times slower than some other hashing algorithms. The existing algos in x16r are within an order of magnitude of each other.

CNv4, also known as CryptoNight R, is the current mining algorithm for Monero and does not seem to have ASICs available for it. It is, by itself, an ASIC-resistant algorithm that also closes the gap between the performance of CPUs and GPUs. For example, a Ryzen Threadripper CPU can outperform a 1080 Ti by about 2:1. There are GPUs that are faster than CPUs, and CPUs that are faster than GPUs in hashing CNv4, so this algorithm puts CPUs and GPUs in the same ballpark.

Another option is to switch from x16r to x16r+CNv4. This simply chains the existing algorithm together with CNv4. This isn’t much different than just switching to CNv4 because the x16r algo can be pre-computed much faster than the CNv4 algo can run on any hardware. Even a mediocre CPU could feed the fastest CNv4 GPU with the output of the x16r calculation. The main advantage of keeping x16r (or something) in addition to CNv4 is to prevent sloshing of pure CNv4 hash power from Monero to Ravencoin when Ravencoin is more profitable. By having a separate algo, any custom CNv4 only hardware wouldn’t be able to mine Ravencoin. And it keeps the continuity of x16r branding and makes it harder to make a CPU-less ASIC and FPGA because they’d need all the algos of x16r and the CNv4. But, since CNv4 is so slow relative to x16r, it doesn’t add much over just CNv4.

Another option is x21s + CNv4. This is easy to build. We already have the code available from another project for x21s. The last step of x16r is to take 256 bits of the 512-bit hash, and since the output of CNv4 is a 256-bit hash, that step could be replaced with CNv4. This has similar issues as x16r + CNv4 in that the x21s part can easily be pre-computed to feed the much slower CNv4.

Another option is to switch from x16r to x22s by including CNv4 in the mix. This seems like a solid option. The importance of ‘s’ where all algorithms are used every time negates the attack vector of waiting until there are no CNv4 in the random list of algos. It makes FPGAs difficult, but not impossible, just because of the sheer number of algorithms that are included. Because CNv4 is randomly positioned in the randomly ordered algorithms, it makes it trickier to pre-compute the feed for CNv4, although it can be done. But it would be a different sequence of algos, and a different number of algos feeding CNv4 each block, so the optimization may not be worth the effort.

Advantages of x22s (w/CNv4)

Some future-proofing against ASIC, FPGA, GPU dominance.

Brings CPUs back into the fold.

Better decentralization.

A relatively straightforward change to make.

Both x21s and CNv4 have been tested in the field.

Benchmarks for CNv4 available.

Disadvantages

Risky change to make.

Harsh 1-time difficulty adjustment.

Lower perceived security with low hash rate number because of more work per hash.

CPU botnets possible/likely.

Other Considerations

RandomX is currently being vetted as a replacement for CNv4 for Monero. Once this is battle-tested it may be a good fit. Plug this in for CNv4 in the discussions above if the timing of a change allows Monero to be the test bed and if it lives up to expectations. I personally would not be comfortable getting in front of Monero on these complex algorithms. I would want several months of hackers trying to exploit weaknesses to gain economic mining advantage on XMR before subjecting RVN to that technical risk.

Transitioning

Transitioning from x16r to another algo is tricky without opening up vulnerabilities (51% attack vectors), introducing a new coin like Ravencoin Original (managed split), or potentially splitting the chain (unmanaged split). There are a few options available:

BIP9

Voting in the new algo is the most democratic and therefore the preferred method to activate a change. Using the same mechanism that Ravencoin used to activate assets, we would set a threshold (70%, 80%, 90%?) to activate. The higher the threshold, the safer the transition, and the harder it is to attain the threshold.
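The threshold tradeoff can be illustrated with a small sketch. It only models the counting step: given which blocks in a voting window signaled for the change, does the share meet the threshold? The 2016-block window is the default from Bitcoin's BIP9 and is an assumption here, as are the function name and the example numbers. Ravencoin's actual parameters may differ.

```python
from typing import Sequence


def reaches_threshold(signals: Sequence[bool], threshold: float) -> bool:
    """True if the share of signaling blocks in the window meets the threshold."""
    if not signals:
        return False
    return sum(signals) / len(signals) >= threshold


# Made-up window: 1700 of 2016 blocks signal for the new algo (about 84%).
window = [True] * 1700 + [False] * 316

passes_70 = reaches_threshold(window, 0.70)
passes_80 = reaches_threshold(window, 0.80)
passes_90 = reaches_threshold(window, 0.90)
```

In this made-up window the change would activate at a 70% or 80% threshold but not at 90%, which is the safety-versus-attainability tradeoff described above.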

If the known pools and miners reject it, then it may not be the correct route to change algorithms. If the hidden (non-transparent) pools and miners reject it while the known pools accept it, then it becomes a more challenging task and more assessment is needed to determine if the reason for the non-transparency is potentially malicious.

Block Number

Setting a block number forces the transition, but is very risky as it’s possible that half adopt the software with the block number transition, and half don’t. That leads to a chain split without a mechanism for resolving the split. What happens when Bittrex chooses not to upgrade, and Binance switches? Now there are two different RVN on two different chains. This is a worst case scenario, but certainly worth thinking about.

Multi-Algo Transition

One option for transitioning from one algo to another, once the trigger has happened (BIP9 or Block Number) is to allow either algo to solve a block. This provides a smoother transition for miners and prevents a sharp drop in hash power leaving the coin temporarily vulnerable to a 51% attack. It’s worth stating again that this doesn’t solve the BIP9 or Block Number transition, but only the smooth transition from x16r to a new algo without a sharp discontinuous drop in hash power once the majority is on-board.

Risks

Any hard fork transition carries risk. The risk goes up considerably if everyone isn’t on the same page, and it seems like there are lots of strong opinions out there on this topic. It only takes one person to fork the code, keep mining, and spawn a Ravencoin Original. If Monero’s story is any indicator, those coins will fail or fade into irrelevance, but it confuses the message and can split the community.

Update 2019–07–18

After this post was released, it spawned some great analysis and conversation by some very informed people. Out of that conversation, and a couple of private conversations, it appears that the ‘r’ part of x16r is pretty successful at making ASICs more difficult.

More algos, combined with random algo selection (allowing duplicates), is better for ASIC resistance than more algos in sequence (X11), or more algos with no duplicates (x16s). More algos helps ASIC resistance, so x17r would be better than x16r. I’ve been told that the die size required is proportional to the square of the number of algos when using ‘r’. As it has been explained to me, this causes the cost of ASIC manufacturing to increase quadratically, not just linearly, as algos are added.

The reason for this update is that with new information, the optimum path looks different to me. Now, x22r looks better than x22s.

One leading option is x22rc with CNv4 as one of the guaranteed-to-be-included algorithms. CNv4 benefits CPUs. One way to implement this is to have 22 hashing algorithms to choose from (not including CNv4). Look at the last 11 bytes (22 nibbles) of the previous block hash and chain the algorithms in that order, EXCEPT one of the algorithms gets replaced by CNv4. To choose which one, the 22 nibbles get summed, and mod 22. That gives us a random number between 0 and 21. The algo sitting in that position in the sequence of 22 gets swapped out for CNv4.
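Here is a minimal sketch of the CNv4 swap step described above. It assumes the 22-algo sequence has already been chosen and takes it as input, because how 22 algos would be selected from nibbles that only range from 0 to 15 is not spelled out here. The function names and the placeholder algo names are hypothetical.

```python
from typing import List


def cnv4_position(nibbles: List[int]) -> int:
    """Sum the 22 nibbles and reduce mod 22 to get a position 0..21."""
    if len(nibbles) != 22:
        raise ValueError("expected exactly 22 nibbles (11 bytes)")
    return sum(nibbles) % 22


def build_chain(selected_algos: List[str], nibbles: List[int]) -> List[str]:
    """Replace the algo at the computed position with CNv4."""
    chain = list(selected_algos)
    chain[cnv4_position(nibbles)] = "cnv4"
    return chain


# Made-up example: last 22 hex digits of a previous block hash.
tail = "0123456789abcdef012345"
nibbles = [int(ch, 16) for ch in tail]
selected = [f"algo_{i}" for i in range(22)]  # placeholder sequence
chain = build_chain(selected, nibbles)
```

For this example the nibbles sum to 135, and 135 mod 22 is 3, so the fourth algo in the sequence is the one replaced by CNv4.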

Another leading option is using x22r. Similar to x16r, but with 6 more hashing algos. This option has less technical risk as long as the other 6 algos are similar in time profile to the current 16 algos. With the same number of miners, the perceived (published) hash rate drops by about 27% (each hash takes 22 algo rounds instead of 16, or about 37% more work) because we only count the final result hash. This option favors the existing mining base because it doesn’t appreciably increase the benefit to CPU-only mining.

Update 2019–08–23

The definitive existence of an x16r ASIC has been revealed. It is for sale and has been evaluated by some community members. The good news is that the speed differential between a 6 card 1080 Ti rig and the available ASIC is small. It runs around 250MH at 1500W. A 6 card rig runs around 200MH at 1500W. At around $1000, it is cheaper to buy than assembling a rig, and the ASIC is about 25% more efficient in power usage (250MH vs. 200MH at the same 1500W). This ASIC is not a threat to the network.

With that said, there are two considerations. 1) We would prefer that ASICs are not the preferred method of mining on the Ravencoin network. 2) We did a survey recently and the results indicated that the Ravencoin community feels the same.

As discussed above, we have looked at CNv4 with some additional algorithms as a replacement, but the logistical difficulties prevent using that in a timely manner. The mobile wallets from Medici Ventures verify each block by checking the hash and running the header through the x16r algorithm. The new tZero Wallet does the same. Both of these need the new algo change and CNv4 adds additional challenges for mobile.

So the plan changed to making a relatively small modification to the existing algo. At first, it was just going to be a strategically placed XORing of a 512-bit value to modify the output hash. This idea was rejected as we suspect that the routing system is in firmware and might be able to perform the XOR operation with a small time penalty after a firmware update. This is just speculation on our part about the internal workings of the ASIC.

With that in mind, we added a new algorithm called Tiger and planned to put it in as the algorithm for 0xF, or the 16th algo. But Tiger’s output is only 192 bits, while the other algos pass 512-bit values along the chain. So instead of replacing SHA-512, we added Tiger in front of it, and SHA-512 still produces the 512-bit output.

After some discussion with other friendly developers, and the realization that an ASIC could still be used for every block that didn’t include the 0xF algo, we added it to two more. That greatly reduced the times the ASIC could be used. The Tiger hash is also added in front of Keccak-512 and Luffa-512. All other algos stayed the same.
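The resulting stage list can be sketched as follows. This only builds the sequence of stage names for a given set of nibbles and does not compute any hashes. The algorithm list is the commonly cited x16r order (with SHA-512 at 0xF), and the function name is hypothetical. How the 192-bit Tiger output is fed into the following algo is not covered here.

```python
from typing import List

# Commonly cited x16r algorithm order, indexed by nibble value 0..15 (assumption).
X16R_ALGOS = [
    "blake", "bmw", "groestl", "jh", "keccak", "skein", "luffa", "cubehash",
    "shavite", "simd", "echo", "hamsi", "fugue", "shabal", "whirlpool", "sha512",
]

# The three algos that get a Tiger round placed in front of them.
TIGER_PREFIXED = {"keccak", "luffa", "sha512"}


def x16rv2_stages(nibbles: List[int]) -> List[str]:
    """List the hashing stages, inserting 'tiger' before the three chosen algos."""
    stages: List[str] = []
    for n in nibbles:
        name = X16R_ALGOS[n]
        if name in TIGER_PREFIXED:
            stages.append("tiger")
        stages.append(name)
    return stages


example = x16rv2_stages([0xF, 0x0, 0x4])
```

Any block whose nibbles include 0x4, 0x6, or 0xF gets at least one Tiger round, which hardware built for plain x16r cannot compute.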

The probability that a block does not contain the 0xF algorithm is (15/16)¹⁶, which is around 36%. This means the ASIC could be used on more than 1/3 of the blocks. So that is why Tiger is in front of three algos. Now the probability that none of those three algorithms are selected for the block is (13/16)¹⁶, or about 3.6%. We can live with that.
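The arithmetic is easy to check. This sketch assumes each of the 16 nibbles is independent and uniformly distributed over the 16 algos. The function name is my own.

```python
def prob_none_selected(excluded: int, algos: int = 16, picks: int = 16) -> float:
    """Probability that none of `excluded` algos is chosen in `picks` uniform draws."""
    return ((algos - excluded) / algos) ** picks


p_one = prob_none_selected(1)    # (15/16)**16, Tiger in front of one algo
p_three = prob_none_selected(3)  # (13/16)**16, Tiger in front of three algos

print(f"one algo:    {p_one:.1%}")
print(f"three algos: {p_three:.1%}")
```

Going from one Tiger-prefixed algo to three cuts the share of blocks an existing ASIC could still mine from roughly a third to under 4%.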

This change is already coded. There are two GPU miners coded: ccminer from MinerMore and T-Rex miner. We expect there to be others by the transition date. Also, the built-in CPU miner has been updated.

The transition will be done on the timestamp in the block, and for mainnet will be done at 1569945600 or Tue Oct 01 2019 16:00:00 UTC. The timestamp has a range that is allowed in the block. It does not need to be strictly increasing, so it is possible that a block is mined with x16rV2 followed by a block mined with x16r, followed by another x16rV2 block. Only one algo will be allowed to mine a block, and the algo allowed will be dependent on the timestamp in the block header. The advantage of using the timestamp in the block header vs the block height is that the block height is not always known if the block hasn’t been attached to the chain yet.
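Selecting the algo from the header timestamp can be sketched like this. The activation constant is the mainnet value given above. The function name is hypothetical, and treating the boundary as inclusive (a timestamp exactly equal to the activation time uses x16rV2) is an assumption.

```python
from datetime import datetime, timezone

X16RV2_ACTIVATION_TIME = 1569945600  # mainnet activation timestamp from the text


def algo_for_header(header_time: int) -> str:
    """Pick the only algo allowed for a block, based on its header timestamp."""
    # Assumption: the boundary is inclusive.
    return "x16rv2" if header_time >= X16RV2_ACTIVATION_TIME else "x16r"


activation = datetime.fromtimestamp(X16RV2_ACTIVATION_TIME, tz=timezone.utc)

# Header timestamps are not strictly increasing, so consecutive blocks
# near the boundary can alternate between the two algos.
times = [X16RV2_ACTIVATION_TIME + 10,
         X16RV2_ACTIVATION_TIME - 10,
         X16RV2_ACTIVATION_TIME + 20]
algos = [algo_for_header(t) for t in times]
```

Because the check needs nothing but the header itself, a node can decide which algo to verify with before it knows where the block sits in the chain.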

This all sounds kinda complex, but it really isn’t. The algo changed in a pretty easy to adapt way for software miners, and hard for an ASIC that has already been manufactured in silicon.

The plan is to run a transitional test on testnet on Sept. 3rd.

We will make sure the source code and binaries are available for everyone once we’ve ensured it is working as intended.

Update 2019–08–27

Critical Update: The Ravencoin release 2.5.0 is now available. This is a critical update for you to remain on the main chain. We recommend that all nodes update to this release as soon as possible so that you are ready when the algorithm swaps on October 1st. If you do not upgrade, you will not be on the correct chain after Oct 1st 2019 16:00:00 UTC.

Binaries are available here: https://github.com/RavenProject/Ravencoin/releases/tag/v2.5.1

If you would like to compile it yourself, use the master branch from the official Ravencoin GitHub repo.