A lot of smart contracts use the SafeMath library. It prevents contracts from producing incorrect results, but it does so by failing transactions instead of making them correct. Let’s instead try to do the math correctly. In this series, I will derive some advanced techniques. Today, I’ll make a better safeMul.

If you multiply two numbers, the result can be up to twice as long as the inputs. In Ethereum, when you multiply two 256-bit numbers, the result can be up to 512 bits. But Ethereum only gives you the lower half; it simply ignores the rest. This is a common practice in mathematics called modular arithmetic.

However, ignoring numbers is not an acceptable practice in accounting. Care needs to be taken to avoid it or someone will lose something valuable. A popular library called SafeMath detects when it happens and then fails the transaction. But what if you do not want your transaction to fail?
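To make the "detect and fail" behaviour concrete, here is a minimal sketch of a multiplication check in the style of SafeMath. It is not copied from the library: the library name `SafeMulSketch`, the function name, and the revert message are placeholders, and the code assumes Solidity 0.8 or later, where wrapping arithmetic has to be requested with `unchecked`.

```solidity
pragma solidity ^0.8.0;

// Hypothetical illustration, not the actual SafeMath source.
library SafeMulSketch {
    function safeMul(uint256 a, uint256 b) internal pure returns (uint256 c) {
        // Zero times anything is zero; this also avoids dividing by zero below.
        if (a == 0) return 0;
        unchecked {
            // Keeps only the low 256 bits of the 512-bit product.
            c = a * b;
        }
        // If the upper half was non-zero, dividing back does not recover b.
        require(c / a == b, "multiplication overflow");
    }
}
```

The check costs a division and a branch, and when it triggers the whole transaction reverts, which is exactly the behaviour the rest of this post tries to avoid.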

What if you want to multiply any numbers and have the complete result?

Spoiler alert: this snippet of Solidity code will do that for you:

Optimized full 512 bit multiplication in Solidity.

But before we get into that, let’s define the problem precisely: We have two unsigned numbers a and b, both 256 bits in length, and we want their product, a 512-bit number x.

Since this number is too large to be represented directly in code, we split it up into the least significant and most significant 256 bits, r₀ and r₁ respectively:

Where the square brackets with subscript represent the modulo operation and the lower-half brackets on the right represent the floor operation.
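In LaTeX, the definition described here can be written as follows. This is a reconstruction from the surrounding text, using the same symbols x, r₀ and r₁; the last line is the identity that ties the two halves back to the full product.

```latex
\begin{aligned}
x   &= a \cdot b \\
r_0 &= [x]_{2^{256}} \\
r_1 &= \left\lfloor \frac{x}{2^{256}} \right\rfloor \\
x   &= r_1 \cdot 2^{256} + r_0
\end{aligned}
```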

Schoolbook algorithm

The classical way of solving this problem is by long multiplication, the method we all learned in school. You split your large number into decimal digits, multiply the digits, and then add the intermediate results. This method also works in binary and other bases. Let me quickly show you how you would use it here:

Since we have 256-bit multiply built in, we can multiply any two 128-bit numbers and get the full result. So if we split our large numbers into groups of 128 bits, we can compute all their products. Take a₀ and a₁ to respectively mean the least significant and most significant 128 bits of a, and similarly for b:

Now the original numbers a and b can be written as:

If we substitute this into the product equation it becomes:
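Spelled out in LaTeX, the split, the recombination, and the substituted product described in the last three paragraphs are as follows. This is reconstructed from the definitions of a₀, a₁, b₀ and b₁ given in the text, not taken from the original figures.

```latex
\begin{aligned}
a_0 &= [a]_{2^{128}}, & a_1 &= \left\lfloor a / 2^{128} \right\rfloor \\
b_0 &= [b]_{2^{128}}, & b_1 &= \left\lfloor b / 2^{128} \right\rfloor \\[4pt]
a &= a_1 \cdot 2^{128} + a_0, & b &= b_1 \cdot 2^{128} + b_0 \\[4pt]
x &= a \cdot b
   = a_1 b_1 \cdot 2^{256}
   + \left( a_1 b_0 + a_0 b_1 \right) \cdot 2^{128}
   + a_0 b_0
\end{aligned}
```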

Ignoring the constants, we now have four multiplications instead of one. But all four of them involve numbers less than 2¹²⁸ that can be computed directly. The result is still too large, so we still need two numbers r₀ and r₁ to represent it. I will skip the steps of how to get r₀ and r₁ from this expression. It is straightforward, but annoying because of the shifts and carries. The final result is:

Schoolbook algorithm for 512 bit multiplication.

(Note that Solidity, as of 0.4.18, actually fails to compile the above example because the compiler cannot handle that many local variables. This is easily solved by inlining some expressions, but since it reduces readability I opted not to do that for this example.)
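For readers who want something to experiment with, here is one way the schoolbook method could be written. It is a sketch, not the original listing: it assumes Solidity 0.8 or later (hence the `unchecked` block), the library name `SchoolbookSketch` is a placeholder, and the accumulation is arranged to keep the number of locals small. The names i01 and i10 are assumed to stand for the cross products a₀b₁ and a₁b₀. The operation counts and gas figures quoted in this post refer to the original listing, not to this sketch.

```solidity
pragma solidity ^0.8.0;

// Hypothetical illustration of the schoolbook method.
library SchoolbookSketch {
    function mul512(uint256 a, uint256 b)
        internal pure returns (uint256 r0, uint256 r1)
    {
        unchecked {
            // Split both inputs into 128-bit halves.
            uint256 a0 = uint128(a); // explicit conversion keeps the low 128 bits
            uint256 a1 = a >> 128;
            uint256 b0 = uint128(b);
            uint256 b1 = b >> 128;

            // Four partial products; each factor is below 2**128,
            // so none of these multiplications can overflow.
            r0 = a0 * b0;            // contributes at weight 1
            r1 = a1 * b1;            // contributes at weight 2**256
            uint256 i01 = a0 * b1;   // contributes at weight 2**128
            uint256 i10 = a1 * b0;   // contributes at weight 2**128

            // The upper halves of the cross products belong to r1.
            r1 += (i01 >> 128) + (i10 >> 128);

            // The lower halves, shifted up, belong to r0; a wrap-around
            // in r0 is a carry into r1.
            uint256 t = i01 << 128;
            r0 += t;
            if (r0 < t) r1 += 1;

            t = i10 << 128;
            r0 += t;
            if (r0 < t) r1 += 1;
        }
    }
}
```

None of the additions into r1 can overflow, because the true upper half of a product of two 256-bit numbers is at most 2²⁵⁶ − 2 and every term added is non-negative.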

The two multiplications for i01 and i10 can be replaced by one using the Karatsuba algorithm, at the expense of a few more additions. Since additions are 3 gas and multiplications only 5, this is not worth it. But if you want to do larger multiplications (say 4096 bit) it is worth looking into these methods.
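The identity behind that replacement, assuming i01 and i10 denote the cross products a₀b₁ and a₁b₀, is the standard Karatsuba step:

```latex
a_0 b_1 + a_1 b_0 = (a_0 + a_1)(b_0 + b_1) - a_0 b_0 - a_1 b_1
```

The products a₀b₀ and a₁b₁ are needed anyway, so only one extra multiplication remains. Note that a₀ + a₁ and b₀ + b₁ can each be 129 bits long, so their product may not fit in 256 bits, which adds further bookkeeping on top of the extra additions.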

We have now solved the problem using two modulo operations, four divisions, six additions, two conditional branches, and no fewer than six multiplications. The entire function takes a bit over 300 gas. This is not bad, but the gas cost is almost two orders of magnitude larger than the 5 gas for a regular multiplication, and more than three times the 90 gas for a standard safeMul.

We can do a lot better.

Chinese Remainder

So here’s the trick: We use the rather obscure mulmod instruction and the Chinese Remainder Theorem. In short, the theorem states that if we know a number modulo 2²⁵⁶ and modulo 2²⁵⁶ − 1, we can compute its 512-bit representation cheaply. The function to do this, chineseRemainder, is described in a previous post. To use it here, we first need to compute our product in the two moduli:
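In LaTeX, the two residues and the reason they determine the upper half can be sketched as follows. The derivation is reconstructed here from the definitions of r₀ and r₁ above, not quoted from the earlier post; it relies on the fact that 2²⁵⁶ leaves remainder 1 when divided by 2²⁵⁶ − 1.

```latex
\begin{aligned}
x_0 &= [a b]_{2^{256}}, \qquad x_1 = [a b]_{2^{256} - 1} \\[4pt]
x   &= r_1 \cdot 2^{256} + r_0
      \quad\text{and}\quad 2^{256} \equiv 1 \pmod{2^{256} - 1} \\[4pt]
x_0 &= r_0 \\
x_1 &\equiv r_1 + r_0 \pmod{2^{256} - 1} \\
r_1 &\equiv x_1 - x_0 \pmod{2^{256} - 1}
\end{aligned}
```

Since r₁ is at most 2²⁵⁶ − 2 for a product of two 256-bit numbers, the last congruence pins it down uniquely.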

The first one, x₀, is just a regular multiply, as it already truncates to 256 bits. The second one, x₁, can be computed directly using a single mulmod operation. This is a little-known opcode that computes:
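The difference from an ordinary multiply followed by a modulo is that mulmod does not truncate the intermediate product. A small sketch, assuming Solidity 0.8 or later and using the hypothetical contract name `MulmodDemo`, makes this visible:

```solidity
pragma solidity ^0.8.0;

// Hypothetical demonstration contract.
contract MulmodDemo {
    function compare() external pure returns (uint256 truncated, uint256 exact) {
        uint256 a = 2**255;
        uint256 b = 2;
        uint256 m = type(uint256).max; // 2**256 - 1

        unchecked {
            // a * b is 2**256, which wraps to 0 before the modulo is taken.
            truncated = (a * b) % m;
        }

        // mulmod works on the full product: 2**256 mod (2**256 - 1) = 1.
        exact = mulmod(a, b, m);
    }
}
```

Here `truncated` comes out as 0 and `exact` as 1, which is the information about the upper half that the plain multiply throws away.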

Put this together, and we have our new mul512 function:

512-bit multiplication.
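A sketch of how the two pieces could fit together is shown below. The helper `chineseRemainder` here is a stand-in written from the description in this post (two subtractions and one comparison); the version in the earlier post may differ in signature and details. The library name `FullMathSketch` is a placeholder, and the code assumes Solidity 0.8 or later.

```solidity
pragma solidity ^0.8.0;

// Hypothetical illustration; helper signatures are assumptions.
library FullMathSketch {
    uint256 internal constant M1 = type(uint256).max; // 2**256 - 1

    // Recover (r0, r1) from x0 = x mod 2**256 and x1 = x mod (2**256 - 1).
    function chineseRemainder(uint256 x0, uint256 x1)
        internal pure returns (uint256 r0, uint256 r1)
    {
        unchecked {
            r0 = x0;
            // r1 = x1 - x0 modulo 2**256 - 1, done in wrapping 256-bit math:
            // a borrow means we wrapped around 2**256 and must subtract one more.
            r1 = x1 - x0;
            if (x1 < x0) r1 -= 1;
        }
    }

    function mul512(uint256 a, uint256 b)
        internal pure returns (uint256 r0, uint256 r1)
    {
        uint256 x0;
        unchecked {
            x0 = a * b;                // product modulo 2**256
        }
        uint256 x1 = mulmod(a, b, M1); // product modulo 2**256 - 1
        return chineseRemainder(x0, x1);
    }
}
```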

The Solidity compiler, as of version 0.4.18, does not produce optimal code here. The chineseRemainder function is so tiny it is not worth the call overhead, so it should be inlined, but the compiler doesn’t do this. The compiler does recognize that M1 can be expressed efficiently as not(0). Manually inlining results in an efficient multiplication function:

Optimized full 512 bit multiplication in Solidity.

It is our chineseRemainder function (two subs and one lt) with a mul and mulmod added. We use assembly to avoid an unnecessary branch. The total gas cost is about 60 gas, compared to 5 for a normal multiply and 90 for a standard safeMul. In fact, it is slightly cheaper to use mul512 and check that r1 is zero than it is to use safeMul!
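Based on that description, the inlined function and the "check that r1 is zero" idea might look like the sketch below. It is written from the description in this post rather than copied from the original listing. The wrapper `mulOrRevert`, the library name `Mul512Sketch`, and the revert message are hypothetical, and the pragma assumes Solidity 0.8 or later, although the assembly body itself does not depend on that.

```solidity
pragma solidity ^0.8.0;

// Hypothetical illustration.
library Mul512Sketch {
    function mul512(uint256 a, uint256 b)
        internal pure returns (uint256 r0, uint256 r1)
    {
        assembly {
            // Product modulo 2**256 - 1; not(0) is 2**256 - 1.
            let mm := mulmod(a, b, not(0))
            // Product modulo 2**256.
            r0 := mul(a, b)
            // Branch-free recombination: subtract, then subtract the borrow.
            r1 := sub(sub(mm, r0), lt(mm, r0))
        }
    }

    // A safeMul-like wrapper: fails only if the product needs more than 256 bits.
    function mulOrRevert(uint256 a, uint256 b) internal pure returns (uint256 r0) {
        uint256 r1;
        (r0, r1) = mul512(a, b);
        require(r1 == 0, "multiplication overflow");
    }
}
```

Callers that can handle a 512-bit result use `mul512` directly and never fail; callers that need a single word can fall back to the wrapper.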

Conclusion