Speeding up elliptic curve multiplication through scalar decomposition for w-NAF

This technique, which gives us two EC multiplications for the price of one, will be crucial for Witnet, as it keeps the gas cost of some parts of the protocol to a minimum.

I have recently been working on an algorithm for decomposing a scalar for w-NAF (width-w Non-Adjacent Form) simultaneous multiplication. The ultimate goal is to speed up point multiplication on elliptic curves.

The simplest algorithm for elliptic curve point multiplication is double-and-add (see [2]), which uses the binary representation of the scalar we want to multiply by. Its cost depends directly on the bit length of the scalar: roughly one point doubling per bit plus one point addition per nonzero bit. This makes point multiplication by a 256-bit scalar quite expensive.

Figure: the double-and-add algorithm.
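As a rough illustration of double-and-add, here is a minimal Python sketch. The toy curve y² = x³ + 2x + 2 over F₁₇ with base point (5, 1) of order 19 is a small textbook example picked only for readability, and the helper names `ec_add` and `double_and_add` are hypothetical. None of this is taken from the Witnet code base.

```python
# Toy parameters (assumed for illustration): y^2 = x^3 + 2x + 2 over F_17.
P_MOD, A = 17, 2
G = (5, 1)  # base point of order 19 on this toy curve


def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the negative of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def double_and_add(k, point):
    """Left-to-right double-and-add: one doubling per bit, one addition per 1 bit."""
    result = None
    for bit in bin(k)[2:]:
        result = ec_add(result, result)      # always double
        if bit == "1":
            result = ec_add(result, point)   # add only when the bit is set
    return result
```

The loop runs once per bit of k, which is why the bit length of the scalar drives the cost.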

There are other point multiplication algorithms that use the NAF or w-NAF representation of the scalar, which has fewer nonzero digits on average than the binary representation, so fewer point additions are needed and the computational cost is lower. Still, for long scalars the cost of the algorithm remains high.
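To see why the NAF family helps, the following sketch computes the width-w NAF digits of a scalar. The function name `wnaf` and its default width are choices made for this illustration; with w = 2 it produces the ordinary NAF with digits in {-1, 0, 1}.

```python
def wnaf(k, w=2):
    """Return the width-w NAF digits of k >= 0, least significant digit first.

    Every nonzero digit is odd and lies in (-2^(w-1), 2^(w-1)), and it is
    followed by at least w - 1 zero digits.
    """
    digits = []
    while k > 0:
        if k & 1:
            d = k % (1 << w)          # k mod 2^w
            if d >= 1 << (w - 1):
                d -= 1 << w           # pick the signed residue instead
            k -= d                    # now k is divisible by 2^w
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits
```

For example, `wnaf(255)` has only two nonzero digits (255 = 2⁸ − 1), whereas the binary expansion of 255 has eight, so a NAF-based multiplication would need two point additions instead of eight.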

Section 3.5 of [1] describes an algorithm for speeding up point multiplication. The main idea is very simple: decompose the scalar we need to multiply by into two scalars with about half as many bits. This way, instead of a multiplication by a 256-bit scalar, we just need to compute two 128-bit multiplications, which can be carried out simultaneously. Note that the multiplication by a single scalar cannot be parallelized, as the algorithm is state-dependent: each step uses the result of the previous one.
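The gain from running two half-length multiplications together comes from sharing the point doublings. The sketch below shows this with plain binary digits (often called Shamir's trick) rather than w-NAF, to keep it short. The toy curve over F₁₇ and all function names are assumptions made for the example.

```python
P_MOD, A = 17, 2   # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumed for illustration)
G = (5, 1)         # point of order 19 on the toy curve


def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    return (x3, (slope * (x1 - x3) - y1) % P_MOD)


def ec_mul(k, point):
    """Plain double-and-add, used here only as a reference."""
    result = None
    for bit in bin(k)[2:]:
        result = ec_add(result, result)
        if bit == "1":
            result = ec_add(result, point)
    return result


def simultaneous_mul(a, p, b, q):
    """Compute a*p + b*q (a, b >= 0) with one shared chain of doublings."""
    table = {(1, 0): p, (0, 1): q, (1, 1): ec_add(p, q)}  # p + q precomputed once
    result = None
    for i in range(max(a.bit_length(), b.bit_length()) - 1, -1, -1):
        result = ec_add(result, result)       # one doubling serves both scalars
        bits = ((a >> i) & 1, (b >> i) & 1)
        if bits != (0, 0):
            result = ec_add(result, table[bits])
    return result
```

With two scalars of about 128 bits each, the loop performs about 128 doublings instead of the 256 needed for a single 256-bit scalar.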

More precisely, suppose that we want to compute a point multiplication kP, where P is a point of an elliptic curve E and k is a scalar smaller than the group order n. Then there exist two integers k₁, k₂, each of approximately half the bit length of k (and possibly negative), such that

    k ≡ k₁ + k₂λ (mod n)

and

    kP = k₁P + k₂λP = k₁P + k₂ɸ(P).

The factor λ is a root modulo n of the characteristic polynomial of an efficiently computable endomorphism ɸ of E, so that ɸ(P) = λP.
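To make the roles of λ and ɸ concrete, here is a small check on secp256k1. The choice of curve is an assumption of this sketch (the text above does not name one). On this curve the map ɸ(x, y) = (βx, y), with β a nontrivial cube root of unity modulo p, is an endomorphism that acts as multiplication by a cube root of unity λ modulo n. Rather than hard-coding λ and β, the sketch derives them, and all helper names are hypothetical.

```python
# secp256k1 domain parameters. The curve is an assumption of this sketch.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141
G = (
    0x79BE667E_F9DCBBAC_55A06295_CE870B07_029BFCDB_2DCE28D9_59F2815B_16F81798,
    0x483ADA77_26A3C465_5DA4FBFC_0E1108A8_FD17B448_A6855419_9C47D08F_FB10D4B8,
)


def ec_add(p1, p2):
    """Affine addition on y^2 = x^3 + 7; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P)
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Plain double-and-add."""
    result = None
    for bit in bin(k)[2:]:
        result = ec_add(result, result)
        if bit == "1":
            result = ec_add(result, point)
    return result


def cube_root_of_unity(m):
    """Return an element of order 3 modulo the prime m (needs m = 1 mod 3)."""
    g = 2
    while pow(g, (m - 1) // 3, m) == 1:
        g += 1
    return pow(g, (m - 1) // 3, m)


LAMBDA = cube_root_of_unity(N)   # satisfies LAMBDA^2 + LAMBDA + 1 = 0 (mod N)
BETA = cube_root_of_unity(P)     # candidate beta with beta^3 = 1 (mod P)
if ec_mul(LAMBDA, G) != (BETA * G[0] % P, G[1]):
    BETA = BETA * BETA % P       # the other cube root is the one matching LAMBDA


def phi(point):
    """Endomorphism (x, y) -> (beta*x, y): one field multiplication."""
    x, y = point
    return (BETA * x % P, y)


# Two arbitrary 128-bit scalars and the 256-bit scalar they represent.
k1, k2 = 0x1234567890ABCDEF1234567890ABCDEF, 0xFEDCBA0987654321FEDCBA0987654321
k = (k1 + k2 * LAMBDA) % N
lhs = ec_mul(k, G)                                   # one long multiplication
rhs = ec_add(ec_mul(k1, G), ec_mul(k2, phi(G)))      # two short ones
```

Here `lhs` and `rhs` are the same point, and evaluating ɸ costs a single multiplication in the base field instead of a full scalar multiplication by λ.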

Decomposing the scalar k

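The idea of the algorithm is the following: let f be the function from ℤ×ℤ to the integers modulo n such that, for every vector v = (a, b) in ℤ×ℤ,

    f(v) = a + bλ (mod n).

Then we just need to find a short vector u = (k₁, k₂) in ℤ×ℤ such that f(u) = k. This way

    k ≡ k₁ + k₂λ (mod n), and therefore kP = k₁P + k₂ɸ(P).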

Let’s construct u. There are two parts for this goal:

1- First we need to find two vectors v₁ = (a₁, b₁), v₂ = (a₂, b₂) in ℤ×ℤ such that:

v₁ and v₂ are independent over ℝ;

f(v₁) = f(v₂) = 0;

both have small Euclidean norm, that is, a₁² + b₁² ≈ n, and similarly for v₂. This condition guarantees that the components k₁ and k₂ have about half the bit length of k.

2- Since v₁, v₂ are independent over ℝ, we know there exist α₁, α₂ in ℚ such that (k,0)= α₁v₁ +α₂v₂.

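Let c₁ = [α₁] and c₂ = [α₂] be the integers closest to α₁ and α₂, respectively, and consider v = c₁v₁ + c₂v₂. Since f is additive and f(v₁) = f(v₂) = 0, we have f(v) = 0, so the vector u = (k, 0) − v satisfies f(u) = f(k, 0) − f(v) = k, which is what we needed. Moreover, u = (α₁ − c₁)v₁ + (α₂ − c₂)v₂ is short, because both coefficients are at most 1/2 in absolute value. Finally, the decomposition of k is given by

    (k₁, k₂) = u = (k, 0) − c₁v₁ − c₂v₂.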

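But how do we find the vectors v₁ and v₂? For this purpose we apply the extended Euclidean algorithm [1] to n and λ. This algorithm produces a sequence of equations

    sᵢn + tᵢλ = rᵢ,  i = 0, 1, 2, …

where s₀ = 1, t₀ = 0, r₀ = n and s₁ = 0, t₁ = 1, r₁ = λ. The remainders rᵢ are non-negative and strictly decreasing, while the absolute values of sᵢ and tᵢ are increasing (|tᵢ| < |tᵢ₊₁| for i ≥ 0 and |sᵢ| < |sᵢ₊₁| for i ≥ 1). Each equation gives f(rᵢ, −tᵢ) = rᵢ − tᵢλ ≡ 0 (mod n), so every vector (rᵢ, −tᵢ) is a candidate. Furthermore, for every i ≥ 1,

    rᵢ₋₁|tᵢ| + rᵢ|tᵢ₋₁| = n.

Let rₗ be the smallest remainder such that rₗ ≥ √n. Then we choose as vectors v₁ and v₂

    v₁ = (rₗ₊₁, −tₗ₊₁),  v₂ = (rₗ, −tₗ) or (rₗ₊₂, −tₗ₊₂), whichever is shorter.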

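If we write v₁ = (a₁, b₁) and v₂ = (a₂, b₂), then we have c₁ = [b₂k/n] and c₂ = [−b₁k/n]. These formulas come from solving (k, 0) = α₁v₁ + α₂v₂ when the determinant a₁b₂ − a₂b₁ equals n; if it equals −n, both signs flip. The vector v is the sum v = c₁v₁ + c₂v₂, and u = (k, 0) − v is

    u = (k₁, k₂) = (k − c₁a₁ − c₂a₂, −c₁b₁ − c₂b₂).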

This concludes the algorithm since the components of u are the decomposition of k.
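The whole procedure fits in a few lines of Python. This is only a sketch of the steps described above, not Witnet's implementation: the function names are hypothetical, the secp256k1 group order is an assumed example, and λ is derived as a cube root of unity modulo n instead of being hard-coded. One deliberate deviation: the rounding divides by the determinant a₁b₂ − a₂b₁ (which is n or −n) rather than by n, so that the signs come out right in both cases.

```python
def short_vectors(n, lam):
    """Extended Euclidean algorithm on (n, lam); returns two short kernel vectors.

    Every row satisfies s*n + t*lam = r, hence r - t*lam = 0 (mod n).
    """
    rows = [(n, 0), (lam, 1)]                 # pairs (r_i, t_i); s_i is not needed
    while rows[-1][0] != 0:
        (r0, t0), (r1, t1) = rows[-2], rows[-1]
        q = r0 // r1
        rows.append((r0 - q * r1, t0 - q * t1))
    # l is the greatest index whose remainder is still >= sqrt(n).
    l = max(i for i, (r, _) in enumerate(rows) if r * r >= n)
    v1 = (rows[l + 1][0], -rows[l + 1][1])
    candidates = [(rows[l][0], -rows[l][1])]
    if l + 2 < len(rows):
        candidates.append((rows[l + 2][0], -rows[l + 2][1]))
    v2 = min(candidates, key=lambda v: v[0] ** 2 + v[1] ** 2)   # the shorter one
    return v1, v2


def round_div(a, b):
    """Nearest integer to a / b, computed with exact integer arithmetic."""
    if b < 0:
        a, b = -a, -b
    return (2 * a + b) // (2 * b)


def decompose(k, n, lam):
    """Return (k1, k2) with k = k1 + k2*lam (mod n) and both components short."""
    (a1, b1), (a2, b2) = short_vectors(n, lam)
    det = a1 * b2 - a2 * b1                   # equals n or -n
    c1 = round_div(b2 * k, det)               # nearest integer to alpha_1
    c2 = round_div(-b1 * k, det)              # nearest integer to alpha_2
    k1 = k - c1 * a1 - c2 * a2
    k2 = -c1 * b1 - c2 * b2
    return k1, k2


def cube_root_of_unity(m):
    """Return an element of order 3 modulo the prime m (needs m = 1 mod 3)."""
    g = 2
    while pow(g, (m - 1) // 3, m) == 1:
        g += 1
    return pow(g, (m - 1) // 3, m)


# Example: the secp256k1 group order (an assumed choice of curve).
N = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141
LAM = cube_root_of_unity(N)
k = 0xC0FFEE_0123456789ABCDEF_0123456789ABCDEF_0123456789ABCDEF_0123456789
k1, k2 = decompose(k, N, LAM)
print(k1.bit_length(), k2.bit_length())       # each roughly half of 256
```

Note that k₁ and k₂ may come out negative. In that case the corresponding point is negated before the simultaneous multiplication, which is cheap on elliptic curves.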

As mentioned at the beginning of this post, this algorithm makes point multiplication much more efficient by reducing its computational cost. In Witnet, we need to perform expensive elliptic curve arithmetic inside Ethereum so as to verify eligibility proofs, and thus, every single optimization is key to reducing the cost of using the protocol. This scalar decomposition technique is definitely a great candidate to help us build more efficient software.