Let's put the pieces of the puzzle together now and implement a simple RSA cryptosystem in Python. As we will see, each piece of the implementation ends up being related to interesting and well-studied areas of mathematics and computer science.

Key Generation and Primality Tests

To start, our key generation requires us to generate two primes $p$ and $q$. We noted above that if $n=pq$ is small enough then an attacker can simply factor $n$ to get the two primes, use them to calculate $\phi(n)$, and from there calculate $d=e^{-1}$. Hence we need $n$, and therefore $p$ and $q$, to be large.
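To make this concrete, here is a small sketch of the attack (the helper names are my own, and the toy key is far smaller than anything used in practice): with a small modulus, factoring $n$ immediately yields the private exponent.

```python
# Toy attack sketch: with a small modulus, simply factor n to break RSA.
# The key (n, e) below is illustrative; real moduli are hundreds of digits.

def factor(n):
    """Find the smallest nontrivial factor of n by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def modinv(e, m):
    """Modular inverse of e mod m via the extended Euclidean algorithm."""
    x0, x1, a, b = 1, 0, e, m
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
    return x0 % m

n, e = 589, 299          # a tiny public key; n = 19 * 31
p = factor(n)
q = n // p
phi = (p - 1) * (q - 1)
d = modinv(e, phi)       # the recovered private exponent
print(p, q, d)           # -> 19 31 419
```

The whole attack is trial division plus one inverse computation, which is exactly why the primes must be large enough that factoring $n$ is infeasible.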

To generate a large prime $p$, we start by deciding the number of bits we require in the prime. This number is one of the key parameters that determines the strength of our cryptosystem. Let's call this number $b$ and let's assume for simplicity that $b > 8$ for all our intended purposes. In practice, $b$ is usually between $512$ and $2048$. As I typed this, I decided to check Facebook and GMail to see what key strength they use in their secure TLS connections. Let's look at Facebook first:

```
OpenSSL> s_client -connect facebook.com:443
CONNECTED(00000003)
[...]
New, TLSv1/SSLv3, Cipher is AES128-GCM-SHA256
Server public key is 2048 bit
```

So $2048$-bit keys are used here. This means each prime is $1024$ bits long. And GMail:

```
OpenSSL> s_client -connect mail.google.com:443
CONNECTED(00000003)
[...]
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 1024 bit
```

So GMail is using $1024$-bit keys, meaning each prime is $512$ bits long.

To generate a $b$-bit prime $p$, we start by generating $b$ bits of random data, and then manually set both the leading and trailing bits to one. This ensures that the number is greater than $2^{b-1}$ (we don't want to generate, say, $3$ by chance, no matter how small the probability) and that it is odd. (Side note on the entropy of the primes generated: forcing these bits to $1$ reduces the true entropy of the primes; more on this in a bit.) Let $t$ denote the number given by the manipulated random bits as described above. We then check $t+i$ for primality, starting from $i=0$ and going up. As soon as a prime is reached, we let $p$ equal it. It is generally a good idea to give up on an interval after a certain number of attempts. To do this effectively, we need to consider the Prime Number Theorem which, letting $\pi(n)$ denote the number of primes less than or equal to $n$, tells us:

$$\lim_{n\to\infty}\frac{\pi(n)}{n/\ln(n)}=1.$$

In other words, there are roughly $n/\ln(n)$ primes smaller than or equal to $n$, and the larger $n$ gets, the better this estimate becomes. This translates directly to the heuristic that around $2^b$, for $b > 8$ or so, a prime appears in about every $b$ consecutive numbers. However, this is only a heuristic, and large gaps between primes exist. To avoid being trapped in one of these large gaps, we decide to give up on $t$ after, say, $2b$ of the numbers consecutive to it have been tested and failed. In such a case, we simply generate $b$ new random bits and repeat the process all over. The end result ends up looking like this:

```python
import itertools
import random

def generate_random_prime(bits, primality_test):
    """Generate a random prime with the given number of bits."""
    get_random_t = lambda: random.getrandbits(bits) | 1 << bits | 1
    p = get_random_t()
    for i in itertools.count(1):
        if primality_test(p):
            return p
        else:
            if i % (bits * 2) == 0:
                p = get_random_t()
            else:
                p += 2  # Add 2 since we are only interested in odd numbers
```

This function gets into an infinite loop if there are no $b$-bit prime numbers. This can only happen for small $b$, so we ignore this possible source of bugs at this point in favour of an efficient implementation.
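As a quick sanity check on the Prime Number Theorem heuristic, here is a small sketch (the window and starting point are arbitrary choices of mine) that counts the primes in a window just above $2^{20}$ and compares the count with the predicted density of $1/\ln(2^{20})$:

```python
import math

def is_prime_trial(n):
    """Straightforward trial-division primality check (fine for small n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

start, window = 2 ** 20, 10000
count = sum(1 for t in range(start, start + window) if is_prime_trial(t))
predicted = window / math.log(start)  # about one prime every ln(2^b) numbers
print(count, round(predicted))
```

The observed count lands close to the prediction of roughly $721$, which is why giving up after $2b$ consecutive candidates is a reasonable cutoff.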

Next, we notice that the primality test is the very core of the algorithm above. Let's first try a very simple primality test and see how it performs. To measure the performance, the logged decorator from the Python decorators post is used. First the primality test:

```python
@logged("%b %d %Y - %H:%M:%S")
def simple_is_prime(n):
    """Returns True if n is a prime. False otherwise."""
    if n % 2 == 0:
        return n == 2
    if n % 3 == 0:
        return n == 3
    k = 6L
    while (k - 1) ** 2 <= n:
        if n % (k - 1) == 0 or n % (k + 1) == 0:
            return False
        k += 6
    return True
```

This is a somewhat more evolved version of the algorithm that checks $n$ for divisibility by all numbers smaller than or equal to $\lfloor\sqrt{n}\rfloor$. Our version first tests $2$ and $3$, and afterwards only numbers of the form $k=6i\pm1$, since these are the only candidates not divisible by $2$ or $3$. Let's look at the efficiency:

```
>>> for b in [8, 16, 32, 44, 45, 46]:
...     print "Generating %d-bit prime: " % b
...     print generate_random_prime(b, simple_is_prime)
...
Generating 8-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
 - Finished 'generate_random_prime', execution time = 0.000s
337
Generating 16-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
 - Finished 'generate_random_prime', execution time = 0.000s
129737
Generating 32-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
 - Finished 'generate_random_prime', execution time = 0.027s
6088203911
Generating 44-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
 - Finished 'generate_random_prime', execution time = 0.876s
34009683800003
Generating 45-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:43
 - Finished 'generate_random_prime', execution time = 1.139s
44586027757781
Generating 46-bit prime:
 - Running 'generate_random_prime' on Aug 21 2013 - 13:53:44
 - Finished 'generate_random_prime', execution time = 2.215s
87012671597869
```

On my computer, around $b=42$ is when the time required to generate a prime becomes noticeable. Of course, one run is not a very accurate measurement at all because of the randomness in the generation algorithm. However, we can easily see that while the algorithm above runs in $O(\sqrt{n})$ time as a function of $n$, that is $O(2^{b/2})$ as a function of $b$. In other words, it is exponential in $b$.

Fortunately, the problem of determining whether a number is prime is in $P$. This means there is a deterministic polynomial-time algorithm for it (polynomial in $b$, not in $n$, as discussed above). The first deterministic algorithm to accomplish this is called AKS, named after the authors of the seminal paper "PRIMES is in P" by Agrawal, Kayal, and Saxena. You can read the original paper here.

However, practical implementations of this algorithm end up being quite inefficient, though still better than our simple algorithm above for large numbers. One of the better algorithms is Rabin-Miller, which is also polynomial time in $b$, but unlike AKS, Rabin-Miller is not deterministic. This means that the algorithm makes use of random numbers, and there is a chance it might not return the correct result at the end. With the proper choice of parameters, though, Rabin-Miller can reduce the probability of a false positive to astronomically small levels while remaining quite efficient.

The basic idea of the Rabin-Miller algorithm is to notice that if $n$ is prime then for any $0<x<n$ we have $x^{n-1}\equiv 1 \pmod{n}$, as we already saw. Let $n-1=2^sm$ where $m$ is odd. Now consider the finite sequence of $s+1$ numbers

$$x_i = x^{2^i m} \bmod n \quad \text{for } 0 \le i \le s.$$

Notice that the last number in the sequence is $x_s = x^{2^s m} = x^{n-1} \equiv 1 \pmod{n}$. The sequence was defined so that each number is the previous one squared, modulo $n$; that is, $x_{i+1} = x_i^2 \bmod n$. Because of this, if $x_0 = 1$ or $x_0 \equiv -1$ then $x_i = 1$ for all $i > 0$, and there is not much information we can gain. Suppose that is not the case, and that $n$ is prime. Then $2$ divides the multiplicative order of $x$ (because $x^{2^i m} \equiv 1 \pmod{n}$ for some $i > 0$ while $x^m \not\equiv 1$), which means $-1$ is in the multiplicative subgroup generated by $x$. This follows from the fact that $-1$ is the unique element of $\mathbb{Z}_n$ with order $2$ (because $x^2-1$ cannot have more than two roots) and Cauchy's theorem. Hence we must have $x_i \equiv -1$ for some $i$. If no such $i$ exists, then $n$ is certainly not a prime. This is the main condition used in Rabin-Miller.

If $x$ gives us a sequence $x_i$ that does not satisfy the above requirement, then $x$ is said to be a witness to $n$'s compositeness, because we can use $x$ to prove $n$ is composite. That is, $x$ testifies that $n$ is a composite, hence the choice of word for it.
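To see the condition in action, here is a small sketch using $n = 561$, a Carmichael number: it passes the plain Fermat test for every base coprime to it, yet the squaring sequence exposes it, making $x = 2$ a witness:

```python
# n = 561 = 3 * 11 * 17 is composite, yet 2^560 = 1 (mod 561).
n = 561
assert pow(2, n - 1, n) == 1  # the Fermat test alone is fooled

# Write n - 1 = 2^s * m with m odd, then build the squaring sequence.
s, m = 0, n - 1
while m % 2 == 0:
    s += 1
    m //= 2
x = 2
sequence = [pow(x, 2 ** i * m, n) for i in range(s + 1)]
print(s, m, sequence)  # -> 4 35 [263, 166, 67, 1, 1]
```

The sequence reaches $1$ without ever passing through $n-1 = 560$, so $x=2$ is a witness and $561$ is composite.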

In algorithm form we get the following.

Rabin-Miller Algorithm. Given $n$, first calculate $s$ and $m$ such that $n-1 = 2^s m$ with $m$ odd. Then pick a random $0 < x < n$ and calculate $x_0 = x^m \bmod n$. If $x_0$ is either $1$ or $-1$ then $x$ is not a witness; try another $x$. Otherwise, calculate $x_1 = x_0^2 \bmod n$, $x_2 = x_1^2 \bmod n$, up to $x_s = x_{s-1}^2 \bmod n$, each by squaring the previous term modulo $n$. If $1$ is encountered before $-1$ then $x$ is a witness and $n$ is composite. Otherwise, if $x_s = 1$ and $-1$ was encountered before, then $x$ is not a witness; try another $x$. Once "enough" $x$'s have been tried without any witnesses being found, we conclude that $n$ is "most likely" a prime.

Why is it that at the end we can't conclude with certainty that $n$ is prime? Suppose we pick an $x$ and the generated sequence $x_i$ satisfies the requirements. Does that mean $n$ is prime? The answer is no, because for some composite $n$, certain values of $x$ will satisfy the requirement too. However, if we pick many possible witnesses and try them all, and they all fail to testify against $n$, then we have more and more evidence toward $n$ being prime. If we checked all such $x$ we could be certain, but then the algorithm would be exponential in $b$ again, which is what we are avoiding. However, if we have some idea of the probability of a random $x$ being a witness for a composite $n$, then we can settle on a reasonable number of random attempts before declaring that $n$ is almost certainly a prime.

And this is the key part of Rabin-Miller's paper, where they prove that for a given odd composite $n$, at least $3/4$ of the numbers $0<x<n$ are witnesses to $n$ being composite. (The proof of this claim is rather involved, but have a look at the original paper if you are feeling brave!) This means that if $n$ is composite and we pick $k$ random possible witnesses, the probability that none of them is a witness is at most $(\frac{1}{4})^k$. So picking a $k$ value of, say, $50$ gives a probability of at most $(\frac{1}{2})^{100}$ of outputting a false positive, that is, outputting PRIME when the number is composite. This is good enough for our purposes here.
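The $3/4$ bound can be checked exhaustively for a small composite. Here is a sketch (the helper name is mine) counting the witnesses for $n = 221 = 13 \cdot 17$ among all $0 < x < n$:

```python
def is_strong_liar(x, n, s, m):
    """True if x fails to witness n's compositeness (n - 1 = 2^s * m, m odd)."""
    xi = pow(x, m, n)
    if xi == 1 or xi == n - 1:
        return True
    for _ in range(s - 1):
        xi = (xi * xi) % n
        if xi == n - 1:
            return True
    return False

n = 221  # 13 * 17, an odd composite
s, m = 0, n - 1
while m % 2 == 0:
    s += 1
    m //= 2
witnesses = [x for x in range(1, n) if not is_strong_liar(x, n, s, m)]
print(len(witnesses), n - 1)  # well over three quarters are witnesses
```

So a handful of random $x$'s already makes a false PRIME verdict extremely unlikely for this $n$, in line with the bound above.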

Also, in practice, one can make the Rabin-Miller algorithm much more efficient by implementing a basic test of divisibility by, say, the first $100$ primes before moving on to Rabin-Miller proper.

Let's look at a Python implementation below.

```python
def rabin_miller_is_prime(n, k=20):
    """
    Test n for primality using the Rabin-Miller algorithm, with k random
    witness attempts. A False return value means n is certainly a composite.
    A True return value indicates n is *probably* a prime. The false positive
    probability is reduced exponentially the larger k gets.
    """
    b = basic_is_prime(n, K=100)
    if b is not None:
        return b
    m = n - 1
    s = 0
    while m % 2 == 0:
        s += 1
        m //= 2
    liars = set()
    get_new_x = lambda: random.randint(2, n - 1)
    while len(liars) < k:
        x = get_new_x()
        while x in liars:
            x = get_new_x()
        xi = pow(x, m, n)
        witness = True
        if xi == 1 or xi == n - 1:
            witness = False
        else:
            for __ in xrange(s - 1):
                xi = (xi ** 2) % n
                if xi == 1:
                    return False
                elif xi == n - 1:
                    witness = False
                    break
            xi = (xi ** 2) % n
            if xi != 1:
                return False
        if witness:
            return False
        else:
            liars.add(x)
    return True
```

In the code above, basic_is_prime simply checks the number against divisibility by the first K primes. Here's the source for it, which relies on a list primes_list.less_than_hundred_thousand having been set up already.

```python
def basic_is_prime(n, K=-1):
    """Returns True if n is a prime, and False if it is a composite, by
    trying to divide it by two and the first K odd primes. Returns None
    if the test is inconclusive."""
    if n % 2 == 0:
        return n == 2
    for p in primes_list.less_than_hundred_thousand[:K]:
        if n % p == 0:
            return n == p
    return None
```

Another part of the algorithm above that we get for free in Python, but which is worth studying a bit further, is the pow(x, m, n) function, which calculates $x^m \pmod{n}$ efficiently. How? The naive algorithm would simply multiply $x$ by itself $m$ times, but Python's pow function is smarter than that and uses a method known as "exponentiation by squaring", which runs in $O(\log m)$ time. The basic idea is simple:

$$x^m = \begin{cases} (x^{m/2})^2 & \text{if } m \text{ is even} \\ x \cdot (x^{(m-1)/2})^2 & \text{if } m \text{ is odd.} \end{cases}$$

This leads to a divide-and-conquer algorithm. Here is what a simple iterative version of it looks like in Python. In practice, use the built-in pow(x, m, n) function instead of this, since it is going to be faster.

```python
def power(x, m, n):
    """Calculate x^m modulo n using O(log(m)) operations."""
    a = 1
    while m > 0:
        if m % 2 == 1:
            a = (a * x) % n
        x = (x * x) % n
        m //= 2
    return a
```

Now, let's try generating some primes using this test instead:

```
>>> for b in [2 ** i for i in xrange(3, 12)]:
...     print "Generating %d-bit prime: " % b
...     print generate_random_prime(b, rabin_miller_is_prime)
...
Generating 8-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.000s
503
Generating 16-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.001s
129757
Generating 32-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.002s
5483745739
Generating 64-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.003s
20207866191915625427
Generating 128-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.008s
378992665877954075225130181634050346399
Generating 256-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.013s
169109964058021108360022059334779755034470159052847134197148573143017688247123
Generating 512-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.062s
25984244777989501997155542913156869882081307629782603141084479047977114447740881200477139623732813907010625928930923928392846989307780602640371132948572817
Generating 1024-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
 - Finished 'generate_random_prime', execution time = 0.908s
243882388542316536634494263003432392747710195101405655638525548586460245284788605889744383018972173790715282433484881304387337823469782781741768460522379459266510277700837051129455157916910581124086417098371511903523124351405930407269823757127633787628818261404275424247249951449105500355258546878369871013997
Generating 2048-bit prime:
 - Running 'generate_random_prime' on Aug 28 2013 - 20:32:33
 - Finished 'generate_random_prime', execution time = 5.533s
39776036539901478897149714325867816317685970320648758660437997010070791355113972360487704959106715808827982189462166321507780789030474348830734451482712904952797349414199357350610796728394943895699307545779410349704150772133800816078777678410296457461265358640627126060911629419050162680462367824593593599185377995313654142814587795430128247012676213933608425620507345628348437045547639588191826106308640857920441463747225809423632900984055482095205841726987368122970422928048300140432181860090536264169603189938581906988427448587564353276787633192654461220520187517077846414138298337591988489586417244684890216725019
```

As you can see, primes of up to $512$ bits are generated in negligible time. Even a $2048$-bit prime is generated in about five seconds, which is quite impressive for a rather simple Python implementation.

Finally, let's put it all together to implement RSA, for which we will need an implementation of the extended greatest common divisor algorithm to determine the modular multiplicative inverse of a number.

```python
import fractions
import random

def extended_gcd(a, b):
    """Returns pair (x, y) such that xa + yb = gcd(a, b)"""
    x, lastx, y, lasty = 0, 1, 1, 0
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x, lastx = lastx - q * x, x
        y, lasty = lasty - q * y, y
    return lastx, lasty

def multiplicative_inverse(e, n):
    """Find the multiplicative inverse of e mod n."""
    x, y = extended_gcd(e, n)
    if x < 0:
        return n + x
    return x

def rsa_generate_key(bits):
    p = generate_random_prime(bits / 2, rabin_miller_is_prime)
    q = generate_random_prime(bits / 2, rabin_miller_is_prime)
    # Ensure q != p, though for large values of bits this is
    # statistically very unlikely
    while q == p:
        q = generate_random_prime(bits / 2, rabin_miller_is_prime)
    n = p * q
    phi = (p - 1) * (q - 1)
    # Here we pick a random e, but a fixed value for e can also be used.
    while True:
        e = random.randint(3, phi - 1)
        if fractions.gcd(e, phi) == 1:
            break
    d = multiplicative_inverse(e, phi)
    return (n, e, d)

def rsa_encrypt(message, n, e):
    return modular.power(message, e, n)

def rsa_decrypt(cipher, n, d):
    return modular.power(cipher, d, n)
```

And let's look at it in action:

```
>>> n, e, d = rsa_generate_key(8)
>>> print n, e, d
589 299 419
>>> message = 123
>>> cipher = rsa_encrypt(message, n, e)
>>> print cipher
309
>>> decrypted_message = rsa_decrypt(cipher, n, d)
>>> print decrypted_message
123
```

And same thing but with a much larger key, giving a more secure encryption scheme:

```
>>> n, e, d = rsa_generate_key(64)
>>> print n, e, d
49386969680342799959 39516096781065686051 14613413672463512651
>>> message = 123
>>> cipher = rsa_encrypt(message, n, e)
>>> print cipher
19163520560172629876
>>> decrypted_message = rsa_decrypt(cipher, n, d)
>>> print decrypted_message
123
```

And there it is: a very basic implementation of RSA that is still capable of handling rather large keys. Notice that the encryption and decryption algorithms are basically just modular exponentiation. Because of its importance to RSA's efficiency, modular exponentiation has been studied quite a bit in applied cryptography. A more efficient implementation would use Montgomery reduction, which I might go into in a future post.