The sample problem comes from a programming contest; it may just as well be given at an interview. Its core -- modular arithmetic with large numbers -- is at the heart of cryptographic libraries. The problem is: given a natural number n that can be as big as 10^6 , print the nine rightmost decimal digits of n! disregarding the trailing zeros. Although the final code is in C, we will be using Haskell for all development.

We chose Haskell because it lets us easily write down the well-defined, unambiguous specification:

spec :: Int -> Int spec = fromIntegral . lastnz9 . fact . fromIntegral fact :: Integer -> Integer -- n! fact n = product [1..n] lastnz9 :: Integral a => a -> a -- Last 9 decimal digits of a number, not counting trailing zeros lastnz9 n = drop0s n `mod` 10^9 drop0s n | (q,0) <- quotRem n 10 = drop0s q | otherwise = n

fact

drop0s

mod

The specification spec is perfect as the ideal program, clearly describing the intended result. It also runs. However, it is not something we want to run in real life: spec 10^6 will take a lot of time, and a whole lot of space. Indeed, 10^6! is about (2*10^5)^(10^6) , with at least 5 million decimal digits.

As the first step and with an eye towards for-loops in the final C code, we rewrite fact in the accumulator-passing style (the threaded accumulator corresponds to a mutable variable in C):

fact_accum' :: Int -> Int -> Integer -> Integer fact_accum' n i acc | i > n = acc | otherwise = fact_accum' n (i+1) (acc * fromIntegral i) fact_accum n = fact_accum' n 1 1

fact_accum_9' :: Int -> Int -> Int -> Int fact_accum_9' n i acc | i > n = acc | otherwise = fact_accum_9' n (i+1) (lastnz9 (acc * i)) fact_accum_9 n = fact_accum_9' n 1 1

fact_accum_9

spec

n

all (

-> spec n == fact_accum_9 n) [0..21]

True

The contest-running software rejects our solution as erroneous, without any details. After more and more testing, luckily we eventually stumble on the problem: although fact_accum_9 n gives the same answers as spec n for very many n , it deviates from the ideal for some special, isolated n , for example, 25 :

*FactLimit> spec 25 330985984 *FactLimit> fact_accum_9 25 80985984

fact_accum_9

spec

fact_accum

lastnz9

lastnz9

lastnz9 (i*y) === lastnz9 (i * lastnz9 y)

i

10^9

mod

lastnz9

i

10

drop0s

lastnz9

i

25

y

4

y

k*10^9 + 4r

k>=0

0 < 4r < 10^9

lastnz9 (25 * (k*10^9 + 4r)) === lastnz9 (100 * (k*10^9/4 + r)) === lastnz9 (k*10^9/4 + r) | === r if k is divisible by 4 | === lastnz9 (k*25*10^7 + r) otherwise lastnz9 (25 * lastnz9 (k*10^9 + 4r)) === lastnz9 (25 * 4r) === r

k

4

drop0s (fact 24) `div` 10^9

62044840173

Let us make a few observations:

The mod operation distributes over multiplication: (x*y) `mod` p === ((x `mod` p) * (y `mod` p)) `mod` p . If p=10^9 , we merely have to deal with the numbers up to 10^9 , which comfortably fit within 32-bit signed integers, with a 64-bit accumulator for the intermediate result of the multiplication. The problem in the earlier program was the subtle interaction of mod and drop0s . If n! has any factors of 5 , it has even more factors of 2 . That is, drop0s n! is divisible by 2 but not divisible by 5 . All zeros of n! come from the factors of 5 contributed by the multiplicands n , n-1 , n-2 ,... If we suppress the factors 5 during the accumulation, drop0s becomes redundant. We can take the full advantage of point 1. To elaborate on the last point: suppose we have already computed drop0s n! . By point 2, it is representable in the form 2^k * r , for some odd r not divisible by 5 . Suppose n+1 factors as 2^i * 5^j * s where s is likewise odd and not divisible by 5 . Then drop (n+1)! is 2^(k+i-j) * (r*s) . We are sure, by point 2, that k+i>j . It is enough to compute r , s and their product mod 10^9 , using the property of mod from point 1.

-- return (n',k) such that n == n'*f^k factors n f = go 0 n where go k n | (q,0) <- quotRem n f = go (k+1) q | otherwise = (n,k) modulus = 10^9 factl :: Int -> Int factl n = go 2 0 1 where go i twos acc | i > n = shifts acc twos go i twos acc = let (i1,k1) = factors i 2 (i2,k2) = factors i1 5 in go (i+1) (twos + k1 - k2) (acc * i2 `mod` modulus) -- compute (acc * 2^twos) `mod` modulus shifts acc 0 = acc shifts acc twos | twos >= 4 = shifts ((acc `shift` 4) `mod` modulus) (twos - 4) | otherwise = (shift acc twos) `mod` modulus

shifts

2^twos

mod

It is rather easy to transcribe the above algorithm in C (see below). The C code worked on first try and was accepted as the answer to the challenge.

Thus, tests -- and types -- are necessary and should be used, as a spot-check for corner cases and silly mistakes made in transcribing the idea into code. Useful the tests and the types as they are, for verification, they are not sufficient. They have to be complemented with other ways to be sure in the code, such as thinking about the problem mathematically. Proofs are not sufficient either and have to be complemented with tests to spot-check for realism of their assumptions and adequacy of their conclusions. All in all, there is no royal road -- or any road, it seems -- to the correct code, as there are none to an invention or discovery.