Grade-school Long Multiplication with Strings

From time to time I enjoy a quick exercise on HackerRank — especially the Algorithms section. After all, those pesky Google interviewers are always asking about Quicksort! 😏

I particularly enjoyed the “extra long factorials” exercise, since it demands some deeper thought if you don’t want to use a BigNum library.

The Setup

Here’s the problem:

You are given an integer N. Print the factorial of this number.

That’s not too bad.

5! would be 5 x 4 x 3 x 2 x 1 = 120. Can do.

Input consists of a single integer N, where 1 <= N <= 100.

Oh.

10! is 3,628,800 so I can see this scales quickly.

For an input of 25, you would print 15511210043330985984000000.

Very quickly.

Factorials of N > 20 can’t be stored even in a 64 bit long long variable.

True.

The max unsigned value for a 64 bit integer would be 18,446,744,073,709,551,615 (2⁶⁴ − 1).
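To see that limit concretely, here is a minimal sketch that computes a factorial in a 64-bit unsigned integer and reports when the next multiplication would no longer fit. The helper name factorial_u64 is my own invention for illustration and is not part of the exercise.

```c
#include <stdint.h>

/* Computes n! into *out using only a 64-bit unsigned integer.
 * Returns 1 on success, or 0 if the product would exceed UINT64_MAX
 * (in which case *out is left untouched).
 * factorial_u64 is a hypothetical helper, not part of the exercise. */
int factorial_u64(unsigned n, uint64_t *out) {
    uint64_t result = 1;
    for (unsigned i = 2; i <= n; i++) {
        /* result * i overflows exactly when result > UINT64_MAX / i */
        if (result > UINT64_MAX / i) {
            return 0;
        }
        result *= i;
    }
    *out = result;
    return 1;
}
```

With this sketch, factorial_u64(20, &v) succeeds and leaves 2,432,902,008,176,640,000 in v, while factorial_u64(21, &v) returns 0 because 21! is already larger than the 64-bit maximum.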

Big integers must be used for such calculations. Languages like Java, Python, Ruby etc. can handle big integers, but we need to write additional code in C/C++ to handle huge values. We recommend solving this challenge using BigIntegers.

Challenge Accepted

I choose C. I also choose not to use BigIntegers.

My intention in solving the problem was to avoid reinventing the entire BigNum wheel: I figured I’d just implement the Standard Algorithm (grade-school long multiplication) using string manipulations, so I could support an absurdly large factorial like 999!.

Strings and Math

Why string manipulations?

My wife is finishing her master’s degree in Math Education Leadership. I was once dubious about our American education system and the usefulness of Common Core (then I married a teacher and got a significant on-the-ground look), so it’s been an amazing opportunity to learn about the challenges students and teachers face when learning and teaching math.

Even something as “intuitive” and “natural” as the positional decimal numeral system we use in daily life (9+1 = 10, 99+1 = 100, …) eluded history’s greatest minds for thousands of years. Obviously there’s something special about it, and yet for students it often becomes rote memorization with little explanation of the reasoning behind it. Those students then become teachers.

I’ll never forget the night Amy came home from one of her graduate classes and told me all about how they’d learned to convert numbers between arbitrary bases: my non-programmer wife came home understanding binary, octal, and hexadecimal and could manipulate them with confidence!

It all came down to having a finite set of symbols, and zero.

Say you have the digits 0, 1, 2, 3, and 4 and nothing else. How do you count past 4?

Base 5 | Base 10
0 | 0
1 | 1
2 | 2
3 | 3
4 | 4
10 | 5
11 | 6
12 | 7
13 | 8
14 | 9
20 | 10

All we’re doing is manipulating symbols — when we run out of unique symbols we shift once to the left and insert a zero.
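The same symbol shuffling can be sketched in C. This is only an illustration under my own assumptions: the function name to_base is hypothetical, and it handles bases 2 through 10 only, so the symbols 0 to 9 are enough.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the base-`base` representation of `value` into `out` as a
 * null-terminated string, using only the symbols 0-9 (base must be 2..10).
 * Returns the number of digits written, or 0 if the arguments are invalid
 * or the buffer is too small. to_base is a hypothetical helper. */
size_t to_base(uint32_t value, unsigned base, char *out, size_t out_size) {
    char reversed[32];  /* a 32-bit value needs at most 32 digits (base 2) */
    size_t count = 0;

    if (base < 2 || base > 10 || out_size == 0) {
        return 0;
    }

    /* Peel off the rightmost symbol; whatever remains moves one place over. */
    do {
        reversed[count++] = (char)('0' + value % base);
        value /= base;
    } while (value > 0);

    if (count + 1 > out_size) {
        return 0;
    }

    /* Digits were produced right to left, so reverse them into `out`. */
    for (size_t i = 0; i < count; i++) {
        out[i] = reversed[count - 1 - i];
    }
    out[count] = '\0';
    return count;
}
```

Calling to_base(5, 5, buf, sizeof buf) leaves "10" in buf and to_base(10, 5, buf, sizeof buf) leaves "20", matching the last rows of the table.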

Confronted with this problem, I asked myself: “why not try solving it like a human using Base 10 would?”

Thinking Like a Human

A human, when confronted with a multiplication problem, is generally able to solve it either with a tool like a calculator or by hand using the long multiplication method (also called the Standard Multiplication Algorithm).

When performed by hand it’s possible to multiply any two integers together, regardless of length, as long as there’s enough paper to store the work on — just like computer memory.

However, this presupposes certain arithmetic knowledge, particularly a memorized times table.

Multiplication Table in Base 10 (Decimal)

In reality the multiplication table we memorize as children is just a lookup table: a shortcut that saves us from working out each single-digit product from scratch. (Reminder: multiplication is not actually repetitive addition.)

Armed with a multiplication table for digits 0 through 9 we can perform multiplication on integers of any length using long multiplication:

Long Multiplication With Explanation of Place Value and Zero

Algorithmic Steps for Long Multiplication

Here’s the Wikipedia pseudo-code for performing long multiplication:

// Operands containing rightmost digits at index 1
multiply(a[1..p], b[1..q], base)
    // Allocate space for result
    product = [1..p+q]
    // for all digits in b
    for b_i = 1 to q
        carry = 0
        // for all digits in a
        for a_i = 1 to p
            product[a_i + b_i - 1] += carry + a[a_i] * b[b_i]
            carry = product[a_i + b_i - 1] / base
            product[a_i + b_i - 1] = product[a_i + b_i - 1] mod base
        // last digit comes from final carry
        product[b_i + p] += carry
    return product

We’re going to do two important things with our string version:

1. Treat single digit multiplication as a native operation in C (e.g. convert chars to ints and multiply) as an analogy to the advantage humans have in knowing the multiplication table

2. Hold the results of operations in strings
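As a rough illustration of those two ideas together, here is one way the pseudocode above might be translated to C for base 10, with operands and result held as digit strings. This is a sketch and not the solution I submitted; the name multiply_strings is hypothetical, and it assumes both inputs are non-empty strings containing only the characters 0 to 9.

```c
#include <stdlib.h>
#include <string.h>

/* Multiplies two non-negative decimal strings (digits only, no sign).
 * Returns a newly malloc'd string the caller must free, or NULL if the
 * allocation fails. multiply_strings is a hypothetical name. */
char *multiply_strings(const char *a, const char *b) {
    size_t p = strlen(a);
    size_t q = strlen(b);
    size_t len = p + q;              /* the product never needs more digits */
    char *product = malloc(len + 1);
    if (product == NULL) {
        return NULL;
    }
    memset(product, '0', len);
    product[len] = '\0';

    /* Walk b from its rightmost digit, one row per digit, as on paper. */
    for (size_t bi = 0; bi < q; bi++) {
        int carry = 0;
        int b_digit = b[q - 1 - bi] - '0';
        for (size_t ai = 0; ai < p; ai++) {
            int a_digit = a[p - 1 - ai] - '0';
            size_t pos = len - 1 - (ai + bi);   /* column in the result */
            int sum = (product[pos] - '0') + carry + a_digit * b_digit;
            carry = sum / 10;
            product[pos] = (char)('0' + sum % 10);
        }
        /* The final carry lands one column to the left of this row. */
        product[len - 1 - (bi + p)] += (char)carry;
    }

    /* Strip leading zeros but keep a single "0" for a zero product. */
    size_t first = 0;
    while (first + 1 < len && product[first] == '0') {
        first++;
    }
    memmove(product, product + first, len - first + 1);
    return product;
}
```

For example, multiply_strings("123", "45") returns "5535" and multiply_strings("999", "999") returns "998001". The only arithmetic done on ints is the single-digit product and its carry, which is the part a human would read off the times table.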

First Pass

HackerRank doesn’t really care how well formatted or how well documented your code is. It just runs tests against it for correctness.

The problem with this is that you’re tempted to just blaze through the problem and end up with something like this implementation:

#include <math.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
    // Support input from 0 to 999
    char n[4] = {0};
    scanf("%s",n);

    // 999! has 2565 digits, plus a null terminator
    char expansion[2566] = {0};

    // The initial value will be the input (ex: 999 x 998 x 997
    // so fill with chars for 999)
    for (int i = 0; i < strlen(n); i++) {
        expansion[i] = n[i];
    }

    // Start at one less than the input (ex: 998), and as
    // long as we're greater than 1 keep decrementing
    // (we don't really need to multiply by 1)
    int start = atoi(n)-1;
    for (int i = start; i > 1; i--) {
        sprintf(n, "%d", i);
        int expansion_d = strlen(expansion);
        int n_d = strlen(n);
        int total_d = expansion_d + n_d;

        int totals[total_d];
        for (int x = 0; x < total_d; x++) {
            totals[x] = 0;
        }

        int reset = 0;
        for (int j = n_d - 1; j >= 0; j--) {
            int p = total_d - 1 - reset++;
            for (int k = expansion_d-1; k >= 0; k--) {
                int top = expansion[k] - '0';
                int bottom = n[j] - '0';
                totals[p--] += top*bottom;
            }
        }

        for (int p0 = total_d - 1; p0 >= 0; p0--) {
            if (totals[p0] >= 10) {
                int carry = totals[p0] / 10;
                totals[p0] %= 10;
                totals[p0 - 1] += carry;
            }
        }

        int notzero = 0;
        int position = 0;
        for (int p0 = 0; p0 < total_d; p0++) {
            if (!notzero && totals[p0] > 0) {
                notzero = 1;
            }

            if (notzero) {
                expansion[position] = totals[p0] + '0';
                position++;
            }
        }
    }

    printf("%s\n", expansion);

    return 0;
}

999! turns out to be:

40238726007709377354370243392300398571937486421071463254379991042993851239862902059204420848696940480047998861019719605863166687299480855890132382966994459099742450408707375991882362772718873251977950595099527612087497546249704360141827809464649629105639388743788648733711918104582578364784997701247663288983595573543251318532395846307555740911426241747434934755342864657661166779739666882029120737914385371958824980812686783837455973174613608537953452422158659320192809087829730843139284440328123155861103697680135730421616874760967587134831202547858932076716913244842623613141250878020800026168315102734182797770478463586817016436502415369139828126481021309276124489635992870511496497541990934222156683257208082133318611681155361583654698404670897560290095053761647584772842188967964624494516076535340819890138544248798495995331910172335555660213945039973628075013783761530712776192684903435262520001588853514733161170210396817592151090778801939317811419454525722386554146106289218796022383897147608850627686296714667469756291123408243920816015378088989396451826324367161676217916890977991190375403127462228998800519544441428201218736174599264295658174662830295557029902432415318161721046583203678690611726015878352075151628422554026517048330422614397428693306169089796848259012545832716822645806652676995865268227280707578139185817888965220816434834482599326604336766017699961283186078838615027946595513115655203609398818061213855860030143569452722420634463179746059468257310379008402443243846565724501440282188525247093519062092902313649327349756551395872055965422874977401141334696271542284586237738753823048386568897646192738381490014076731044664025989949022222176590433990188601856652648506179970235619389701786004081188972991831102117122984590164192106888438712185564612496079872290851929681937238864261483965738229112312502418664935314397013742853192664987533721894069428143411852015801412334482801505139969429015348307764456909907315243327828826986460278986432113908350621709500259738986355
4277196742822248757586765752344220207573630569498825087968928162753848863396909959826280956121450994871701244516461260379029309120889086942028510640182154399457156805941872748998094254742173582401063677404595741785160829230135358081840096996372524230560855903700624271243416909004153690105933983835777939410970027753472000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

WolframAlpha agrees about what 999! evaluates to, which is also where the magic number 2566 comes from.

The biggest problem with this code is not the lack of optimization, but the difficulty in proving correctness without executing it. Reading it even a few hours after writing left me scratching my head as I retraced my logic for performing the factorial in place.

Lesson: even if they’re just throwaway toy scripts and programming exercises, there’s no excuse for failing to write self-documenting code and provide useful comments.

An Explanation

1. We’re going to need strlen() and atoi() if we really don’t want to reinvent the wheel.

#include <string.h>

#include <stdlib.h>

2. The magic numbers 4 and 2566 could be replaced with dynamically allocated char buffers if we wanted to support larger factorials: the buffer holding the expanded factorial would just need to grow whenever the digits exceeded the current allocation.

3. We’re performing the multiplication in-place by storing the initial value of n as chars in the expansion string, then multiplying that string by each successively decremented value (n-1, n-2, and so on down to 2).

4. The maximum number of digits needed to store the result is recalculated on each pass through the outer loop. In the worst case it is the sum of the digit counts of the two operands: a p-digit number is less than 10^p and a q-digit number is less than 10^q, so their product is less than 10^(p+q) and has at most p+q digits. Example: 999 x 999 = 998,001 (3 digits + 3 digits = 6 digits).

5. We use an int array totals to store temporary results (akin to human multiplication table results being written on paper)

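6. The + '0' and - '0' you see convert between digit characters and their integer values. The ASCII code for the character '0' is 48 and the characters '0' through '9' are consecutive, so subtracting '0' from a digit char yields its int value, and adding '0' to an int from 0 to 9 yields the char to write into the string.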

7. Before doing any “carrying”, this method calculates the raw product of each top and bottom digit pair, from right to left, and accumulates the results in the totals array.

8. A subsequent pass handles the “carrying”: any column holding 10 or more keeps its ones digit and passes the rest to the column on its left.
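To make the two passes from the last few points easier to follow, here they are pulled out as separate functions and run on a small case. This is a simplified sketch of the same idea, not a drop-in replacement for the listing above; the names raw_products and carry_pass are hypothetical, and the inputs are assumed to be digit-only strings.

```c
#include <string.h>

/* Pass 1: accumulate raw digit-by-digit products into totals[], which must
 * hold strlen(top) + strlen(bottom) ints. No carrying happens here.
 * raw_products and carry_pass are hypothetical names for illustration. */
void raw_products(const char *top, const char *bottom, int *totals) {
    size_t top_d = strlen(top);
    size_t bottom_d = strlen(bottom);
    size_t total_d = top_d + bottom_d;

    for (size_t x = 0; x < total_d; x++) {
        totals[x] = 0;
    }
    for (size_t j = 0; j < bottom_d; j++) {
        for (size_t k = 0; k < top_d; k++) {
            /* - '0' turns a digit character into its integer value */
            int t = top[top_d - 1 - k] - '0';
            int b = bottom[bottom_d - 1 - j] - '0';
            /* j + k is the column, counted from the right */
            totals[total_d - 1 - (j + k)] += t * b;
        }
    }
}

/* Pass 2: sweep right to left, moving everything above the ones digit
 * of each column into the column on its left. */
void carry_pass(int *totals, size_t total_d) {
    for (size_t p = total_d; p-- > 1; ) {
        totals[p - 1] += totals[p] / 10;
        totals[p] %= 10;
    }
}
```

For 123 x 45 the first pass leaves totals as {0, 4, 13, 22, 15}, and the carry pass turns that into {0, 5, 5, 3, 5}. Skipping the leading zero and adding '0' to each entry gives the string "5535".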