Introduction

This is a follow-up to a post from several years ago describing a C++ implementation of arbitrary-precision unsigned integer arithmetic. This weekend I extended this to also support arbitrary-precision signed integers and rational numbers. Although this started as an educational tool, it now feels a bit more complete, and actually usable for the combinatorics and probability applications of the sort that are frequently discussed here.

I tried to stick to the original objectives of relatively simple and hopefully readable code, with stand-alone, header-only implementation, as freely available in the public domain as legally possible.

The code is available here, as well as on GitHub, in three header files:

#include "math_Unsigned.h" defines a math::Unsigned type representing the natural numbers with all of the sensible arithmetic, bitwise, and relational operators, essentially everything except bitwise one’s complement… although more on this shortly.

defines a type representing the natural numbers with all of the sensible arithmetic, bitwise, and relational operators, essentially everything except bitwise one’s complement… although more on this shortly. #include "math_Integer.h" defines an Integer type with a sign-and-magnitude implementation in terms of Unsigned , with all corresponding operators, including bitwise operators having two’s complement semantics assuming “infinite sign extension.”

defines an type with a sign-and-magnitude implementation in terms of , with all corresponding operators, including bitwise operators having two’s complement semantics assuming “infinite sign extension.” #include "math_Rational.h" defines a Rational type implemented in terms of Integer numerator and denominator.

This was a fun exercise; there were interesting challenges in developing each of the three classes. As discussed previously, the unsigned type handles the actual arbitrary-precision representation (implemented as a vector<uint32_t> of digits in base ), where division is by far the most complex operation to implement efficiently.

The implementation of the signed integer type is relatively straightforward… except for the bitwise operators. Assuming a sign-and-magnitude representation (using an Unsigned under the hood), it is an interesting exercise to work out how to implement bitwise and, or, xor, and not, so that they have two’s complement semantics even for negative operands. In the process, I had to add an “AND NOT” operator to the original underlying unsigned type (there is actually a built-in operator &^ for this in Go).

With this machinery in place, the rational type is the simplest to implement. The only wrinkle here is that a few additional constructors are needed, since user-defined conversions from the more primitive integral types (e.g., Rational from Integer , Integer from int32_t , etc.) are only implicitly applied “one level deep.”

Example application: Are seven riffle shuffles enough?

To test and demonstrate use of these classes, consider riffle shuffling a standard poker deck of 52 playing cards. How many shuffles are sufficient to “fully randomize” the deck? A popular rule of thumb, attributed to Bayer and Diaconis, is that seven riffle shuffles are recommended. (See a longer list of references here, along with some simpler counting arguments that at least six shuffles are certainly necessary.)

This recommendation is based on analysis of the Gilbert-Shannon-Reeds model of a single riffle shuffle, and of the total variation distance between probability distributions and , where is the distribution of arrangements of the deck after GSR riffle shuffles, and is the desired uniform distribution where every arrangement is equally likely. We can compute this total variation distance exactly as a function of the number of shuffles, as demonstrated in the following example code:

#include "math_Rational.h" #include <iostream> using namespace math; Integer factorial(int n) { Integer f = 1; for (int k = 1; k <= n; ++k) { f *= k; } return f; } Integer binomial(int n, int k) { if (0 <= k && k <= n) { return factorial(n) / factorial(k) / factorial(n - k); } else { return 0; } } Integer power(int base, int exp) { Integer n = 1; for (int k = 0; k < exp; ++k) { n *= base; } return n; } Integer eulerian(int n, int k) { Integer r = 0; for (int j = 0; j < k + 2; ++j) { r += (power(-1, j) * binomial(n + 1, j) * power(k + 1 - j, n)); } return r; } Rational total_variation_distance(int cards, int shuffles) { Rational q = 0; for (int r = 1; r <= cards; ++r) { Rational a = Rational( binomial((1 << shuffles) + cards - r, cards), Integer(1) << (cards * shuffles)) - Rational(1, factorial(cards)); q += (eulerian(cards, r - 1) * (a < 0 ? -a : a)); } return q / 2; } int main() { int cards = 52; for (int shuffles = 0; shuffles <= 15; ++shuffles) { std::cout << shuffles << " " << total_variation_distance(cards, shuffles).to_double() << std::endl; } }

The following figure shows the results. Total variation distance ranges from a maximum of one (between discrete distributions with disjoint support) to a minimum of zero, in this case corresponding to an exactly uniform distribution of arrangements of the deck.

We can see the sharp threshold behavior, where total variation distance transitions from near one to near zero over just a few shuffles, first dropping below 1/2 at seven shuffles.