Weird Binary Imagine you are in charge of humanities search for extra terrestrials program. One day, after scanning the skies, you find a signal. The signal consists of a series of pulses, and after a little bit of work, you discover that they form an image 197×199 in size. The image contains what looks like simple arithmetic. However, no normal number-system seems to work. 101110 + 100100 = 100010 Welcome to the world of "weird binary". So what exactly is going on? It turns out that even though a number system can be based on two different symbols, it doesn't have to be the traditional binary system we all know and love. These new binary number systems have differing properties, and come with their own strengths and weaknesses. So what defines a binary number system? Firstly we require two symbols. Let us use "0" and "1" for simplicity. We also require a place-based number system, which we will assume operates in the normal right-to-left increasing powers of a single base. Note that we could imagine a left-to-right system, but that just corresponds to using the inverse of a right-to-left base, so doesn't add anything new. Once we have this, we can define a simple multiplication table; anything multiplied by zero is zero, and 1×1=1. Similarly, addition by zero is idempotent. Thus only 1+1=2 needs to be defined, since "2" doesn't fit in our binary system. By choosing different representations of the number 2, we can define different binary number systems. Thus by enumerating these choices we can see which weird binary systems exist. Before we will proceed, it is nice to define a format for such numbers. Unfortunately, the complexity of the addition operation will be quite large. This is due to the carries being unlike those of normal binary numbers. Thus, to simplify the carry calculation, we will provide an integer per bit. Such a layout would look like the following in C: static void print_wb(unsigned *x, int num_size) { int i; char c[2] = {' ', '1'}; for (i = num_size - 1; i > 0; i--) { printf("%c", c[x[i]]); if (x[i] == 1) c[0] = '0'; } printf("%c

", x[0] + '0'); } static void clear_wb(unsigned *x, int num_size) { memset(x, 0, num_size * sizeof(unsigned)); } static void copy_wb(unsigned *x, unsigned *y, int num_size) { memcpy(y, x, num_size * sizeof(unsigned)); } static void carry_wb(unsigned *x, int num_size, unsigned *add_spec, int add_spec_size) { int i, j; unsigned c; for (i = 0; i < num_size; i++) { if (x[i] <= 1) continue; c = x[i] / 2; x[i] &= 1; for (j = 0; (j < add_spec_size) && (j < num_size - 1 - i); j++) { x[j + i + 1] += c * add_spec[j]; } } } static void add_wb(unsigned *x, unsigned *y, int num_size, unsigned *add_spec, int add_spec_size) { int i; for (i = 0; i < num_size; i++) { x[i] += y[i]; } carry_wb(x, num_size, add_spec, add_spec_size); } static void mult(unsigned *x, unsigned *y, int num_size, unsigned *add_spec, int add_spec_size) { int i, j; unsigned t[num_size]; for (i = 0; i < num_size; i++) { t[i] = x[i]; x[i] = 0; } for (i = 0; i < num_size; i++) { if (t[i]) { for (j = 0; j < num_size; j++) { if (j + i > num_size) break; x[j + i] += y[j]; } } } carry_wb(x, num_size, add_spec, add_spec_size); } The above has the form of the weird binary encoded in the value of the number 2. This specification is stored within the add_spec array, of size add_spec_size . So long as the numbers aren't so large that the carries don't overflow an unsigned integer, the above can be used to add and multiply arbitrary precision weird binary numbers. To create weird binary numbers, we can use induction. We know the value of one, and can then add it multiple times to obtain any natural number we require. static void create_wb_natural(unsigned *x, int num_size, unsigned *add_spec, int add_spec_size, int n) { unsigned temp[num_size]; unsigned temp2[num_size]; /* Clear values */ clear_wb(x, num_size); clear_wb(temp, num_size); temp[0] = 1; /* Log2(n) steps */ while (n) { /* Is the current bit set? */ if (n & 1) add_wb(x, temp, num_size, add_spec, add_spec_size); /* Move to the next bit */ n = n / 2; /* Double the size of the thing to add */ copy_wb(temp, temp2, num_size); add_wb(temp, temp2, num_size, add_spec, add_spec_size); } } These C functions assume that the symbolic representation for the number two lies solely to the left of the radix point. This means that carries only propagate to the left. Allowing carries to propagate to the left and right simultaneously makes it rather difficult to construct an addition routine. (In effect it is possible to make a linear cellular automata this way, and such things are subject to the halting problem.) Of course, some cases are solvable. i.e. The case where 1+1=10.1 yields b+1/b=2, and thus b=1. This base-one solution also appears in other situations, see the 1+1=11 case below. 1+1=0 The first possibility results in a rather boring number system. In this system, addition works like the xor operator. Multiplication still mixes number places together. However, no carries take place, so all number positions operate as if they are alone with the "addition" step. This system is somewhat useful in cryptography, and Intel has added the PCLMULQDQ instruction to perform this operation. 1+1=1 This number system is even less interesting, with addition acting like a logical-or. Once you have a 1, you can never remove it. This means that negative numbers cannot exist in this system, since you can never add two non-zero numbers to get zero. 1+1=10 This is the traditional binary number system used in computers. By using twos-complement arithmetic, we can represent negative numbers. All normal mathematical operations work as you would expect in this system, (unlike the previous two systems). However, as you can see, this isn't the end of the story, as several more interesting binary systems exist. 1+1=x1 Any other binary system that defines two as ending with a 1 is problematic. In such systems, you cannot form minus one. (In other words, the equation 1 + y = 0 has no solution.) This severely restricts the usefulness of such systems. However, there is one system which is of note. 1+1=11 describes the "stick counting" system of natural numbers. The more sticks you have, the larger the number. The total number of sticks exactly corresponds to the number you have. Another way of looking at this system, is saying that it is "base 1". Unfortunately, working in base 1 is extremely inefficient, as the exponential savings in symbol compression don't happen. i.e. 999 only requires three symbols to write in decimal... but would require 999 symbols in base 1. Since all the interesting systems will define two ending with a zero symbol, we can now evaluate what negative numbers are. This can be done by first calculating what minus one is, and then multiplying that by the correct natural number. static void find_m1(unsigned *m1, int num_size, unsigned *add_spec, int add_spec_size) { int i, j; unsigned t[num_size]; clear_wb(m1, num_size); clear_wb(t, num_size); x[0] = 1; for (i = 0; i < add_spec_size; i++) { if (i + 1 > num_size) break; t[i + 1] = add_spec[i]; } for (j = 1; j < num_size; j++) { if (!t[j]) continue; m1[j] = 1; for (i = 0; i < add_spec_size; i++) { if (i + 1 + j > num_size) break; t[i + j + 1] += add_spec[i]; } carry_wb(t, num_size, add_spec, add_spec_size); } } static void create_wb_integer(unsigned *x, int num_size, unsigned *add_spec, int add_spec_size, int n, unsigned *m1) { /* Are we a natural number? */ if (n >= 0) return create_wb_natural(x, num_size, add_spec, add_spec_size, n); /* Negative number */ create_wb_natural(x, num_size, add_spec, add_spec_size, -n); /* Multiply by minus one */ mult(x, m1, num_size, add_spec, add_spec_size); } 1+1=100 Using the above, we have b2=2, where b is the base. Thus, we find that the bases that satisfy this system are ±√2. This system thus can exactly store values proportional to the square root of two. This comes at a cost, numbers written in this form are twice as large as in normal binary. Normal binary can approximate the square root of two by including digits below the radix point. By using enough digits, the error can be made as small as needed. This means that this system is only really useful if exact calculations are required, and isn't really useful. 1+1=110 This system is defined by b2+b=2, and thus b=1, or b=-2. It turns out that the b=1 solution is spurious, leaving the base = -2, or "negabinary" number system. This number system is fairly well known. See Knuth's Art of Computer Programming, Seminumerical Algorithms for a discussion of its properties. Since this case is very similar to normal binary, (it is only the odd positions that vary), there exist some fast algorithms to convert from binary to negabinary and back. There are also fast ways to add, multiply and divide these numbers. Such numbers are more uniform than normal binary. No twos-complement trick is required to represent negative numbers. Thus there is no longer a difference between signed and unsigned multiplication. A table of the representation of such numbers is: -16: 110000 -15: 110001 -14: 110110 -13: 110111 -12: 110100 -11: 110101 -10: 1010 -9: 1011 -8: 1000 -7: 1001 -6: 1110 -5: 1111 -4: 1100 -3: 1101 -2: 10 -1: 11 0: 0 1: 1 2: 110 3: 111 4: 100 5: 101 6: 11010 7: 11011 8: 11000 9: 11001 10: 11110 11: 11111 12: 11100 13: 11101 14: 10010 15: 10011 16: 10000 1+1=1000 This system is somewhat similar to the case where 1+1=100. It is a "fat binary" system, where there are multiple representations for the same numbers. In this case, the base is the cubic root of two, instead of the square root. The result is that numbers take three times as much space as normal. Such a system is only useful if you need exact arithmetic with such numbers. 1+1=1010 This case has b3+b=2, and has the following three solutions b=1, b=(-1±i√7)/2. The first base-one case is again spurious, leaving the two bases that are complex conjugates of each other. These bases allow one to calculate complex arithmetic using a single binary string. (The previous case also allowed complex solutions, but symmetry prevented them from being usefully different from the real case.) Integers in this representation look like: -16: 101110000 -15: 101110001 -14: 101111010 -13: 101111011 -12: 101010100 -11: 101010101 -10: 101011110 -9: 101011111 -8: 101011000 -7: 101011001 -6: 100010 -5: 100011 -4: 111100 -3: 111101 -2: 110 -1: 111 0: 0 1: 1 2: 1010 3: 1011 4: 11100100 5: 11100101 6: 11101110 7: 11101111 8: 11101000 9: 11101001 10: 11100110010 11: 11100110011 12: 11001100 13: 11001101 14: 11100010110 15: 11100010111 16: 11100010000 Notice how the length of the numbers doesn't increase monotonically as they increase away from zero. This is due to the fact that the boundary of numbers of a given symbol length in this system is a fractal on the complex number plane: 1+1=1100 This case has b3+b2=2, resulting in b = 1, -1±i. Again, the base one solution is spurious. The resulting systems are quite well known, and are often presumed as the only complex binary ones. As can be seen by the previous case, this isn't so. See the book "Hacker's Delight" for a description, and the inverse problem of solving for the number two in these systems. Integers in this representation are: -16: 1110100000000 -15: 1110100000001 -14: 1110100001100 -13: 1110100001101 -12: 11010000 -11: 11010001 -10: 11011100 -9: 11011101 -8: 11000000 -7: 11000001 -6: 11001100 -5: 11001101 -4: 10000 -3: 10001 -2: 11100 -1: 11101 0: 0 1: 1 2: 1100 3: 1101 4: 111010000 5: 111010001 6: 111011100 7: 111011101 8: 111000000 9: 111000001 10: 111001100 11: 111001101 12: 100010000 13: 100010001 14: 100011100 15: 100011101 16: 100000000 The symbol lengths for numbers in this system are increasing more rapidly than in the previous case. This is due to the fact that the previous system in effect de-weights imaginary numbers by a factor of the square root of seven. This weighting means that the previous system cannot exactly represent numbers such as the imaginary unit, i. In exchange for the larger verbosity, the current system doesn't suffer this problem. Again, it turns out that the symbol lengths do not increase monotonically away from zero. The numbers of a given length form a "Dragon Fractal": 1+1=1110 This final case where the number two is represented by four symbols isn't particularly interesting. The cubic equation for the base produces horribly messy solutions. Only when you would like exact arithmetic with such a base would this system be better than others already discussed. 1+1=10000 This is yet another fat-binary case, where the base is the fourth root of two. Other than that, it is uninteresting. 1+1=10010 This yields a quartic equation, which unfortunately produces messy solutions just like the 1+1=1110 case. It isn't useful. 1+1=10100 This gives the equation b4+b2=2, having the solution b=±1,±i√2. Ignoring the non-imaginary solutions yields the interesting case of a pure-imaginary base. (Which one of the two we choose doesn't matter due to symmetry under complex conjugates.) This number system is also mentioned in Knuth's Art of Computer Programming, in where he states that its disadvantage over the -1+i system is the fact that the complex unit is represented by an infinitely long non-repeating string. However, if we use floating-point, then the small error introduced is usually ignorable. This is especially true because this system has no trouble exactly representing integers: -16: 10100000000 -15: 10100000001 -14: 10100010100 -13: 10100010101 -12: 10100010000 -11: 10100010001 -10: 1000100 -9: 1000101 -8: 1000000 -7: 1000001 -6: 1010100 -5: 1010101 -4: 1010000 -3: 1010001 -2: 100 -1: 101 0: 0 1: 1 2: 10100 3: 10101 4: 10000 5: 10001 6: 101000100 7: 101000101 8: 101000000 9: 101000001 10: 101010100 11: 101010101 12: 101010000 13: 101010001 14: 100000100 15: 100000101 16: 100000000 This complex number system technically is also fractal, but the system of nested rectangles isn't particularly complicated: 1+1=10110 ... 1+1=100010 These systems are again like 1110 and 10010, except with quartic and quintic equations needing to be solved. The resulting solutions are complex functions containing several nested roots, and as as result do not make very interesting bases. 1+1=101110 This case has b5+b3+b2+b=2, which amongst its solutions has b=(1±i√7)/2. Thus this case is very similar to that of 1+1=1010, and has the integer table: -16: 110110000 -15: 110110001 -14: 101011110 -13: 101011111 -12: 101010100 -11: 101010101 -10: 101010010 -9: 101010011 -8: 101101000 -7: 101101001 -6: 101110110 -5: 101110111 -4: 1100 -3: 1101 -2: 1010 -1: 1011 0: 0 1: 1 2: 101110 3: 101111 4: 100100 5: 100101 6: 100010 7: 100011 8: 1111000 9: 1111001 10: 10111000110 11: 10111000111 12: 101100011100 13: 101100011101 14: 101100011010 15: 101100011011 16: 101100010000 And fractal boundary of: This is the base used by the aliens described in the introduction. Their crazy mathematical statement is simply showing that 2+4=6. Creating complex numbers in weird binary As was shown in the beginning, creating natural numbers is easy, induction can be used to create any number once we have the definition of the number two. Integers can be created once we can evaluate what minus one is, which again only depends on the definition of the number two. Unfortunately, complex numbers aren't so simple. There, we need to know which of the possibly many solutions for the base we are using in order to obtain a value for the imaginary unit, i. It turns out that the last number system described contains the identity i√7=1+b+b2, when b = (1+i√7)/2. Thus we can simply divide by the square root of seven, and use induction again to evaluate any pure imaginary integer. There is one other case though, the fact that the base includes the factor of 1/2 means that this system also has half-integer complex numbers. These can be created by adding or subtracting the base, and simplifying the problem into the integer case. A C function which does this is: static void b_to_wb(unsigned *x, int num_size, unsigned *add_spec, int add_spec_size, complex double n, unsigned *m1) { double r = creal(n); double im = cimag(n); double eps = 1e-8; unsigned temp[num_size]; unsigned imval[num_size]; /* Clear output */ clear_wb(x, num_size); clear_wb(imval, num_size); /* Are we not an integer? */ if (fabs(floor(r + eps) - r) > eps) { r += 0.5; im += 0.5*sqrt(7.0); x[1] = 1; } /* Create real part */ create_wb_integer(temp, num_size, add_spec, add_spec_size, (int) r, m1); add_wb(x, temp, num_size, add_spec, add_spec_size); /* Scale imaginary part */ im /= sqrt(7.0); imval[0] = 1; imval[1] = 1; imval[2] = 1; /* Create imaginary part */ create_wb_integer(temp, num_size, add_spec, add_spec_size, (int) im, m1); mult(temp, imval, num_size, add_spec, add_spec_size); /* Combine real and imaginary parts */ add_wb(x, temp, num_size, add_spec, add_spec_size); } For other weird binary bases, such as b=-1+i, the procedure is somewhat different, especially in that case, where i can be represented exactly. The reverse process, of converting a weird binary number back to binary is relatively simple. We just add the relevant powers of the base. C code that does this is: static complex double wb_to_b(unsigned *x, int num_size, complex double base) { int i; complex double b = base; complex double out = x[0]; for (i = 1; i < num_size; i++) { out += b * x[i]; b *= base; } return out; } Conclusion Are there any other interesting bases for weird binary? It turns out that no, there aren't. For a base to be interesting, its complex squared norm must be equal to two. A pure imaginary base with this is the b=±i√2, discussed above. Similarly, there is the pure real case b=±√2 also described. This leaves complex cases. Normalizing, we have: b = [±1±i√ (2n2-1)]/n If we evaluate b2, we have: b2=±[1-n2±i√ (2n2-1)] ×2/n2 We need 2/n2 to be in lowest common terms to be a multiple of 1/n. If this isn't the case, then we can never build a terminating expression that evaluates to be the number two. The denominators will increase without limit as we increase the powers of the base, and no cancellation will occur. To prevent that, n needs to be 1 or 2. If n is one, then we get b = ±1 ±i, and if n is two, we obtain the other pure complex solutions; b = (±1±i√7)/2. So binary number systems can be quite complicated. However, they unfortunately cannot represent quaternions or ocotonions due to the roots of polynomials being closed under the complex numbers. But still, as can be seen, there are several complex binary number systems, some more well known than others. Lockless

Articles

Weird Binary