(If you can’t read the code in this post on mobile, try this wrapped version instead, sorry)

Suppose you have two values. You know some bits are zero, some bits are one, other bits could be either. You can add together those two values to get another such partially-known value, determining as precisely as possible which bits can be known and which cannot. This operation is common in optimising compilers and program analysis, and it has some interesting properties that I’ll circle back to, but first here’s my implementation of the addition operation described above:

struct known_bits { unsigned ones; unsigned unknowns; }; struct known_bits kb_add(struct known_bits a, struct known_bits b) { struct known_bits result; unsigned x = a.ones + b.ones; result.unknowns = a.unknowns | b.unknowns | (x ^ (x + a.unknowns + b.unknowns)); result.ones = x & ~result.unknowns; return result; }

This function compiles to about 12 instructions, the assembly for which fits in a tweet. But let’s look at the theory behind it all.

Representation

A perhaps surprising amount of thought went into that simple-looking structure.

Each partially known value is a ternary number (with digits zero, one and unknown), so for a 32-bit number, there are 332 possible values. This requires more than one 32-bit value to represent, but can be comfortably represented with two or three 32-bit values. Initially, I stored “known ones” and “mask”, where the mask was set for all known bits. After figuring out most of my functions, I realised I could remove a significant number of bitwise nots by storing the unknown bits instead of known bits.

I could also store a third value, “known zeroes”, but this can be calculated if needed by the expression ~(v.ones | v.unknowns), and it seemed error prone to make the representation more redundant.

Theory

But that’s enough about code – let’s get back to some maths. Each value can be thought of as a lossy representation of a set of possible numbers. The size of this set is given by 2popcount(unknowns) (meaning if there are no unknown bits it represents a single value, or if there are two unknown bits it represents four possible values).

The representation is lossy. It would take 232 bits to represent an arbitrary subset of the 32-bit integers, and I’m only storing 64 bits. For example, consider the set {000, 011}. This is represented as 0UU (maximising for precision), which represents the superset {000, 001, 010, 011}.

In some sense, the most fundamental operation is the “union” operation:

r.ones = a.ones & b.ones; r.unknowns = a.unknowns | b.unknowns | (a.ones ^ b.ones);

This combines two sets, such that only bits which are known to be the same are known in the result.

The other fundamental operation is iteration. This is done by iterating over values up to the set size (2popcount(unknowns)), and distributing those bits to the unknown bit locations. This can be written efficiently using a technique described in Fabian Giesen’s Texture tiling and swizzling.

Using these two operations, we can define an algorithm for computing the maximally precise known bits after any operation. As a pseudocode example:

r = known_value(op(a.ones, b.ones)) for each possible a as A for each possible b as B r = union(r, known_value(op(A, B)))

This has an exponential runtime, so it’s entirely impractical, but I quite like it as a mathematical definition of the best possible result, and it can be useful for quickly testing better algorithms.

Some Bitwise Operations

Bitwise operations have relatively simple definitions. A nice example is or:

r.ones = a.ones | b.ones; r.unknowns = (a.unknowns | b.unknowns) & ~r.ones;

Any known ones in either input will be known ones in the output, known zeroes in both inputs will give a known zero output, and anything else is unknown. For example: 00011U | 0U1U1U = 0U111U.

Another example is xor.

r.unknowns = a.unknowns | b.unknowns; r.ones = (a.ones ^ b.ones) & ~r.unknowns;

Any unknowns in either input will be unknown, but any two known values are xored. For example: 00011U ^ 0U1U1U = 0U1U0U.

It’s worth remembering that xor is basically addition without the carry.

Addition Operation

Coming back to addition, there are a few things worth noting. Firstly, our addition is not associative, for example:

(0001 + 0001) + 000U = 0010 + 000U = 001U 0001 + (0001 + 000U) = 0001 + 00UU = 0UUU

The addition algorithm shown above comes from a few logical ideas. Firstly, any unknown bit in either input will be unknown in the output (because it can change the result bit in the same location). Secondly, some carry bits may also become unknown, possibly a whole run of them. These unknown bits can be determined by finding difference between the maximum possible sum and the minimum possible sum.

The minimum value in a given set is a.ones and the maximum is a.ones | a.unknowns. So the initial algorithm was:

struct known_bits kb_add(struct known_bits a, struct known_bits b) { struct known_bits result; result.unknowns = a.unknowns | b.unknowns | ((a.ones + b.ones) ^ ((a.ones | a.unknowns) + (b.ones | b.unknowns)); result.ones = (a.ones + b.ones) & ~result.unknowns; return result; }

Alternately, the maximum can be represented as (a.ones + a.unknowns) because no bits are set in both a.ones and a.unknowns, so the addition will never carry. This representation allows the following transformation to the function above:

(a.ones | a.unknowns) + (b.ones | b.unknowns) = (a.ones + a.unknowns) + (b.ones + b.unknowns) = (a.ones + b.ones) + a.unknowns + b.unknowns

The (a.ones + b.ones) expression now appears in three places, so we calculate it ahead of time to simplify things, giving the function shown at the top.

Questions

Can you prove or disprove the addition function? I can’t find a counter-example when exhaustively searching 8-bit values, but that’s not really a proof.

Can you compute the known-bits for multiplication? It’s easy to start getting useful results, but I would love to see a maximally precise less-than-exponential-time algorithm. Can it be proven that that’s impossible?

Can it be simplified further? John Regehr has had interesting results in program synthesis along these lines (which is what reminded me to post this – thanks!)

Update: if you want to play with implementing different operations, I’ve put some test code in a gist, based on a gist by @zwegner. It includes an implementation of subtraction.