A visualization of a 2x2x2 downsample of a labeled dataset. This is what COUNTLESS 3D does.

Previously, I demonstrated a fully vectorized algorithm, COUNTLESS, that downsampled labeled images by finding the mode of 2x2 patches without counting pixel value frequencies. Here, I generalize COUNTLESS to 3D images using a 2x2x2 patch and to arbitrary downsample factors. COUNTLESS 2D worked by finding a majority pixel without computing its total frequency, determined by the presence or absence of two matching pixels within a 2x2 image patch. COUNTLESS 3D extends this logic by searching among eight voxels (three dimensional pixels) for matches of four pixels, then of three, and finally of two. COUNTLESS N generalizes this technique to any input size, though it rapidly becomes impractical. COUNTLESS 2D and COUNTLESS 3D can be seen as application-specific implementations of COUNTLESS 4 and COUNTLESS 8 respectively. COUNTLESS variants are implemented using only vectorized instructions, which makes them practical in programming languages like Python that have fast linear algebra libraries like Numpy but slow looping and branching statements. While COUNTLESS 2D can outperform naive algorithms, using COUNTLESS 3D for high throughput applications is not advisable, as it is much slower than the standard approach. However, in Python, it can be implemented without resorting to extra compilation steps and still provides reasonable performance.

Motivation

Figure 1. Volumetric images used in connectomics research, a field where we attempt to find the wiring diagrams of brains. Left: A 3D image formed from a stack of electron microscope scans of brain tissue. Right: A machine labeling of neurons in the left image by Kisuk Lee. This article concerns the type of image on the right.

While not commonly encountered by the average person, 3D images (volumetric images) are heavily used in biomedical imaging. A volumetric image can be constructed from a stack of 2D images acquired at regularly deepening intervals. MRI machines use magnets to non-invasively acquire images of brain slices, and cell biologists often use laser-powered microscopes to scan samples at different depths. The acquired images are arranged in a stack in sorted order to form the final image.

My tissue of interest is a brain, and the method for acquiring those images is to slice it very finely using a machine akin to a deli slicer called an ultramicrotome. The resulting slices are then imaged with an electron microscope and assembled into a stack to generate a volumetric image. The slices are large, ranging from tens of thousands to hundreds of thousands of pixels per side. By contrast, a high end 4K resolution television or computer monitor displays images only three to four thousand pixels per side, and often the other side is half as large.

In order to display these images using consumer hardware, it is common practice to downsample them, that is, to create a series of smaller summary images that fairly represent the underlying high resolution image. For instance, on Google Maps, the world is displayed as a single (or small number of) stitched images, but as you zoom in, higher resolution images are fetched and displayed for only the region of interest. With microscope images, we can downsample by averaging 2x2x2 patches to create an eight times smaller image, and do so iteratively until the image is sufficiently small. However, once we generate labels to describe which voxel belongs to which neuron, averaging is not an option, as the labels are not analog signals but discrete identifiers. The most suitable way to summarize them is to choose the most frequent value within a patch.
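For ordinary grayscale imagery, the 2x2x2 averaging reduction is straightforward to express. A minimal sketch in Numpy, assuming even dimensions (the helper name is illustrative):

```python
import numpy as np

def average_downsample_2x2x2(img):
    """Average each 2x2x2 patch, producing an 8x smaller image."""
    x, y, z = img.shape  # assumes all dimensions are even
    # split each axis into (pairs, 2) so every 2x2x2 patch gets its own axes
    patches = img.reshape(x // 2, 2, y // 2, 2, z // 2, 2)
    return patches.mean(axis=(1, 3, 5))
```

For labels, this exact operation fails: averaging neuron IDs 1 and 3 produces the unrelated ID 2, which is why the mode is the appropriate summary instead.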

The storage of downsamples incurs additional cost, though it exponentially declines with each additional mip level. For a recursive 2x2 reduction, the storage and transmission cost of a stack of downsamples is 33% over the cost of storing the full resolution layer.

LISTING 1: Size of an Infinite Stack of Downsamples

Let S = the size of all the downsamples
Let r = the reduction factor of a downsample (e.g. 4 for 2x2)

S = 1 + 1/r + 1/r^2 + 1/r^3 + …
S/r = 1/r + 1/r^2 + 1/r^3 + 1/r^4 + …
S - S/r = 1
S = r / (r - 1)

Therefore, the storage cost of a 2x2 downsample stack is at most 4/3 or 133% of the cost of the original image alone. The storage cost of a 2x2x2 downsample stack is at most 8/7 or 114.3% of the full resolution. For some use cases, this reduction can be enticing. For those so enticed, the question turns to how it might be achieved without counting.
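The closed form from Listing 1 is easy to sanity check numerically (a quick sketch; the function name is illustrative):

```python
def downsample_stack_size(r, levels=64):
    """Partial sum of 1 + 1/r + 1/r^2 + ..., which converges to r / (r - 1)."""
    return sum(r ** -i for i in range(levels))

print(downsample_stack_size(4))  # 2x2 reduction: approaches 4/3
print(downsample_stack_size(8))  # 2x2x2 reduction: approaches 8/7
```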

COUNTLESS 5 — A Small Extension of the 2D Problem

At its root, COUNTLESS 2D relied on the idea that if two labels in a set of four match, it was not necessary to consider any additional information to declare the mode of that patch. It’s natural to ask if a similar principle can be stated for a set of five labels, the smallest possible extension of this problem.

Figure 2. The seven cases of COUNTLESS 5. (a) all labels match (b) all but one label matches (c) three labels match versus two labels (d) three match but the other two are both different (e) two pairs and one different (f) one pair and three different pixels (g) all different

In the case of four labels, we found that if any two match, they immediately form either a majority or a tie, in which case a member of either of the tied groups can be selected arbitrarily. In the case of five labels, a match of two labels is no longer sufficient to declare a winner; a match of three, however, is. If three labels match, they always form a majority, with no ties possible. If there are no matches of three, then the mode will consist of matches of two. If no two labels match at all, then a label can be chosen arbitrarily.

Therefore, we must by some mechanism look for matches of three; if none exist, then matches of two; and if there are no matches of two, then, having thrown up our hands, select one convenient pixel. Searching for matches of three means checking every combination of three labels among the five, and likewise for matches of two.

LISTING 2: The Relevant Combinations of Five Labels

Let the five integer labels be denoted A, B, C, D, E

Combinations of three: 5C3 = 5! / (3!2!) = 10 combinations
ABC, ABD, ABE,
ACD, ACE,
ADE,
BCD, BCE,
BDE,
CDE

Combinations of two: 5C2 = 5! / (2!3!) = 10 combinations
AB, AC, AD, AE
BC, BD, BE
CD, CE
DE

Combinations of one: 5C1 = 5 combinations
A, B, C, D, E

To evaluate each combination, we can generalize the PICK(A,B) operator from COUNTLESS 2D to accept any number of arguments:

PICK(A,B,C, ..., N) := A if A == B == C == ... == N else 0 EQN. 1

This can be implemented in Python/numpy pseudocode ("&" is the bitwise AND operator; the parentheses are necessary because "&" binds more tightly than "=="):

PICK(A,B,C, ..., M,N) := A * ((A == B) & (B == C) & ... & (M == N)) EQN. 2
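In Numpy proper, a variadic PICK might be sketched as follows (the function name is illustrative; comparing every argument against the first is equivalent to the chained comparisons of Equation 2):

```python
import numpy as np

def pick(*labels):
    """PICK(A, B, ..., N): A where all arguments match, else 0."""
    first = labels[0]
    match = np.ones(first.shape, dtype=bool)
    for other in labels[1:]:
        match &= (first == other)  # (A == B) & (A == C) & ...
    return first * match
```

Applied elementwise, pick(A, B, C) yields the value of A wherever all three arrays agree and zero elsewhere.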

Each application of PICK to a given combination will yield a label or zero. If any of our combinations of three yields non-zero, it qualifies as a candidate mode. If there is more than one match, for instance if there are actually four matching labels, we can pick any of the candidate modes arbitrarily. The short-circuiting logical or operator has the right semantics to select a label if one exists.

MODE(A,B,C,D,E) := PICK(A,B,C) || PICK(A,B,D) || PICK(A,B,E)
   || PICK(A,C,D) || PICK(A,C,E) || PICK(A,D,E)
   || PICK(B,C,D) || PICK(B,C,E) || PICK(B,D,E) || PICK(C,D,E)
   || PICK(A,B) || PICK(A,C) || PICK(A,D) || PICK(A,E)
   || PICK(B,C) || PICK(B,D) || PICK(B,E)
   || PICK(C,D) || PICK(C,E) || PICK(D,E)
   || E EQN. 3

As described previously, the || operator can be implemented like so:

LOGICAL_OR(X,Y) := X + (X == 0) * Y EQN. 4

Equation 3 has twenty-one elements that need to be calculated and logically ORed together. Is there some way to reduce the number of elements needed? It’s not possible to substantially reduce the time complexity of the general algorithm, but there are some marginal savings that make a difference for small numbers of labels. Notice that for the case of matches of two we can extend the trick used in COUNTLESS 2D to avoid computing matches against the last element. Using the notation PQ to mean PICK(P,Q), if neither AB, AC, AD, BC, BD, nor CD is a match, then we would be forced to choose one of AE, BE, CE, DE, or E, all of which are identical to E. Therefore we can omit AE, BE, CE, and DE, leaving seventeen elements to be calculated, a substantial savings.

MODE(A,B,C,D,E) := PICK(A,B,C) || PICK(A,B,D) || PICK(A,B,E)
   || PICK(A,C,D) || PICK(A,C,E) || PICK(A,D,E)
   || PICK(B,C,D) || PICK(B,C,E) || PICK(B,D,E) || PICK(C,D,E)
   || PICK(A,B) || PICK(A,C) || PICK(A,D)
   || PICK(B,C) || PICK(B,D) || PICK(C,D)
   || E EQN. 5

This trick holds for the last label no matter how large the set of labels we are considering. That means for the case of N choose two, we can always reduce it to N-1 choose two combinations to consider.

Putting this all together yields some working Python code for COUNTLESS 5. Note that this is just a stepping stone to COUNTLESS 3D, which is perhaps better called COUNTLESS 8 as it computes the mode of 2 x 2 x 2 voxels.
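A sketch of what such code can look like, following Equation 5's structure with all ten triples and the reduced set of pairs (the function and helper names here are illustrative, and the five inputs are assumed to be Numpy arrays of nonzero labels):

```python
import itertools
from functools import reduce
import numpy as np

def countless5(a, b, c, d, e):
    """Mode of five nonzero label arrays: triples first, then pairs, then E."""
    def pick(vals):
        first = vals[0]
        match = np.ones(first.shape, dtype=bool)
        for v in vals[1:]:
            match &= (first == v)
        return first * match

    def logical_or(x, y):
        return x + (x == 0) * y  # EQN. 4

    labels = (a, b, c, d, e)
    picks = itertools.chain(
        (pick(combo) for combo in itertools.combinations(labels, 3)),
        # pairs involving E are omitted; if none of these match, fall through to E
        (pick(combo) for combo in itertools.combinations(labels[:-1], 2)),
    )
    return logical_or(reduce(logical_or, picks), e)
```

Because LOGICAL_OR keeps the first nonzero value, evaluating all triples before any pairs preserves the precedence the equation requires.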

COUNTLESS 3D (aka COUNTLESS 8)

COUNTLESS 3D, the practical application of this concept, is now within reach. Downsampling a 2x2x2 volumetric image patch is equivalent to finding the mode of eight integers. The bare requirement for a candidate majority or tie is four matching labels. If there is a 4–4 tie, we can pick either of the matches arbitrarily. To evaluate COUNTLESS 8, we must consider matches of length four, then three, then two.

Equation 6 below shows the number of PICK calls that must be made. Note that the 7C2 term below comes from our simplification of 8C2 using the method shown in the COUNTLESS 5 section.

Total PICKs = 8C4 + 8C3 + 7C2
            = 70 + 56 + 21
            = 147 EQN. 6
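This count is easy to verify with the standard library:

```python
from math import comb

total_picks = comb(8, 4) + comb(8, 3) + comb(7, 2)
print(total_picks)  # 70 + 56 + 21 = 147
```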

Note for now that PICK(A,B,C,D) is 1.5x more expensive than PICK(A,B,C) (six operations versus four) and 3x more expensive than PICK(A,B) (six operations versus two). This fact will be important later.

Below is the implementation of COUNTLESS 8. Note that materializing all 147 PICKs at once would swell the required memory if we are not careful. Using Python 3 generator expressions, we can reduce the memory requirement of this program substantially by only creating a few additional derived images at a time.

Similarly to COUNTLESS 2D, the output of the PICK operator is nonsensical if the matching labels are zero (it returns 0 whether they match or not), so we shift the data up by one to accommodate zero labels and shift down at the end.
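Putting those pieces together, a compact sketch of the whole routine might look like the following. The names are illustrative rather than the original listing's; the combination loops mirror Equation 6, generator expressions keep only a few derived images alive at once, and the +1/-1 shift handles zero labels as just described. Padding for odd-sized inputs is omitted:

```python
import itertools
from functools import reduce
import numpy as np

def countless8(parts):
    """Mode of eight label arrays: matches of four, then three, then two."""
    def pick(vals):
        first = vals[0]
        match = np.ones(first.shape, dtype=bool)
        for v in vals[1:]:
            match &= (first == v)
        return first * match

    def logical_or(x, y):
        return x + (x == 0) * y

    # generators avoid materializing all 147 PICK results simultaneously
    picks = itertools.chain(
        (pick(c) for c in itertools.combinations(parts, 4)),
        (pick(c) for c in itertools.combinations(parts, 3)),
        # pairs containing the last voxel reduce to the last voxel itself
        (pick(c) for c in itertools.combinations(parts[:-1], 2)),
    )
    return logical_or(reduce(logical_or, picks), parts[-1])

def downsample_2x2x2(labels):
    """2x2x2 mode downsample of a 3D label volume with even dimensions."""
    shifted = labels.astype(np.uint64) + 1  # make room for zero labels
    parts = [shifted[x::2, y::2, z::2]
             for x in (0, 1) for y in (0, 1) for z in (0, 1)]
    return (countless8(parts) - 1).astype(labels.dtype)
```

Since a match of four is at worst a 4–4 tie and a match of three is at worst a 3–3 tie, scanning the combination sizes in descending order guarantees the first nonzero PICK is a valid mode.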

Dynamic Programming COUNTLESS 3D

So far so good, but we could speed this up a bit more. As shown in Equation 6, the performance of COUNTLESS 3D is roughly proportional to the number of PICK operations that must be computed, which in turn is proportional to the number of arithmetic operations required. PICK(A,B,C,D) requires six operations, while PICK(A,B,C) requires four, and PICK(A,B) requires two. We also need three operations per logical or.

LISTING 3: Number of Operations Required for COUNTLESS 3D

# of COUNTLESS_3D operations
    = 8C4 PICK(A,B,C,D) + 8C3 PICK(A,B,C) + 7C2 PICK(A,B)
      + (8C4 + 8C3 + 7C2 + 1 - 1) logical ors
    = 8C4 x 6 + 8C3 x 4 + 7C2 x 2
      + (8C4 + 8C3 + 7C2) x 3
    = 686 + 441  // picks + logical ors
    = 1,127 operations
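The arithmetic in Listing 3 can be checked quickly:

```python
from math import comb

pick_ops = comb(8, 4) * 6 + comb(8, 3) * 4 + comb(7, 2) * 2
or_ops = (comb(8, 4) + comb(8, 3) + comb(7, 2)) * 3
print(pick_ops, or_ops, pick_ops + or_ops)  # 686 441 1127
```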

Or more generally: