$m$-of-$n$ Boolean Circuits

(This post contains code. For a complete code listing, see the very end of this post.)

The Problem

Given a set of boolean variables $X=\{x_1, \ldots, x_n\}$ and an integer $m$ with $0\le m\le n$, construct a boolean circuit that determines whether at least $m$ elements of $X$ are true.

Let’s call it the $m$-of-$n$ boolean circuit problem. (Such circuits are also known as threshold gates, and they are closely related to majority gates, which are the $\lceil\tfrac{n}{2}\rceil$-of-$n$ case.) We assume that we know nothing beforehand about how likely any of the variables is to be true.

Here, by boolean circuit, I mean a boolean expression. In this case, I’ll mean an expression that uses conjunction (and, $x\land y$), disjunction (or, $x\lor y$), negation (not, $\lnot x$), conditional branching (if, notation shown below), and the constants true (t, $\top$) and false (nil, $\bot$). The astute reader will note that these are not entirely independent. For example, suppose the expression “if $p$ then $x$ else $y$” is represented mathematically as

\[ p\left\{\begin{matrix} x \\ y \end{matrix}\right\}.\] One can see that

\[p\left\{\begin{matrix} x \\ y \end{matrix}\right\} \iff (p\land x)\lor(\lnot p \land y).\] One can also represent $\bot$ as $x\land\lnot x$ for all $x$ (though note, this doesn’t fit our language scheme and requires universal quantification).
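The equivalence between the conditional and its and/or/not form can be checked exhaustively, since there are only eight assignments. The following is a minimal sketch; the function name if-matches-and-or-p is made up for this illustration.

```lisp
;; Hypothetical helper (not part of the original post): checks that
;; (if p x y) agrees with (or (and p x) (and (not p) y)) for every
;; combination of T and NIL.
(defun if-matches-and-or-p ()
  (loop for p in '(t nil)
        always (loop for x in '(t nil)
                     always (loop for y in '(t nil)
                                  always (eq (if p x y)
                                             (or (and p x)
                                                 (and (not p) y)))))))

(if-matches-and-or-p) ; => T
```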

We can go even further to minimize the number of different operators. However, the motivation of the problem is pragmatic: how can we devise a way to efficiently compute it in a computer program? We might also take the “short-circuit”-style of evaluation into consideration.

Towards a Solution

First, let’s look at an example. Suppose we have variables $a$, $b$, and $c$, and we wish to check if at least two of these are true. One naive solution might be

\[(a\land b)\lor(b\land c)\lor(a\land c).\]

This leads to a solution for the $m$-of-$n$ problem, namely compute the disjunction of the conjunctions of every $m$-element subset of $X$. Mathematically,

\[\bigvee_{\substack{S\subseteq X\\ \vert S\vert=m}} \bigwedge_{x\in S}x.\]

However, this can lead to many redundant evaluations. Consider the 2-of-3 problem and the expression above. Also consider $a=\top, b=\bot$. Then we compute $a\land b$, which we see is false, and proceed to compute $b\land c$. With the usual style of evaluation, $b$ gets evaluated twice.
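For concreteness, the naive construction can be sketched as a small code generator. Both function names below are hypothetical and exist only for this illustration; the point is that the result has one conjunction per $m$-element subset, so the same variable appears in many terms.

```lisp
;; Hypothetical helper: all M-element subsets of the list ITEMS,
;; keeping the elements in their original order.
(defun subsets-of-size (m items)
  (cond ((zerop m) (list '()))
        ((null items) '())
        (t (append
            ;; Subsets that contain the first item ...
            (mapcar (lambda (s) (cons (first items) s))
                    (subsets-of-size (1- m) (rest items)))
            ;; ... followed by subsets that do not.
            (subsets-of-size m (rest items))))))

;; Hypothetical helper: OR together the AND of every M-element subset.
(defun build-dnf-circuit (m x)
  `(or ,@(mapcar (lambda (s) `(and ,@s))
                 (subsets-of-size m x))))

(build-dnf-circuit 2 '(a b c))
;; => (OR (AND A B) (AND A C) (AND B C))
```

For the 3-of-5 problem this sketch already produces ten conjunctions of three variables each.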

Optimizing

In fact, we can do better. The above can be simplified to

\[a\left\{\begin{matrix} b\lor c \\ b\land c \end{matrix}\right\}.\]

If evaluating in a short-circuit fashion, we see $a$ and $b$ always get evaluated once, and $c$ only gets evaluated if necessary.
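To see that this is the same function as the naive 2-of-3 expression, one can expand the conditional with the equivalence given earlier and simplify. The steps below are a sketch of one such derivation; they use only distributivity, absorption (adding the redundant term $a\land b\land c$), and $a\lor\lnot a\iff\top$.

\[\begin{aligned}
a\left\{\begin{matrix} b\lor c \\ b\land c \end{matrix}\right\}
&\iff (a\land(b\lor c))\lor(\lnot a\land b\land c)\\
&\iff (a\land b)\lor(a\land c)\lor(\lnot a\land b\land c)\\
&\iff (a\land b)\lor(a\land c)\lor(a\land b\land c)\lor(\lnot a\land b\land c)\\
&\iff (a\land b)\lor(a\land c)\lor(b\land c).
\end{aligned}\]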

With this in mind, we might restate our goals very slightly. Instead of focusing on minimizing the circuit, focus on evaluating the fewest boolean variables. Cursory inspection tells us we need $\Omega(m)$ evaluations, since at least $m$ variables must be seen to be true before we can answer true. Ideally, we can bound the number from above as well, keeping the number of evaluations in $O(n)$.

The above example for the 2-of-3 problem suggests a recursive solution. Suppose $f_m(x_1,\ldots,x_n)$ decides if at least $m$ of the variables are true. Then we might check that $x_1$ is true, and if so, continue checking with the next variable while keeping a count, reducing the $m$-of-$n$ problem to an $(m-1)$-of-$(n-1)$ problem. If $x_1$ happens to be false, then we reduce to an $m$-of-$(n-1)$ problem.

When do we stop? Well, clearly, a $0$-of-$n$ problem is simply true. However, if we have an $m$-of-$0$ problem with $m\neq 0$, it is impossible to produce a true value, so the result must be false.

The scenarios above lead to the general case and base case respectively, giving us:

\[f_m(x_1,\ldots,x_n) =
\begin{cases}
\top &\mbox{if }m=0,\\
\bot &\mbox{if }n=0,\\
x_1\left\{\begin{matrix} f_{m-1}(x_2,\ldots,x_n) \\ f_{m}(x_2,\ldots,x_n) \end{matrix}\right\}&\mbox{otherwise.}
\end{cases}\]

Simple use of induction tells us that $f$ always terminates.
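As a sanity check, unfolding the recurrence for the 2-of-3 problem recovers the conditional expression found earlier. This is a worked sketch using only the three cases above, writing $f_m()$ for $f_m$ applied to no variables and reading a conditional with branches $\top$ and $\bot$ as its condition.

\[\begin{aligned}
f_1(c) &= c\left\{\begin{matrix} f_0() \\ f_1() \end{matrix}\right\} = c\left\{\begin{matrix} \top \\ \bot \end{matrix}\right\} = c,\\
f_2(c) &= c\left\{\begin{matrix} f_1() \\ f_2() \end{matrix}\right\} = c\left\{\begin{matrix} \bot \\ \bot \end{matrix}\right\} = \bot,\\
f_1(b,c) &= b\left\{\begin{matrix} f_0(c) \\ f_1(c) \end{matrix}\right\} = b\left\{\begin{matrix} \top \\ c \end{matrix}\right\} = b\lor c,\\
f_2(b,c) &= b\left\{\begin{matrix} f_1(c) \\ f_2(c) \end{matrix}\right\} = b\left\{\begin{matrix} c \\ \bot \end{matrix}\right\} = b\land c,\\
f_2(a,b,c) &= a\left\{\begin{matrix} f_1(b,c) \\ f_2(b,c) \end{matrix}\right\} = a\left\{\begin{matrix} b\lor c \\ b\land c \end{matrix}\right\}.
\end{aligned}\]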

The Pragmatic Test: Codification

We can simply codify this in Lisp and most other languages:



    (defun any-m-of-n (m &rest x)
      "Check that any M of the inputs X are T."
      (check-type m integer)
      (labels ((f (m x)
                 (cond ((zerop m) t)
                       ((null x) nil)          ; is n = 0?
                       ((first x) (f (1- m) (rest x)))
                       (t (f m (rest x))))))
        (f m x)))

This is good, but we can do better. If we reduce to an $m$-of-$n$ problem where $m>n$, then we can return false immediately. We can figure this out efficiently not by recomputing $n$ each time (i.e., computing (length x)), but by having $f$ carry $n$ along as an extra argument. In other words, we have $f_{m,n}(x_1,\ldots,x_n)$ explicitly now.

Our mathematical expression becomes

\[f_{m,n}(x_1,\ldots,x_n) =
\begin{cases}
\top &\mbox{if }m=0,\\
\bot &\mbox{if }m>n,\\
x_1\left\{\begin{matrix} f_{m-1,n-1}(x_2,\ldots,x_n) \\ f_{m,n-1}(x_2,\ldots,x_n) \end{matrix}\right\}&\mbox{otherwise,}
\end{cases}\]

and our code becomes



    (defun any-m-of-n (m &rest x)
      "Check that any M of the inputs X are T. Fail as soon as we reach fewer inputs than M."
      (check-type m integer)
      (labels ((f (m n x)
                 (cond ((not (plusp m)) t)     ; ***
                       ((> m n) nil)
                       ((first x) (f (1- m) (1- n) (rest x)))
                       (t (f m (1- n) (rest x))))))
        (f m (length x) x)))

On the line marked with ***, we are checking whether $m$ is non-positive instead of simply zero. This will turn out to be useful later.

All of this is good, but we actually aren’t following the rules. While $f$ does indeed solve the problem, programmatically we are still eagerly evaluating all of $X$. As such, we might generate a boolean expression instead.
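The difference between eager and short-circuit evaluation can be made visible with a counter. This is a minimal sketch; *evaluations*, noisy, any-true, and count-evaluations are all hypothetical names introduced only for this illustration.

```lisp
;; Hypothetical counter of how many argument forms have been evaluated.
(defvar *evaluations* 0)

(defun noisy (value)
  "Return VALUE, recording that one argument form was evaluated."
  (incf *evaluations*)
  value)

(defun any-true (&rest x)
  "Function version of OR. As with any function, all of its
arguments are evaluated before the body runs."
  (some #'identity x))

(defun count-evaluations (thunk)
  "Call THUNK and return how many times NOISY ran during the call."
  (let ((*evaluations* 0))
    (funcall thunk)
    *evaluations*))

;; The function call evaluates all three arguments ...
(count-evaluations
 (lambda () (any-true (noisy t) (noisy nil) (noisy nil)))) ; => 3

;; ... while the OR special form stops at the first true one.
(count-evaluations
 (lambda () (or (noisy t) (noisy nil) (noisy nil))))       ; => 1
```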

Generating a Boolean Circuit

Now we will focus on compilation more than evaluation. We wish to generate a boolean circuit which evaluates the minimum number of arguments it needs to, without knowing information about the arguments a priori. That is, we want something that does the following:



    > (build-circuit 2 '(a b c))
    (IF A (OR B C) (AND B C))

We might see the function $f$ that we wrote in Lisp (as any-m-of-n) as a sort of interpreter. It executes everything on-the-fly.

Instead of executing and checking everything on-the-fly, we replace that computation with a different computation that builds a computation. In other words, instead of checking that $x_1$ is true, output code that does that check.

We could jump right in and do something like the following:



    (defun build-naive-circuit (m x)
      "Build a boolean circuit which checks if M of the inputs X are T."
      (check-type m integer)
      (check-type x list)
      (labels ((f (m n x)
                 (cond ((not (plusp m)) t)
                       ((> m n) nil)
                       (t `(if ,(first x)
                               ,(f (1- m) (1- n) (rest x))
                               ,(f m (1- n) (rest x)))))))
        (f m (length x) x)))

Note here that instead of actually computing the if, we generate the if, and recursively build up until we get a true or false value.

However, note the kinds of outputs:



    CL-USER> (build-naive-circuit 3 '(a b c))
    (IF A (IF B (IF C T NIL) NIL) NIL)
    CL-USER> (build-naive-circuit 2 '(a b c))
    (IF A (IF B T (IF C T NIL)) (IF B (IF C T NIL) NIL))

While these are correct, they aren’t particularly nice. For example, the first one could be (and a b c) and the second one should be like the example before.

There are obvious, glaring “issues”. For example, one doesn’t write (if a t nil); one simply writes a.

Fortunately, we can solve these problems at the source instead of writing any sort of simplifier functions.

Simplifying the Generated Code

Consider the 1-of-1 problem. What is the solution to this problem?

If we have a variable $x_1$, it’s just $x_1$. It will evaluate to true or false.

Consider the $n$-of-$n$ problem. What is the solution to this problem?

A simple and! If we want to check that all $n$ variables are true, we check that every one is true. Sort of a tautology. In any case, we have

\[f_{n,n}(x_1,\ldots,x_n) = \bigwedge_{k=1}^n x_k.\]

In Lisp, we have (and x1 x2 ... xN).

Consider the 1-of-$n$ problem. What is the solution to this problem?

A simple or! If we want to check that any one of $n$ variables is true, we check one-by-one until we hit a true value. We have

\[f_{1,n}(x_1,\ldots,x_n) = \bigvee_{k=1}^n x_k.\]

In Lisp, we have (or x1 x2 ... xN).

With these three problems solved, we can enhance our code generator to handle the cases specially.

    (defun build-circuit (m x)
      "Build a boolean circuit which checks if M of the inputs X are T."
      (check-type m integer)
      (check-type x list)
      (labels ((f (m n x)
                 (cond ((and (= 1 m) (< 1 n)) `(or ,@x))  ; 1-of-N problem
                       ((= 1 m n) (first x))              ; 1-of-1 problem
                       ((= m n) `(and ,@x))               ; N-of-N problem
                       (t `(if ,(first x)                 ; M-of-N problem
                               ,(f (1- m) (1- n) (rest x))
                               ,(f m (1- n) (rest x)))))))
        (let ((n (length x)))
          (cond ((not (plusp m)) t)
                ((> m n) nil)
                (t (f m n x))))))

Note in this code, the base cases were changed slightly. The very primitive cases, such as $m>n$ and $m\le 0$ were moved “higher” in the call chain to get them out of the way immediately. Otherwise, our base cases in $f$ now reflect the problems from above.

Now we get better output.

    CL-USER> (build-circuit 0 '(a b c d e))
    T
    CL-USER> (build-circuit 1 '(a b c d e))
    (OR A B C D E)
    CL-USER> (build-circuit 2 '(a b c d e))
    (IF A (OR B C D E) (IF B (OR C D E) (IF C (OR D E) (AND D E))))
    CL-USER> (build-circuit 3 '(a b c d e))
    (IF A (IF B (OR C D E) (IF C (OR D E) (AND D E))) (IF B (IF C (OR D E) (AND D E)) (AND C D E)))
    CL-USER> (build-circuit 4 '(a b c d e))
    (IF A (IF B (IF C (OR D E) (AND D E)) (AND C D E)) (AND B C D E))
    CL-USER> (build-circuit 5 '(a b c d e))
    (AND A B C D E)
    CL-USER> (build-circuit 6 '(a b c d e))
    NIL

Note in the generated code, any and every variable gets evaluated at most once, and no more variables are evaluated than needed. We are done!
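One way to gain confidence in the generated circuits is to interpret them under every possible assignment and compare against a plain count of the true inputs. The sketch below repeats build-circuit from above so that it stands alone; eval-circuit and circuit-correct-p are hypothetical helpers written only for this check. It tests correctness and the "at most once" property, but it does not attempt to prove that the number of evaluations is minimal.

```lisp
;; BUILD-CIRCUIT is repeated from the post so the block is self-contained.
(defun build-circuit (m x)
  "Build a boolean circuit which checks if M of the inputs X are T."
  (check-type m integer)
  (check-type x list)
  (labels ((f (m n x)
             (cond ((and (= 1 m) (< 1 n)) `(or ,@x))
                   ((= 1 m n) (first x))
                   ((= m n) `(and ,@x))
                   (t `(if ,(first x)
                           ,(f (1- m) (1- n) (rest x))
                           ,(f m (1- n) (rest x)))))))
    (let ((n (length x)))
      (cond ((not (plusp m)) t)
            ((> m n) nil)
            (t (f m n x))))))

;; Hypothetical helper: interpret a generated circuit under ENV, an alist
;; mapping each variable to T or NIL. NOTE is called with the name of
;; every variable that actually gets looked up.
(defun eval-circuit (form env note)
  (cond ((eq form t) t)
        ((null form) nil)
        ((symbolp form)
         (funcall note form)
         (cdr (assoc form env)))
        (t (ecase (first form)
             (if (if (eval-circuit (second form) env note)
                     (eval-circuit (third form) env note)
                     (eval-circuit (fourth form) env note)))
             ;; EVERY and SOME stop early, like AND and OR do.
             (and (every (lambda (f) (eval-circuit f env note)) (rest form)))
             (or (some (lambda (f) (eval-circuit f env note)) (rest form)))))))

;; Hypothetical checker: try all 2^n assignments of VARS.
(defun circuit-correct-p (m vars)
  (let ((circuit (build-circuit m vars)))
    (loop for bits from 0 below (expt 2 (length vars))
          always (let* ((env (loop for v in vars
                                   for i from 0
                                   collect (cons v (logbitp i bits))))
                        (seen '())
                        (result (eval-circuit circuit env
                                              (lambda (v) (push v seen)))))
                   (and
                    ;; Same answer as counting the true inputs directly.
                    (eq (not result) (not (>= (logcount bits) m)))
                    ;; No variable was looked up more than once.
                    (= (length seen)
                       (length (remove-duplicates seen))))))))

(loop for m from 0 to 6
      always (circuit-correct-p m '(a b c d e))) ; => T
```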

Going Further with Lisp: Trivial Optimizations

One might see this as a useful compiler optimization. Perhaps it is, if one computes this kind of function a lot.

We can actually build this into the compiler by telling it to expand a call to any-m-of-n into the code we made a generator for. And if we are going to bother with that kind of optimization, we might as well do something more trivial.

First, it would be nice to remove any false values at compile time. That is, go through $X$ and remove any values which are false before generating code. This is easy enough:



(remove-if #'null x)

We can do another sort of optimization: remove any values that are known to be true. However, there’s a small detail: we also need to decrease $m$ if we do that.

Anyway, that’s easy enough. To tell if a value is “true-like”, we check if it’s both constant and not false.

    (defun true-like (thing &optional environment)
      (and (constantp thing environment)
           (not (null thing))))

    (defun remove-true-like-forms (things &optional environment)
      (values (remove-if (lambda (x) (true-like x environment)) things)
              (count-if (lambda (x) (true-like x environment)) things)))

Note that in the last function, we are returning both the list free of true-like forms, and how many are getting removed.
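As a small usage sketch, here is what the two return values look like for a sample argument list. The definitions are repeated from above so the block stands alone; the sample forms are made up, and the example assumes that b and c have not been defined as constants.

```lisp
;; Definitions repeated from the post so that the example is self-contained.
(defun true-like (thing &optional environment)
  (and (constantp thing environment)
       (not (null thing))))

(defun remove-true-like-forms (things &optional environment)
  (values (remove-if (lambda (x) (true-like x environment)) things)
          (count-if (lambda (x) (true-like x environment)) things)))

;; The argument forms of a call such as (any-m-of-n 2 'a b 42 c "s"),
;; held as data the way a macro would see them. The quoted symbol, the
;; number, and the string are constant forms; B and C are not.
(multiple-value-bind (remaining removed)
    (remove-true-like-forms '('a b 42 c "s"))
  (list remaining removed))
;; => ((B C) 3)
```

One caveat worth keeping in mind: the test inspects the form itself rather than its value, so a quoted false value such as 'nil is a constant, non-null form and would be counted as true-like.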

We can envision, now, using these two pieces of code in order to scrub the input before we pass it to build-circuit .

Devising a Compiler Macro

We’ve done all the hard work, and now we want to integrate it into the Lisp compiler. Fortunately, this is incredibly easy: we just write a compiler macro! But there are a few preliminaries.

Suppose we want to expand (any-m-of-n m x1 ... xN). We can only do so if m is a literal integer. Why? Well, we can’t perform our code expansion if we don’t know when to stop expanding; that is, when we expand our code, we do it recursively, inductively on m. As such, in our macro, we need to check for that.

We can, however, “scrub” the list of true and false values, even if m isn’t an integer.

As such, we’ll have something along these lines:

1. Compute $x^{\prime} = \text{remove false values from }(x_1,\ldots,x_n)$.
2. Compute $(x^{\prime\prime},k) = \text{remove true values in }x^{\prime},\text{ return number removed}$.
3. If $m$ is an integer, build the circuit for threshold $m-k$ with inputs $x^{\prime\prime}$. This is why checking if $m$ is non-positive is useful; $m-k$ might be negative!
4. If $m$ is not an integer, and $k>0$, then fall back and compute any-m-of-n eagerly, using the “interpreter” style of function, with $m-k$ and the new inputs $x^{\prime\prime}$.
5. If all else fails, just return the form back as is. This shouldn’t happen.

Translating this into Lisp, we have



    (define-compiler-macro any-m-of-n (&whole form m &rest x &environment env)
      (multiple-value-bind (scrubbed-x truths)
          (remove-true-like-forms (remove-if #'null x) env)
        (cond
          ((integerp m) (build-circuit (- m truths) scrubbed-x))
          ((plusp truths) `(any-m-of-n (- ,m ,truths) ,@scrubbed-x))
          (t form))))

which is almost a one-to-one translation of the above.

Let’s check.



    CL-USER> (funcall (compiler-macro-function 'any-m-of-n)
                      '(any-m-of-n 4 'a 'b 'c d nil c d e f g)
                      nil)
    (OR D C D E F G)
    CL-USER> (funcall (compiler-macro-function 'any-m-of-n)
                      '(any-m-of-n 5 'a 'b 'c d nil c d e f g)
                      nil)
    (IF D (OR C D E F G) (IF C (OR D E F G) (IF D (OR E F G) (IF E (OR F G) (AND F G)))))
    CL-USER> (funcall (compiler-macro-function 'any-m-of-n)
                      '(any-m-of-n n 'a t 'c nil c d e f g)
                      nil)
    (ANY-M-OF-N (- N 3) C D E F G)

Looks good!

Conclusion

As with most of my posts, a problem ends up being about how Lisp can solve it. We have taken a problem, the $m$-of-$n$ boolean circuit problem, solved it abstractly, and then coded it in Lisp. Not only did we solve the problem in Lisp, but we also enhanced the Lisp compiler with compile-time optimizations derived from our analysis of the problem.