Not everything is an expression

"Everything is an expression" is a slogan of several programming languages, particularly those descended from the Lisp family. It is meant to contrast with languages that distinguish statements from expressions, such as Python, Javascript, and C.

Favoring expressions over statements can simplify code. Instead of the bulky:

if increment: do_something_with(x + 1) else: do_something_with(x)

you can write:

do_something_with(x + if increment then 1 else 0)

However, this is not an essay about the benefits of replacing statements with expressions.

Rather, I claim that not everything is an expression! The idea that there are only expressions ignores the importance of other syntax classes. I believe Lisp macros are underpowered because of this ignorance.

1 What are syntax classes?

I will explain what I mean by syntax class in two ways, first with an abstract definition, then with an example. Skip the first if it's confusing; skip the second if you don't need it.

Syntax classes, abstractly

By syntax class I mean a distinct type in our language's abstract syntax tree, or a semantically interesting nonterminal symbol in our language's grammar. In most grammars, many nonterminals are semantically boring, and exist to specify banal syntactic details — operator precedence, where commas go in a list, the format of numeric literals. But some nonterminals represent things like expressions, statements, declarations, modules, patterns or variables; the "top-level" categories in a broad overview of the language's syntax. These are syntax classes.1

Syntax classes, by example

def twice_last( seq ): last, *rest = reversed(seq) return last * 2

This (rather unidiomatic) Python code is colored by syntax class: expressions in red, statements in blue, declarations in purple, and patterns in green.

Different syntax classes represent distinct concepts:

Expressions are evaluated to compute their value. Literal values, variables, binary operations, and function calls are examples of expressions.

Statements are executed to take some action. Variable assignments, loops, conditionals, and raising exceptions are examples of statements.

Declarations alter our environment, often by defining something. Function and type definitions, imports, and variable declarations (in a language that distinguishes them from assignments, such as C or Javascript) are declarations.

Patterns tell us how to bind variables. They get matched against values. For example, in Python: x,y = (1,2) This matches the pattern x,y against the value (1,2) . The match succeeds, binding x to 1 and y to 2 . If instead we wrote: x,y = (1,2,3) Then the match would fail, since x,y expects a 2-element sequence, and (1,2,3) has 3 elements. This raises a ValueError . Even an ordinary variable assignment like x = 3 involves a pattern: the varia­ble x . x is simply a pattern that always matches, binding x to whatever it is matched against.

This is not an exhaustive list. These classes are fuzzy. Different languages treat them differently. Many languages collapse two or more into one category. Lisps don't distinguish between statements and expressions; Python combines statements and declarations; Haskell lacks statements entirely. Some languages (Prolog, Forth, sed, SQL, BNF) aren't easy to describe in these terms at all!

2 The power of patterns

Most mainstream languages only have very simple patterns. But in some languages, particularly those descended from the ML family, patterns are quite powerful and pattern-matching is a core construct. Some examples, in Haskell OCaml Standard ML Rust Racket :

fibonacci 0 = 1 fibonacci 1 = 1 fibonacci n = fibonacci (n-1) + fibonacci (n-2) sum [] = 0 sum (x:xs) = x + sum xs adjacent (x:y:rest) = (x,y) : adjacent (y:rest) adjacent _ = [] data Tree = Empty | Node Int Tree Tree inTree x Empty = False inTree x (Node v l r) = case compare x v of EQ -> True LT -> inTree x l GT -> inTree x r

fun fibonacci 0 = 1 | fibonacci 1 = 1 | fibonacci n = fibonacci (n-1) + fibonacci (n-2) fun sum [] = 0 | sum (x::xs) = x + sum xs fun adjacent (x::y::rest) = (x,y) :: adjacent (y::rest) | adjacent _ = [] datatype tree = Empty | Node of int * tree * tree fun inTree x Empty = false | inTree x (Node(v,l,r)) = case Int.compare (x,v) of EQUAL => true | LESS => inTree x l | GREATER => inTree x r

let rec fibonacci = function | 0 -> 1 | 1 -> 1 | n -> fibonacci (n-1) + fibonacci (n-2) let rec sum = function | [] -> 0 | (x::xs) -> x + sum xs let rec adjacent = function | (x::y::rest) -> (x,y) :: adjacent (y::rest) | _ -> [] type tree = Empty | Node of int * tree * tree let rec inTree x t = match t with | Empty -> false | Node(v,l,r) -> x == v || inTree x ( if x < v then l else r)

fn fib( n : int ) -> int { match n { 0 => 1 , 1 => 1 , n => fib(n-1) + fib(n-2) , } } fn sum( l : &[int] ) -> int { match l { [] => 0 , [x, ..xs] => x + sum(xs) , } } fn adjacent( l : &[int] ) -> Vec<(int,int)> { match l { [x, .. xs @ [y, ..]] => adjacent(xs) + &[(x, y)], _ => vec![] } } enum Tree { Empty, Node( int , Box<Tree> , Box<Tree> ) } fn inTree( x : int , t : &Tree ) -> bool { match t { &Empty => false, &Node(v, ref l, ref r) => match x.cmp(&v) { Equal => true, Less => inTree(x, *l), Greater => inTree(x, *r), } } }

( define/match (fibonacci n ) [( (or 0 1) ) 1 ] [( n ) (+ (fibonacci (- n 1)) (fibonacci (- n 2))) ]) ( define/match (sum l ) [( '() ) 0 ] [( (cons x xs) ) (+ 1 (sum xs)) ]) ;; Finds adjacent pairs: ;; (adjacent '(a b c)) == '((a b) (b c)) ( define/match (adjacent l ) [( (cons x (and xs (cons y _))) ) (cons (list x y) (adjacent xs)) ] [( _ ) '() ]) ( struct node (value left right) #:transparent) ( define/match (in-tree x tree ) [( x (node v l r) ) ( or (= x v) (in-tree x ( if (< x v) l r))) ] [( _ _ ) #f ])

Many language features are simply special-cases of pattern-matching. We've already seen that Python's tuple unpacking is a form of pattern-matching. Optional arguments and rest arguments ( *args in Python & Ruby) are patterns matched against a list of arguments. Keyword arguments are patterns matched against a dictionary of arguments. if is just pattern-matching on booleans:

if CONDITION { THEN-CASE ... } else { ELSE-CASE ... }

Is the same as:

case CONDITION of True -> THEN-CASE ... False -> ELSE-CASE ...

Many useful domain-specific languages are based on pattern-matching. Regular expressions express patterns of strings; sed and awk employ this to great effect. yacc uses a more powerful cousin of this idea, extending patterns to grammars. The OMeta language attempts to push the power of pattern-matching as far as possible in a general-purpose setting.

3 Pattern-matching in Lisps

Pattern-matching is not native to Lisp as it is to the ML family. Neither Common Lisp nor Scheme nor Clojure have a general-purpose pattern-matching form built-in, much less integrated into function definition.2

But adding language features is Lisp's strong point: we can implement pattern-matching as a macro!3 As a toy example, here's an if-match macro, where (if-match pat expr then else) runs then if pat matches expr and otherwise runs else . In Clojure Common Lisp Scheme Racket :

( defmacro if-match [ pat expr then else ] ( cond (symbol? pat) ` ( let [~ pat ~ expr ] ~ then ) (not (seq? pat)) ` ( if (= '~ pat ~ expr ) ~ then ~ else ) (= ' cons (first pat)) (let [[ _ p-first p-rest ] pat tmp (gensym)] ` ( let [~ tmp ~ expr ] ( if (not (seq? ~ tmp )) ~ else (if-match ~ p-first (first ~ tmp ) (if-match ~ p-rest (rest ~ tmp ) ~ then ~ else ) ~ else ))) ) :else ( throw (Exception. "Unrecognized pattern" ))))

( defmacro if-match ( pat expr then else ) ( cond (( and pat (symbolp pat)) ` ( let ((, pat , expr )) , then ) ) ((atom pat) ` ( if (equalp ', pat , expr ) , then , else ) ) ((eq ' cons (car pat)) ( destructuring-bind ( _ p-car p-cdr ) pat (let (( tmp (gensym))) ` ( let ((, tmp , expr )) ( if (not (consp , tmp )) , else (if-match , p-car (car , tmp ) (if-match , p-cdr (cdr , tmp ) , then , else ) , else ))) ))) (t (error "Unrecognized pattern" ))))

( define-syntax if-match ( syntax-rules ( cons ) [ (_ (cons p-car p-cdr) expr then else) ( let (( x expr)) ( if (not (pair? x)) else (if-match p-car (car x) (if-match p-cdr (cdr x) then else) else)))] [ (_ pat expr then else) (symbol?? pat ;; deep magic ( let (( pat expr)) then) ( if (equal? ' pat expr) then else))]))

( require syntax/parse/define) ( define-syntax-parser if-match #:literals ( cons ) [ (_ x:id expr then else) #' ( let ((x expr)) then) ] [ (_ (cons p-car p-cdr) expr then else) #' ( let ((x expr)) ( if (not (pair? x)) else (if-match p-car (car x) (if-match p-cdr (cdr x) then else) else))) ] [ (_ (~and lit (~not (_ ...))) expr then else) #' ( if (equal? 'lit expr) then else) ])

(Of course, in Racket you could just use match, but now we're getting ahead of ourselves.)

We can use it like so:

( if-match 0 0 true false ) ; --> true ( if-match 0 1 true false ) ; --> false ( if-match (cons 0 x) ' (0 hi there) x nil ) ; --> (hi there) ( if-match (cons 0 x) ' (1 hi there) x nil ) ; --> nil

( if-match 0 0 t nil ) ; --> T ( if-match 0 1 t nil ) ; --> NIL ( if-match (cons 0 x) ' (0 hi there) x nil ) ; --> (HI THERE) ( if-match (cons 0 x) ' (1 hi there) x nil ) ; --> NIL

( if-match 0 0 #t #f ) ; --> #t ( if-match 0 1 #t #f ) ; --> #t ( if-match (cons 0 x) ' (0 hi there) x #f ) ; --> (hi there) ( if-match (cons 0 x) ' (1 hi there) x #f ) ; --> #f

So far, so good, right? But...

The Library Problem

Now that I've defined my handy-dandy pattern-matching macro if-match , I'm stuck with whatever patterns I thought up when I wrote it. If I come up with a new pattern, I have to rewrite if-match to support it.

This sucks! Consider: Hilary the Hacker writes a spiffy lazy stream library, while Charlie the Coder writes a fast B-tree library. Naturally, both want to support pattern-matching. They can either:

Each implement their own custom pattern-matching macros. This duplicates work, defeats standardization, and makes it impossible to pattern-match on a stream and a tree at the same time. Bug me to include support for Hilary's streams and Charlie's trees in my if-match library. This introduces irrelevant dependencies, and over time, a big ugly monolithic ball of code will accrete.4

The whole meaning of modularity is avoiding this rock-and-a-hard-place choice between duplicating work and centralizing it.

Generalized macros

The problem with if-match is that it requires you to think of everything in advance — exactly the difficulty macros are supposed to solve! Macros are Lisp's standard lan­guage ext­ension mechanism, and patterns are just a small, domain-specific lan­guage.

So why can't we just define new patterns with macros?

Here's how this might work. Suppose we already know how to match the following patterns:

nil Matches an empty list. (cons A B) Matches a list whose first element matches A

and whose remainder matches B .

It would be handy to have a pattern (list A 1 ... A n ) , which matches an n-element list whose 1st element matches A 1 , whose 2nd element matches A 2 , ... and so on up to A n . For example, (list x 2 y) would match (1 2 3) , binding x to 1 and y to 3 .

I'm proposing that this be done using a macro for patterns, with something like the following mock- Clojure Lisp Scheme syntax:

( defpattern list [& element-pats ] ( if (empty? element-pats) ' nil ( let [[ p & ps ] element-pats] ` (cons ~ p (list ~@ ps )) )))

( defpattern list (&rest element-pats ) ( if (not element-pats) ' nil ( destructuring-bind ( p &rest ps ) element-pats ` (cons , p (list ,@ ps )) )))

( define-pattern-syntax list ( syntax-rules () [ (_) nil] [ (_ p ps ...) (cons p (list ps ...))]))

This solves Hilary and Charlie's library problem: instead of rolling their own incompatible pattern-matchers, or asking me to rewrite if-match , they separately extend if-match using defpattern define-pattern-syntax . These extensions are pulled into scope when I import their libraries, just as a macro would be.5

This idea is not original to me: The Racket language's match macro is extensible via define-match-expander . The optima library for Common Lisp exposes a defpattern macro as well. I don't know of a Clojure library that uses this approach per se, but core.match is extensible in other ways.

Ultimately, this is a partial success story for Lisp. ML-style pattern-matching can be assimilated and macro-ified. With some work, it can even be made macro-extensible. This design is not yet idiomatic among Lisps, but hopefully in time it will be. And in any language but a Lisp, would this kind of extension even be thinkable?6

4 DSLs as syntax classes

Hold on — isn't this an essay about syntax classes, and how not everything is an expression? What does the design of pattern-matching macros have to do with that? Let's rephrase our progress so far in terms of syntax classes:

Pattern-matching doesn't just add a new kind of expression, it adds a new syntax class: patterns. The problem with if-match is that it doesn't allow modularly extending the syntax of patterns. We already know how to modularly extend expression syntax: macros! By extending macros to cover patterns as well, we solve the library problem.

Now that we're thinking in terms of syntax classes, the problem and solution have become much clearer!

But there's nothing special about patterns. Many domain-specific languages are simply new syntax classes on top of an existing language. Microsoft's LINQ adds a syntax class for queries, for example. Expressions embed queries to be performed, and queries embed expressions representing (e.g.) predicates to be filtered on or keys to be looked up. Parser-generators are another example: typically, we write parsing rules in a special syntax, like BNF-inspired yacc rules or PEG parsing expressions. Into these rules we embed semantic actions — expressions that run when a parse rule matches.

The canonical uses of Lisp macros are: 1. changing order of evaluation, 2. creating new binding or definition forms, and 3. making domain-specific languages. LINQ and parsing are clear-cut examples of case 3. But the obvious way to implement a DSL as a macro, as we saw with if-match , hard-codes the form of the new syntax class, giving up on extensibility. Because of this, I think Lisp macros aren't good enough at making domain-specific languages.

Consider: Each of the pattern-matching libraries I listed above had to design and implement a extension mechanism from scratch — even though Lisp already has an extension mechanism: macros! And patterns are just one of many useful domain-specific languages we might desire. Introducing a new extensible syntax class shouldn't require us to reinvent the wheel! Lisp doesn't have a standard way to define new syntax classes, one which makes them macro-extensible by default. It should.7

This essay first appeared in Code Words. Thanks to Jari Takkala and Aki Yamada for editing this article.

Footnotes