Lisp Style and Efficiency

Introduction

Lisp is somewhat like the restaurant just described. The Lisp manual presents a long list of system functions, but without any performance information. The Lisp programmer who is inexperienced with efficiency issues may choose ways of implementing a program that sound reasonable, even elegant, but are breathtakingly expensive. This individual may then abandon the use of Lisp, condemning it as elegant perhaps, but too inefficient for ``real-world'' use. The author has often heard, ``I like Lisp, but I'm writing my application in C for efficiency.''

The problem is compounded by the fact that many of the Lisp functions needed for writing more efficient code are not taught in beginning Lisp courses, or are labelled as ``dangerous'' or in poor taste, or all of the above.

The goals of this short introductory exercise are to present some basic rules about efficiency of programs and to present some standard Lisp idioms that are used in writing efficient code. Readability and elegance of structure are worth more than ``saving a few microseconds'' in many cases. On the other hand, coding style that increases the computational complexity of a program can make it impossible to solve the desired problem, or make the program so slow that it is not usable in practice. For motivation, we have provided some small functions, written with only minor differences in the ways they are coded, but with striking differences in performance. After reviewing this chapter, we hope the student will be better able to choose from the long wine list presented by Lisp.

Basic Concepts of Program Performance

In general, the majority of the time required for execution of a program is spent in loops. We can therefore state several rules:

Computations that are performed individually (not in a loop) do not affect performance very much. Use any implementation you like for these; strive for elegance and clarity.

As a rule of thumb, 90% of the performance problems will occur in 10% of the code. This 10% of the code is located inside loops.

Nested loops (loops within loops) are the most important ones to optimize.

Loops may be separated in code but nested in execution. If one program calls a subprogram within a loop, and the subprogram contains a loop, it is a nested double loop.

Some Lisp system functions contain loops; these should be used with care.
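The last two rules can be illustrated with a small sketch. The function name count-allowed and its arguments are made up for this example; the point is only that member contains a loop, so calling it inside dolist gives a nested double loop even though only one loop is visible in the code.

```lisp
;; Hypothetical example: count how many elements of ITEMS also occur in ALLOWED.
;; MEMBER loops over ALLOWED, so the visible DOLIST is really a nested double
;; loop costing about (length items) * (length allowed) comparisons.
(defun count-allowed (items allowed)
  (let ((count 0))
    (dolist (item items)
      (when (member item allowed)     ; hidden inner loop
        (incf count)))
    count))

(count-allowed '(a b c d) '(b d e))   ; => 2
```

For short lists this is harmless; it only matters when both lists can grow large.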

Computational Complexity of an Algorithm

Importance of Computational Complexity

A complaint that is often heard about A.I. programs is that ``they don't scale''. That is, a program runs fine on a ``toy'' test case, but fails to run on the big case that is really of interest. Such complaints are often justified. What they mean, in many cases, is that the complexity of the computation is high; this can cause the behavior of working on small data sets but not on large ones. To write successful A.I. programs in Lisp, it is necessary to know how to avoid increasing the complexity of a computation unnecessarily; fortunately, this is something that can easily be learned.

The Bermuda Triangle

Suppose that there is a program loop that is repeated n times; within the loop is a computation that grows linearly with each pass through the loop. What is the order of the resulting computation? Consider the following functions that make a list of n integers from 0 to n - 1; copies of these functions can be found in the file bermuda.lsp.

  (defun list-of-nums1 (n)
    (let (l)
      (dotimes (i n)
        (setq l (append l (list i))))
      l))

  (defun list-of-nums2 (n)
    (let (l)
      (dotimes (i n)
        (setq l (nconc l (list i))))
      l))

  (defun list-of-nums3 (n)
    (let (l)
      (dotimes (i n)
        (setq l (cons i l)))
      (nreverse l)))

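The three functions differ only in how each new number is added to the list l: list-of-nums1 uses append, list-of-nums2 uses nconc, and list-of-nums3 uses cons followed by a single nreverse at the end. All three produce the same result; for n = 10: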

  (list-of-nums1 10)  =  (0 1 2 3 4 5 6 7 8 9)
  (list-of-nums2 10)  =  (0 1 2 3 4 5 6 7 8 9)
  (list-of-nums3 10)  =  (0 1 2 3 4 5 6 7 8 9)

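For larger n, however, the run times differ greatly. The calls below take the car of each result, with n = 5000: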

  (time (car (list-of-nums1 5000)))  =  0     Run time: 319.20 s
  (time (car (list-of-nums2 5000)))  =  0     Run time:  18.76 s
  (time (car (list-of-nums3 5000)))  =  0     Run time:   0.14 s

For n = 5000, list-of-nums1 takes more than 2,000 times as long as list-of-nums3. With list-of-nums3, n can be made much larger:

(time (car (list-of-nums3 100000))) = 0 Run time: 2.46 s

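With n 20 times larger, list-of-nums3 takes about 18 times as long: its run time grows roughly in proportion to n. The run time of list-of-nums1, in contrast, grows much faster than n, as the analysis below shows.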

If we can get into this much trouble with a four-line function, imagine what could happen in a large A.I. application!
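To repeat measurements of this kind on another system, a small timing helper can be used in place of time. The sketch below is not from bermuda.lsp; the name run-time-seconds is made up, and it relies only on the standard get-internal-run-time and internal-time-units-per-second. The numbers obtained will of course differ from those quoted above.

```lisp
;; list-of-nums3 as given in the text, repeated so the block is self-contained.
(defun list-of-nums3 (n)
  (let (l)
    (dotimes (i n)
      (setq l (cons i l)))
    (nreverse l)))

;; Hypothetical helper: call FN on N and return the run time in seconds.
(defun run-time-seconds (fn n)
  (let ((start (get-internal-run-time)))
    (funcall fn n)
    (/ (- (get-internal-run-time) start)
       (float internal-time-units-per-second))))

;; Usage (result depends on the machine and the Lisp system):
;; (run-time-seconds #'list-of-nums3 100000)
```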

Now let's analyze the Bermuda Triangle and see what is happening. list-of-nums1 uses append, which appends lists to form a single list containing all of their elements. append is a safe function in the sense that it can never mess up any existing data structures; it achieves this safety by copying all of its arguments except the last, as illustrated in the following recursive version:

  (defun append (a b)
    (if a
        (cons (car a) (append (cdr a) b))
        b))

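If append is given '(A B) and '(C), it makes a copy of (A B) and attaches (C), uncopied, to the end of that copy. append therefore does one cons for each element of its first argument. The following diagram shows the successive values of l in list-of-nums1: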

   i    l := (append l (list i))

   0    ()                   +  (0)
   1    (0)                  +  (1)
   2    (0 1)                +  (2)
   3    (0 1 2)              +  (3)
   4    (0 1 2 3)            +  (4)
   5    (0 1 2 3 4)          +  (5)
   6    (0 1 2 3 4 5)        +  (6)
   7    (0 1 2 3 4 5 6)      +  (7)
   8    (0 1 2 3 4 5 6 7)    +  (8)       the whole triangle
   9    (0 1 2 3 4 5 6 7 8)  --------->   becomes garbage
                 |
                 |
                 |  copied       |
                 |               +
                 V               V
        (0 1 2 3 4 5 6 7 8)---->(9)

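On pass i, append copies the i elements currently in l, and the old value of l becomes garbage. Over the n passes of list-of-nums1 the copied lists have lengths 0 through n - 1, so the copying alone requires n*(n-1)/2 conses. list-of-nums1 thus does on the order of n/2 times as many conses as list-of-nums3, which does only one cons per pass.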
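The cons counts can be checked with a small instrumented sketch. None of the names below come from bermuda.lsp: counting-cons and counting-append are stand-ins for cons and the recursive append shown earlier, and *cons-count* is a made-up counter.

```lisp
(defvar *cons-count* 0)

;; Stand-in for CONS that counts each call.
(defun counting-cons (a b)
  (incf *cons-count*)
  (cons a b))

;; Same recursive definition as the APPEND shown in the text.
(defun counting-append (a b)
  (if a
      (counting-cons (car a) (counting-append (cdr a) b))
      b))

;; Conses done by the list-of-nums1 strategy: i copies plus 1 new cell per pass.
(defun conses-using-append (n)
  (let ((*cons-count* 0) (l nil))
    (dotimes (i n)
      (setq l (counting-append l (counting-cons i nil))))
    *cons-count*))

;; Conses done by the list-of-nums3 strategy: 1 per pass (NREVERSE adds none).
(defun conses-using-cons (n)
  (let ((*cons-count* 0) (l nil))
    (dotimes (i n)
      (setq l (counting-cons i l)))
    *cons-count*))

(conses-using-append 100)   ; => 5050, i.e. 100*99/2 copies + 100 new cells
(conses-using-cons 100)     ; => 100
```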

What about list-of-nums2, which uses nconc instead of append? (See Winston & Horn, Chapter 17 for a discussion of nconc.) Its performance is better than list-of-nums1, but still much worse than list-of-nums3. What is its computational complexity? Let's make a diagram for list-of-nums2:

   i    l := (nconc l (list i))

   0    ()                   +  (0)
   1    (0)                  +  (1)
   2    (0 1)                +  (2)
   3    (0 1 2)              +  (3)
   4    (0 1 2 3)            +  (4)
   5    (0 1 2 3 4)          +  (5)
   6    (0 1 2 3 4 5)        +  (6)
   7    (0 1 2 3 4 5 6)      +  (7)
   8    (0 1 2 3 4 5 6 7)    +  (8)
   9    (0 1 2 3 4 5 6 7 8)
        start                walk to end,
          |                  and rplacd here,
          V                       +
          ---->(0 1 2 3 4 5 6 7 8)----->(9)

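nconc does no conses of its own, but it must start at the front of l and walk to the end to find the last cell before it can attach the new element. That walk grows by one element on each pass, so list-of-nums2, like list-of-nums1, involves a computation that grows each time through the loop; it is faster than list-of-nums1 only because walking down a list is cheaper than copying it.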

If a loop involves a computation that grows each time through the loop, the loop has order O(n^2).

An O(n^2) algorithm becomes intolerable even when n doesn't seem all that large.

Finally, let's examine the diagram for list-of-nums3:

   i    l := (cons i l)

   0    0  +  ()
   1    1  +  (0)
   2    2  +  (1 0)
   3    3  +  (2 1 0)
   4    4  +  (3 2 1 0)
   5    5  +  (4 3 2 1 0)
   6    6  +  (5 4 3 2 1 0)
   7    7  +  (6 5 4 3 2 1 0)
   8    8  +  (7 6 5 4 3 2 1 0)
   9    9  +  (8 7 6 5 4 3 2 1 0)

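Each pass does a single cons, which takes the same amount of work no matter how long the list has become, so the loop is O(n). The list is built in reverse order, so nreverse is applied once at the end; nreverse is also O(n).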

A succession of a fixed number of O(n) algorithms is still O(n).

As n becomes large, it is much better to have a ``slower'' O(n) algorithm than a ``faster'' O(n^2) algorithm.

CONS is Expensive

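Each call to cons allocates a new cell of storage, and cells that are no longer referenced must eventually be reclaimed by the garbage collector. A cons therefore costs more than the time of the call itself, and a function that conses heavily, as list-of-nums1 does, pays that cost many times over.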

Therefore, the ``eleventh commandment'' for Lisp programmers is:

Thou shalt not cons in vain.


Backquote Does CONSes

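A backquoted form that contains commas builds new list structure each time it is evaluated, so it does conses just as an explicit call to list does. When the new structure is needed only for a comparison, the conses can be avoided by testing the parts directly:

  Two conses:                  No conses:

  (equal x (list '- y))        (and (eq (car x) '-)
  (equal x `(- ,y))                 (equal (cadr x) y))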
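The comparison above can be wrapped into two small functions to try it out. The function names are made up for this sketch, and the consp guard is an addition not present in the original so that an atom for x does not signal an error. Note also that the two tests agree only when x is a list of exactly two elements; the direct test does not check that nothing follows the second element.

```lisp
;; Builds a fresh two-cell list on every call, only to compare and discard it.
(defun negation-of-p-consing (x y)
  (equal x `(- ,y)))

;; Inspects X in place; no new list structure is built.
(defun negation-of-p (x y)
  (and (consp x)                 ; added guard (assumption, not in the original)
       (eq (car x) '-)
       (equal (cadr x) y)))

(negation-of-p-consing '(- a) 'a)   ; => T
(negation-of-p '(- a) 'a)           ; => T
(negation-of-p '(+ a) 'a)           ; => NIL
```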

Functions that Copy

Many functions do conses because they copy list structure; examples include append, reverse, subst, and copy-tree.

Many of the functions that copy list structure have counterparts that are ``destructive'', that is, they modify existing structure rather than making new structure. The motivation for using the ``destructive'' functions, of course, is that they do no conses. Often the ``destructive'' versions have names that begin with n, such as nreverse, nconc, and nsubst.

Beginning Lisp courses often label these functions ``dangerous'' and discourage their use. However, there is a simple condition under which it is always safe to use these functions:

If there is only one pointer to a structure, it is always safe to use a destructive function on it.

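  (defun list-of-nums3 (n)
    (let (l)
      (dotimes (i n)
        (setq l (cons i l)))
      (nreverse l)))

Here the variable l is bound by let inside the function, so nothing outside the function can point to the list that l holds. Every cell of that list was freshly made by cons within the function, and it is therefore safe to apply nreverse to it.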
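The one-pointer condition can be seen by contrasting two small functions. Both names are made up for this sketch: the first builds its own list and may safely reverse it destructively; the second receives a list that the caller still points to, so it uses the copying reverse instead.

```lisp
;; Safe: the list is consed inside the function, so L is the only pointer to it.
(defun squares-upto (n)
  (let (l)
    (dotimes (i n)
      (setq l (cons (* i i) l)))
    (nreverse l)))

;; The caller still holds a pointer to LST, so NREVERSE could destroy the
;; caller's data; REVERSE copies and leaves LST intact.
(defun reversed-copy (lst)
  (reverse lst))

(squares-upto 4)                      ; => (0 1 4 9)

(let ((shared (list 1 2 3)))
  (reversed-copy shared)
  shared)                             ; => (1 2 3), unchanged
```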

Generate the Desired Result Directly


mapcan is a mapping function designed for filtering a list; it maps over a given list and concatenates the lists returned for each element, as if by nconc. Suppose that we want to make a list of things that are pretty:

  Original:                      Better:

  (mapcar                        (mapcan
    #'(lambda (x)                  #'(lambda (x)
        (if (pretty x) x))             (if (pretty x) (list x)))
    lst)                           lst)

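The mapcar version does a cons for every item and produces NIL for each item that is not pretty, so the NILs must be removed afterwards. The mapcan version does a cons only for items that are pretty and produces the filtered list directly. The rules for using mapcan are: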

If you don't want the output to contain anything at all for an item, produce NIL as the output of the function used with mapcan.

If you do want the output to contain something for an item, produce (list result).

If you want the output to contain multiple results for an item, produce (list result1 ... resultn).

The output of mapcan is equivalent to the result that would be obtained from applying nconc to the individual result lists.
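The three cases can be seen together in one small sketch. The function demo-mapcan and its filtering rule are invented for illustration. Because the result lists are joined destructively, the function given to mapcan should return freshly made lists, as list does here.

```lisp
;; Drop odd numbers, keep zero once, and emit every other even number twice.
(defun demo-mapcan (numbers)
  (mapcan #'(lambda (x)
              (cond ((oddp x) nil)           ; nothing for this item
                    ((zerop x) (list x))     ; one result
                    (t (list x x))))         ; multiple results
          numbers))

(demo-mapcan '(0 1 2 3 4))   ; => (0 2 2 4 4)
```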

The second useful idiom, used when one needs to process a tree structure rather than a linear list, requires a bit more thought. mapcan is used mainly for filtering linear lists; it could, however, be used in a recursive function that traverses a tree. We will use as an example the task of computing the ``fringe'' of a tree, that is, making a list of the atoms at its terminal nodes in order; think of picking the apples from a tree while leaving the branches behind. For example,

(fringe '(((a) b ((c d (e)) f) (g)) h)) = (A B C D E F G H)

A first version can be written using mapcan:

  (defun mfringe (tree)
    (mapcan #'(lambda (x)
                (if (atom x)
                    (list x)
                    (mfringe x)))
            tree))

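mfringe can suffer from the same problem as list-of-nums2: joining the result lists as if by nconc may require walking down lists that have already been built. The following version works like list-of-nums3 instead, adding each atom to the front of a list and reversing the list once at the end: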

  (defun fringe (tree)
    (nreverse (fringeb tree nil)))

  (defun fringeb (tree basket)
    (if tree
        (if (atom tree)
            (cons tree basket)
            (fringeb (cdr tree)
                     (fringeb (car tree) basket)))
        basket))

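fringeb walks the tree through its car and cdr pointers, carrying along a basket that holds the atoms collected so far; fringeb returns the basket with the atoms of its tree added. Since each atom is added to the front of the basket by cons, the basket comes back in reverse order, and nreverse is applied once in fringe.

The outer if in fringeb tests whether tree is nil; if it is, basket is returned unchanged. If tree is not nil, fringeb tests whether tree is an atom; if so, it does a cons to add tree to the basket.

Otherwise, tree has a car and a cdr. fringeb is first called on the car of tree with the current basket; the atoms in the car of tree are added by cons, and the enlarged basket is returned. That basket is then passed along with the cdr of tree to fringeb, so the atoms of the cdr are added after those that fringeb collected from the car.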

The use of an ``extra'' function argument to hold a result that is being constructed is a frequently used idiom that helps in writing efficient Lisp code.
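The same idiom carries over to other tree-walking tasks. As a sketch, the following pair of functions (both names invented for this example) collects only the numbers in a tree, using an extra argument found in the same way that fringeb uses basket.

```lisp
;; Collect the numbers found anywhere in TREE, in left-to-right order.
(defun numbers-in (tree)
  (nreverse (numbers-in-b tree nil)))

;; FOUND holds the numbers collected so far, most recent first.
(defun numbers-in-b (tree found)
  (cond ((null tree) found)
        ((numberp tree) (cons tree found))
        ((atom tree) found)                      ; non-numeric atom: skip it
        (t (numbers-in-b (cdr tree)
                         (numbers-in-b (car tree) found)))))

(numbers-in '((a 1) (b (2 c)) 3))   ; => (1 2 3)
```

As in fringe, the only pointer to the accumulated list is held inside these functions, so the final nreverse is safe.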

Use Control to Avoid Materializing Sets

For example, suppose we wanted to print the names of things that are red, fast, and expensive from a catalog. We could write various set filters using mapcan :

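  (defun red-ones (l)
    (mapcan #'(lambda (item)
                (if (eq (color item) 'red)
                    (list item)))
            l))

With similar filters fast-ones and expensive-ones, the names could then be printed as follows:

  (mapc #'(lambda (item) (print (name item)))
        (expensive-ones (fast-ones (red-ones catalog))))

Each filter, however, materializes an intermediate list that is used once and then discarded. The same names can be printed by testing each item as it is encountered:

  (mapc #'(lambda (item)
            (if (and (fast item) (red item) (expensive item))
                (print (name item))))
        catalog)

This version builds no intermediate lists, and so it does no conses for them.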

The example above is a simple one, since it involves filtering a single set. The same principle also applies to more complicated examples:

Use recursive control to process items one at a time, rather than materializing sets of items to be processed.
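As a sketch of this principle applied to a tree, the function below counts the atoms of a tree that satisfy a predicate. It does so during the traversal itself instead of first materializing the fringe and then filtering it. The name count-leaves-if is made up for this example.

```lisp
;; Count the atoms in TREE that satisfy PRED, one at a time, with no conses.
(defun count-leaves-if (pred tree)
  (cond ((null tree) 0)
        ((atom tree) (if (funcall pred tree) 1 0))
        (t (+ (count-leaves-if pred (car tree))
              (count-leaves-if pred (cdr tree))))))

(count-leaves-if #'numberp '((a 1) (b (2 c)) 3))   ; => 3
```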


Lisp Style

Our suggestions are largely based on experience with student programs and are intended to help students acquire better Lisp style. Many of the comments are also motivated by concern for efficiency.

Simplify Code

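  Original:                      Better:

  (cond ((not (okay x)) nil)     (if (okay x) (fn x))
        (t (fn x)))

An if with no else clause returns NIL when its test, here (okay x), fails, so the single if does everything the cond does.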

Think Recursively

  Original:                      Better:

  (if (> grade 70)               (print (if (> grade 70)
      (print 'ok)                           'ok
      (print 'failing))                     'failing))

  Original:                      Better:

  (setq price                    (incf total
        (get item 'price))             (* (get item 'price)
  (setq subtotal                          quantity))
        (* price quantity))
  (setq total (+ total subtotal))

Forms such as push, incf, and decf update a variable in place and often make code of this kind shorter and clearer.
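A short sketch shows incf and push used together in one loop. The function order-summary and the (name price quantity) item format are assumptions made up for this example; they do not come from the original.

```lisp
;; ITEMS is a list of (name price quantity) triples (a hypothetical format).
;; Returns two values: the order total and the item names in order.
(defun order-summary (items)
  (let ((total 0)
        (names nil))
    (dolist (item items)
      (incf total (* (second item) (third item)))   ; accumulate in place
      (push (first item) names))                    ; cons onto the front
    (values total (nreverse names))))

(order-summary '((pen 2 3) (ink 5 1)))   ; => 11, (PEN INK)
```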

Use Symbolic Data

  Original:                      Better:

  (cond ((eq word 'one) 1)       (or (get word 'numval)
        ((eq word 'two) 2)           'infinity)
        ...
        ((eq word 'nine) 9)
        (t 'infinity))
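For the ``Better'' version to work, each word must have a numval property. One way to set that up is sketched below; the loop and the function name word-value are illustrative assumptions, and only the property name numval comes from the example above.

```lisp
;; Store each word's value on its property list once, when the file is loaded.
(let ((value 0))
  (dolist (word '(one two three four five six seven eight nine))
    (setf (get word 'numval) (incf value))))

;; Look the value up as data instead of testing each word in code.
(defun word-value (word)
  (or (get word 'numval) 'infinity))

(word-value 'two)     ; => 2
(word-value 'many)    ; => INFINITY
```

Adding a new word then means adding data, not another cond clause.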

Use Canonical Forms

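Consider a function splus that handles an expression of the form (+ lhs rhs). Because + is commutative, the case in which rhs is a number can be reduced to the case in which lhs is a number by calling splus again with the arguments exchanged, rather than by writing the code twice:

  Original:                        Better:

  (defun splus (lhs rhs)           (defun splus (lhs rhs)
    (cond ((numberp lhs)             (cond ((numberp lhs)
           Lots of Code ...)                Lots of Code ...)
          ((numberp rhs)                   ((numberp rhs)
           Lots of Code ...)                (splus rhs lhs))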
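A runnable sketch of the ``Better'' version follows. The original elides its ``Lots of Code'', so the simplification rules used here (adding two numbers, dropping a zero) are assumptions chosen only to make the example complete.

```lisp
;; Hypothetical simplifier for (+ lhs rhs); the rules are illustrative only.
(defun splus (lhs rhs)
  (cond ((and (numberp lhs) (numberp rhs)) (+ lhs rhs))
        ((numberp lhs)
         (if (zerop lhs)
             rhs                        ; 0 + x  =>  x
             (list '+ lhs rhs)))        ; number is already on the left
        ((numberp rhs) (splus rhs lhs)) ; exchange and reuse the lhs case
        (t (list '+ lhs rhs))))

(splus 'x 0)    ; => X
(splus 'x 2)    ; => (+ 2 X)
(splus 2 3)     ; => 5
```

The exchanged call always has a number as lhs, so it is handled by an earlier clause and the recursion stops after one step.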

References