The Clojure programming language

Take advantage of the Clojure plug-in for Eclipse

This article covers the Clojure programming language. Clojure is a Lisp dialect. It is assumed that you do not already know Lisp. Instead, it is assumed that you have knowledge of Java technology. To write Clojure programs, you need a Java Development Kit V5 or higher and the Clojure library. For this article, JDK V1.6.0_13 and Clojure V1 were used. You should also take advantage of the Clojure plug-in for Eclipse (clojure-dev), and you will need Eclipse for that. For this article, Eclipse V3.5 was used, along with clojure-dev 0.0.34. See Related topics for links.

What is Clojure?

It was not that long ago when running your programs on the Java Virtual Machine (JVM) meant writing your program using the Java programming language. Those days are long gone because now you have many choices. Many popular choices, such as Groovy, Ruby (via JRuby), and Python (via Jython), allow for a more procedural, scripting style of programming, or they have their own flavor of object-oriented programming. These are both paradigms familiar to Java programmers. One could argue that with these languages, you write programs similar to what you would write in the Java language; you just get to use a different syntax.

Clojure is yet another programming language for the JVM. However, it is quite different from Java technology or any of the other JVM languages mentioned. It is a dialect of Lisp. The Lisp family of programming languages has been around a long time: since the 1950s, in fact. Lisp uses distinctive S-expressions, written in prefix notation. This notation can be summarized as (function arguments...). You always start with the name of a function, then list zero or more arguments to pass to that function. The function and its arguments are grouped by surrounding them with parentheses. This leads to one of the trademarks of Lisp: a lot of parentheses.
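To make the (function arguments...) shape concrete, here is a minimal sketch of a few prefix-notation calls, with the roughly equivalent Java expression noted in comments. The specific numbers and strings are illustrative only.

```clojure
;; Prefix notation: the function name comes first, then its arguments.
(println (+ 1 2))            ; Java: 1 + 2            -> prints 3
(println (* 2 (+ 3 4)))      ; Java: 2 * (3 + 4)      -> prints 14
(println (str "a" "b" "c"))  ; three arguments to str -> prints abc
```

Because every call is explicitly parenthesized, nesting takes the place of operator precedence rules.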

As you might guess, Clojure is a functional programming language. Academics can debate its "purity," but it definitely embraces the pillars of functional programming: avoiding mutable state, recursion, higher-order functions, etc. Clojure is also a dynamically typed language, though you can optionally add type information to improve performance for critical paths in your code. Clojure not only runs on the JVM but is designed with Java interoperability in mind. Finally, Clojure is a language designed with concurrency in mind and has some unique features related to concurrent programming.

Clojure by example

For many, the best way to learn a new language is to start writing code. In this spirit, we will take some simple programming problems and solve them using Clojure. We will go through the solutions in detail to gain a better understanding of how Clojure works, how you can use it, and what kind of things it does well. However, like any other language, we need to set up a development environment for working with it. Luckily, this is pretty easy with Clojure.

Minimal setup

All you need for working with Clojure is a JDK and the Clojure library, which is a single JAR file. There are two common ways to develop and run Clojure programs. The most common is using its read-eval-print-loop (REPL).

Listing 1. The Clojure REPL

$ java -cp clojure-1.0.0.jar clojure.lang.Repl
Clojure 1.0.0-
user=>

The command was run from the directory where the Clojure JAR was located. Adjust the path to the JAR as needed. You can also create a script and execute the script. To do this, you need to execute a Java class called clojure.main.

Listing 2. Clojure main

$ java -cp clojure-1.0.0.jar clojure.main /some/path/to/Euler1.clj
233168

Again, you need to adjust the path to your Clojure JAR and your scripts. Finally, there is IDE support for Clojure. Eclipse users can install the clojure-dev plug-in using its Eclipse update site. Once it is installed, make sure you are in the Java perspective, then you can create a new Clojure project and new Clojure files, as shown below.

Figure 1. Using clojure-dev, the Clojure plug-in for Eclipse

With clojure-dev, you get some basic syntax highlighting, including parentheses matching (a must-have for any Lisp). You can also launch any script in a REPL that is embedded directly in Eclipse. The plug-in is still very new, as of the writing of this article, and its features are moving forward rapidly. Now that we have the basic setup out of the way, let's explore the language by writing some Clojure programs.

Example 1: Working with sequences

The name Lisp comes from "list processing," and it is often said that everything in Lisp is a list. In Clojure, this is generalized as sequences. For the first example, we will take the following programming problem.

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6, and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1,000.

This problem is taken from Project Euler, a collection of mathematical problems that can be solved using clever (or sometimes not-so-clever) computer programming. In fact it is Problem No. 1. Listing 3 shows a solution to it using Clojure.

Listing 3. Example 1 from Project Euler

(defn divisible-by-3-or-5? [num] (or (== (mod num 3) 0) (== (mod num 5) 0)))
(println (reduce + (filter divisible-by-3-or-5? (range 1000))))

The first line defines a function. Remember: Functions are the primary building blocks of programs in Clojure. Most Java programmers are used to objects being the building blocks of their programs, so using functions can take some getting used to. You might think that defn is a keyword of the language, but it is actually a macro. A macro allows you to extend the Clojure compiler to essentially add new keywords to the language. Thus, defn is not part of the language specification but it is added by the language's core library.

In this case, it is creating a function called divisible-by-3-or-5? . This follows Clojure naming conventions. Words are separated by hyphens, and the function's name ends with a question mark, indicating it is a predicate in that it returns true or false. The function takes a single parameter named num . If there were more input parameters, they would appear inside the square brackets, separated by spaces.

Next comes the body of the function. First, we call or. This is the normal logical or you are used to, except that it is written in prefix form like a function call rather than as an infix operator (strictly speaking, or is a macro, which lets it short-circuit). We pass it two arguments, each of which is also an expression. The first expression starts with the == function, which does a value-based numeric comparison of the arguments passed to it. There are two of them. The first is another expression, which calls the mod function. This is the modulo operation from mathematics, similar to the % operator in the Java language. It returns the remainder: in this case, the remainder when num is divided by 3. That remainder is compared to 0 (if the remainder is 0, num is divisible by 3). Similarly, we check whether the remainder when num is divided by 5 is 0. If either of these remainders is 0, the function returns true.
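The predicate can be tried on its own before it is plugged into the rest of the program. The following sketch repeats the definition from Listing 3 and calls it with a few sample inputs chosen for illustration.

```clojure
;; Same predicate as in Listing 3.
(defn divisible-by-3-or-5? [num]
  (or (== (mod num 3) 0) (== (mod num 5) 0)))

(println (divisible-by-3-or-5? 9))   ; true:  9 is a multiple of 3
(println (divisible-by-3-or-5? 10))  ; true:  10 is a multiple of 5
(println (divisible-by-3-or-5? 7))   ; false: neither remainder is 0
```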

On the next line, we are creating an expression and printing it out. Let's start from the innermost set of parentheses. Here, we call the range function and pass in the number 1,000. This creates a sequence, starting with 0, of all numbers less than 1,000. This is exactly the set of numbers we want to check to see if they are divisible by 3 or 5. Moving out, we call the filter function. This takes two parameters: The first is another function that must be a predicate in that it must return true or false; the second parameter is a sequence — in this case, the sequence (0, 1, 2, ... 999) . The filter function applies the predicate, and if it returns true, the element in the sequence is added to the result. The predicate is just the divisible-by-3-or-5? function defined on the line above.

So the filter expression will result in a sequence of integers where each is less than 1,000 and divisible by 3 or 5. This is exactly the set of integers we are interested in, so now we just need to add them. To do this, we use the reduce function. This function takes two parameters: a function and a sequence. It applies the function to the first two elements in the sequence. Then it applies the function to the previous result and the next element in the sequence. In this case, the function is the + function, or addition. Thus, it will add all of the elements in the sequence.
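The range, filter, and reduce steps can be checked against the small case from the problem statement (numbers below 10, expected sum 23). This is a sketch for illustration; the names candidates, multiples, and total are not part of the original listing.

```clojure
(defn divisible-by-3-or-5? [num]
  (or (== (mod num 3) 0) (== (mod num 5) 0)))

;; Step 1: every candidate below 10, starting at 0.
(def candidates (range 10))
;; Step 2: keep only the elements for which the predicate returns true.
(def multiples (filter divisible-by-3-or-5? candidates))
;; Step 3: fold the sequence with +, i.e. (+ (+ (+ (+ 0 3) 5) 6) 9).
(def total (reduce + multiples))

(println multiples)  ; (0 3 5 6 9)
(println total)      ; 23
```

Note that 0 passes the predicate because its remainder is 0, but adding it does not change the sum.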

Taking a look at Listing 3, a lot happens in a small amount of code. That is one of the appeals of Clojure. A lot happens, yet once you get used to the notation, the code is largely self-explanatory. Certainly, it would take a lot more Java code to do the same thing. Let's move on to another example.

Example 2: Laziness is a virtue

For this example, we will take a look at recursion and at laziness in Clojure. This is another concept new to many Java programmers. Clojure lets you define sequences that are "lazy" because their elements are not calculated until they are needed. This allows you to define infinite sequences, and you definitely do not see those in the Java language. To see an example of when this is especially useful, let's take a look at an example that involves another important aspect of functional programming: recursion. Once again, we use a programming problem from Project Euler, but this time it's Problem No. 2.

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Find the sum of all the even-valued terms in the sequence that do not exceed 4 million.

To solve this problem, a Java programmer might be tempted to define a function that gives you the nth Fibonacci number. A naive implementation of this is shown below.

Listing 4. A naive Fibonacci function

(defn fib [n]
    (if (= n 0) 0
        (if (= n 1) 1
            (+ (fib (- n 1)) (fib (- n 2))))))

This checks if n is 0; if so, it returns 0. Then it checks if n is 1. If so, it returns 1. Otherwise, it calculates the (n-1)th Fibonacci number and the (n-2)th Fibonacci number and adds them together. This is certainly correct, but if you have done much Java programming, you see the problem. A recursive definition like this recomputes the same values over and over, so the number of calls grows exponentially with n, and each nested call consumes stack space, so deep recursion can lead to a stack overflow. The Fibonacci numbers form an infinite sequence, so it should be described as such using Clojure's infinite lazy sequences. This is shown in Listing 5. Note that although a more efficient Fibonacci implementation is available in clojure-contrib, a companion collection of libraries, it is more complex, so the Fibonacci sequence shown here comes from Stuart Halloway's book (see Related topics for more information).
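One conventional alternative to the naive recursion, sketched here as an aside rather than as the approach this article takes, is an explicit loop with recur, which reuses a single stack frame and computes each value once. The name fib-iter and the loop variables are hypothetical.

```clojure
;; Iterative nth Fibonacci number: a and b hold two consecutive values,
;; i counts how many steps have been taken so far.
(defn fib-iter [n]
  (loop [a 0, b 1, i 0]
    (if (= i n)
      a
      (recur b (+ a b) (inc i)))))

(println (map fib-iter (range 10)))  ; (0 1 1 2 3 5 8 13 21 34)
```

This answers "what is the nth number?" efficiently, but it does not model the sequence itself, which is what the lazy version in Listing 5 provides.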

Listing 5. A lazy sequence for the Fibonacci numbers

(defn lazy-seq-fibo
    ([]
        (concat [0 1] (lazy-seq-fibo 0 1)))
    ([a b]
        (let [n (+ a b)]
            (lazy-seq
                (cons n (lazy-seq-fibo b n))))))

In Listing 5, the lazy-seq-fibo function has two definitions. The first definition has no arguments, hence the empty square brackets. The second definition takes two arguments, [a b]. For the no-arguments case, we take the sequence [0 1] and concatenate it with the result of an expression. That expression is a recursive call to lazy-seq-fibo, but this time it is calling the two-argument case, passing in 0 and 1.

The two-argument case starts off with a let expression. This is how you bind local names in Clojure. The binding [n (+ a b)] defines a local n and sets it equal to a+b. The body then uses the lazy-seq macro. As the name suggests, the lazy-seq macro is used to create a lazy sequence. Its body is an expression. In this case, it's using the cons function. This is a classic function in Lisp. It takes in an element and a sequence and returns a new sequence by prepending the element to the sequence. In this case, the sequence is the result of again calling the lazy-seq-fibo function. If this sequence were not lazy, the lazy-seq-fibo function would call itself again and again without ever terminating. However, the lazy-seq macro ensures that the function will only be invoked as the elements are accessed. To see this sequence in action, you can use the REPL, as shown in Listing 6.
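The same cons-inside-lazy-seq pattern can be observed in a smaller sketch that counts how many elements have actually been computed. The names calls, naturals-from, and naturals are hypothetical and exist only for this illustration.

```clojure
;; Counts how many times the body of the lazy sequence has been evaluated.
(def calls (atom 0))

;; An infinite lazy sequence n, n+1, n+2, ... built like the two-argument
;; case of lazy-seq-fibo: cons one element onto a recursive call.
(defn naturals-from [n]
  (lazy-seq
    (swap! calls inc)
    (cons n (naturals-from (inc n)))))

(def naturals (naturals-from 1))
(println @calls)                     ; 0: defining the sequence computes nothing
(println (doall (take 3 naturals)))  ; (1 2 3)
(println @calls)                     ; 3: only the requested elements were computed
```

Without lazy-seq, the call to naturals-from would recurse without end before returning anything.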

Listing 6. Generating Fibonacci numbers

1:1 user=> (defn lazy-seq-fibo
    ([]
        (concat [0 1] (lazy-seq-fibo 0 1)))
    ([a b]
        (let [n (+ a b)]
            (lazy-seq
                (cons n (lazy-seq-fibo b n))))))
#'user/lazy-seq-fibo
1:8 user=> (take 10 (lazy-seq-fibo))
(0 1 1 2 3 5 8 13 21 34)

The take function is used to take a certain number (in this case, 10) of elements from a sequence. Now that we have a good way to generate Fibonacci numbers, let's solve the problem.

Listing 7. Example 2

(defn less-than-four-million? [n] (< n 4000000))

(println (reduce +
    (filter even?
        (take-while less-than-four-million? (lazy-seq-fibo)))))

In Listing 7, we define a function called less-than-four-million?. This simply tests if its input is less than 4 million. In the next expression, it is useful to start with the innermost expression. We first get the infinite Fibonacci sequence. We then use the take-while function. This is like the take function, but it takes a predicate instead of a count. Once the predicate returns false, it stops taking from the sequence. So in this case, as soon as we reach a Fibonacci number of 4 million or more, we stop taking. We take this result and apply a filter. The filter uses the built-in even? function. This function does just what you would think: It tests if a number is even. The result is all of the Fibonacci numbers that are less than 4 million and even. Now we total them up using reduce, just as we did in the first example.
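The intermediate results of this pipeline are easier to inspect with a smaller limit. The sketch below repeats lazy-seq-fibo from Listing 5 and uses a limit of 100 instead of 4 million; the names below-100?, small-fibs, and even-fibs are hypothetical.

```clojure
;; Lazy Fibonacci sequence, as in Listing 5.
(defn lazy-seq-fibo
  ([] (concat [0 1] (lazy-seq-fibo 0 1)))
  ([a b]
   (let [n (+ a b)]
     (lazy-seq (cons n (lazy-seq-fibo b n))))))

(defn below-100? [n] (< n 100))

;; take-while stops at the first element for which the predicate is false.
(def small-fibs (take-while below-100? (lazy-seq-fibo)))
(def even-fibs (filter even? small-fibs))

(println small-fibs)            ; (0 1 1 2 3 5 8 13 21 34 55 89)
(println even-fibs)             ; (0 2 8 34)
(println (reduce + even-fibs))  ; 44
```

This works on an infinite sequence only because take-while is itself lazy and never asks for elements past the first failure.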

Listing 7 solves the problem at hand, but it is not completely satisfying. To use the take-while function, we had to define a very simple function called less-than-four-million?. It turns out that this is not necessary. It should come as no surprise that Clojure has support for closures. This can simplify code like that in Listing 7.

Closures in Clojure

Closures are common in many programming languages, especially in functional languages, such as Clojure. Not only are functions first-class citizens that can be passed as arguments to other functions, but they can also be defined inline or anonymously. Listing 8 shows a simplification of Listing 7, using a closure.

Listing 8. Simpler solution

(println (reduce +
    (filter even?
        (take-while (fn [n] (< n 4000000)) (lazy-seq-fibo)))))

In Listing 8, we have used the fn macro. This creates an anonymous function and returns it. Predicate functions are often very simple and better off being defined using a closure. As it turns out, Clojure has an even more-abbreviated way to define closures.

Listing 9. Shorthand closure

(println (reduce +
    (filter even?
        (take-while #(< % 4000000) (lazy-seq-fibo)))))

We have used # to create the closure instead of the fn macro. We have also used the % symbol for the first parameter passed to the function. You could also use %1 for the first parameter and similarly %2 , %3 , etc. if the function accepted multiple parameters.
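The following sketch shows the multi-parameter forms side by side, plus a function that captures a local value, which is what makes it a closure in the strict sense. The name below and the sample data are hypothetical.

```clojure
;; fn with two parameters, and the equivalent shorthand using %1 and %2.
(println (reduce (fn [a b] (+ a b)) [1 2 3 4]))  ; 10
(println (reduce #(+ %1 %2) [1 2 3 4]))          ; 10
(println (map #(* %1 %2) [1 2 3] [10 20 30]))    ; (10 40 90)

;; A closure proper: the returned function captures the local name limit.
(defn below [limit]
  (fn [n] (< n limit)))

(println (take-while (below 10) (range 100)))    ; (0 1 2 3 4 5 6 7 8 9)
```

With a helper like below, the predicate in Listing 8 could be built for any limit rather than hard-coding 4 million.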

With just these two simple examples, we have seen many features of Clojure. One other important aspect of Clojure is its tight integration with the Java language. Let's look at another example where leveraging Java from Clojure is helpful.

Example 3: Using Java technology

The Java platform has a lot to offer. The performance of the JVM and the richness of both the core APIs and the numerous third-party libraries written in the Java language are all powerful tools that can save you from reinventing too many wheels. Clojure is built around these ideas. It is easy to call Java methods, create Java objects, implement Java interfaces, and extend Java classes. To see some examples of this, let's take a look at another Project Euler problem.

Listing 10. Problem No. 8 from Project Euler

Find the greatest product of five consecutive digits in the 1000-digit number.

73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450

In this problem, we have a 1,000-digit number. This could be represented numerically in Java technology using a BigInteger . However, we do not need to do computations on the entire number — only five digits at a time. Thus it is easier to treat it as a string. However, to make calculations, we need to treat the digits as integers. Luckily, there are APIs in the Java language for going back and forth between strings and integers. To start with, we need to deal with the large piece of unruly text from above.

Listing 11. Parsing the text

(def big-num-str
    (str "73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450"))

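Here, we take advantage of Clojure's support for multi-line strings. The string literal is passed through the str function, which returns its argument as a string (for a literal that is already a string, this call is effectively a no-op). We then use def (a special form rather than a macro) to define a var called big-num-str that we treat as a constant. However, what will be most useful is to turn this into a sequence of integers. This is done in Listing 12.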

Listing 12. Creating a numerical sequence

(def the-digits
    (map #(Integer. (str %))
        (filter #(Character/isDigit %) (seq big-num-str))))

Again, let's start in the innermost expression. We use the seq function to turn big-num-str into a sequence. However, it turns out that this sequence is not exactly what we want. You can see this with help of the REPL, shown below.

Listing 13. Examining the big-num-str sequence

user=> (seq big-num-str)
(\7 \3 \1 \6 \7 \1 \7 \6 \5 \3 \1 \3 \3 \0 \6 \2 \4 \9 \1 \9 \2 \2 \5 \1 \1 \9 \6 \7 \4 \4 \2 \6 \5 \7 \4 \7 \4 \2 \3 \5 \5 \3 \4 \9 \1 \9 \4 \9 \3 \4 \newline...

The REPL shows characters (a Java char) as \c. So \7 is the char 7, and \newline is the char \n (a new line). This is what we get for parsing the text directly. Clearly, we need to get rid of the newlines and convert to integers before we can do any useful calculations. This is what we do in Listing 12. There we use a filter to remove the newlines. Notice that once again, we used a shorthand closure for the predicate function passed to the filter function. The closure is using Character/isDigit. This is the static method isDigit from java.lang.Character. Thus, the filter only allows in chars that are numeric digits, discarding the newline characters.

Now we have gotten rid of the newlines, so we need to convert to integers. Moving inside-out in Listing 12, notice that we use the map function, which takes two parameters: a function and a sequence. It returns a new sequence where the nth element of the sequence is the result of applying the function to the nth element of the original sequence. For the function, we are once again using the shorthand closure notation. First we use the str function from Clojure to convert the char to a string. Why do we do this? Because next, we create an integer using the constructor for java.lang.Integer that accepts a string. A constructor call is denoted by the class name followed by a period: Integer. (note the trailing dot). You could think of this expression as new java.lang.Integer(str(%)). Using this with the map function, we get a sequence of integers, just as we wanted. Now we can solve the problem.
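The individual interop forms used in Listing 12 can be tried one at a time. This sketch uses only standard java.lang classes; the sample string and the name digits are hypothetical, and the last two lines show other common interop forms that the listing itself does not use.

```clojure
;; Static method call: ClassName/methodName
(println (Character/isDigit \7))         ; true
(println (Character/isDigit \newline))   ; false

;; Constructor call: class name followed by a period
(println (Integer. (str \7)))            ; 7

;; The same filter-then-map pipeline as Listing 12, on a two-line string.
(def digits
  (map #(Integer. (str %))
       (filter #(Character/isDigit %) (seq "12\n34"))))
(println digits)                         ; (1 2 3 4)

;; Not used in Listing 12: an instance method call and a static parse method.
(println (.length "clojure"))            ; 7
(println (Integer/parseInt "42"))        ; 42
```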

Listing 14. Example 3

(println (apply max
    (map #(reduce * %)
        (for [idx (range (count the-digits))]
            (take 5 (drop idx the-digits))))))

To understand this piece of code, let's start with the for macro. This is not like a for loop in the Java language. Instead, it is a sequence comprehension. First, we create a binding using the square brackets. In this case, we are binding the variable idx to each element of a sequence from 0 ... N-1, where N is the number of elements in the sequence the-digits (N = 1,000, as the original number had 1,000 digits). Next, the for macro takes an expression it uses to generate a new sequence. It will iterate over each value of idx, evaluate the expression, and add the result to the return sequence. You can see how in some ways this does act kind of like a for loop. The expression used in the comprehension will first use the drop function to drop the first idx elements of the sequence, then use the take function to take the first five elements of the shortened sequence. Remember that idx will be 0, then 1, then 2, etc., so the result will be a sequence of sequences, where the first element will be (e1, e2, e3, e4, e5), the next element will be (e2, e3, e4, e5, e6), etc., where e1, e2, etc. are the elements from the-digits.
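The sliding-window comprehension is easier to follow on a short input. The sketch below uses a six-element vector and a window of three; the names digits and windows are hypothetical. It also shows partition, a core function that produces only the full-length windows, as a possible alternative rather than what the article's listing uses.

```clojure
(def digits [3 1 4 1 5 9])

;; For each starting index, drop that many elements and take the next three.
(def windows
  (for [idx (range (count digits))]
    (take 3 (drop idx digits))))

(println windows)
;; ((3 1 4) (1 4 1) (4 1 5) (1 5 9) (5 9) (9))

;; partition with a step of 1 yields only the complete windows.
(println (partition 3 1 digits))
;; ((3 1 4) (1 4 1) (4 1 5) (1 5 9))
```

Note that the comprehension also yields shorter windows near the end of the input, because take returns fewer elements when the sequence runs out.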

Now that we have this sequence of sequences, we use the map function. We transform each sequence of five numbers to the product of those five numbers by using the reduce function. Now we have a sequence of integers, where the first element is the product of elements 1-5, the second element is the product of elements 2-6, etc. We want the maximum such product. To do this, we use the max function. However, max expects multiple elements passed to it, not a single sequence. To turn the sequence into multiple elements to pass to max , we use the apply function. This produces the maximum that we wanted to solve the problem, and of course prints out the answer. Now you have solved several problems while learning how to use Clojure at the same time.
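The last two steps, mapping each window to its product and then applying max, can be sketched on a few hand-written windows. The sample data and the name products are hypothetical.

```clojure
;; Turn each window into the product of its elements.
(def products (map #(reduce * %) [[3 1 4] [1 4 1] [4 1 5]]))
(println products)               ; (12 4 20)

;; apply spreads the sequence into separate arguments: (max 12 4 20).
(println (apply max products))   ; 20

;; reduce gives the same answer by comparing two values at a time.
(println (reduce max products))  ; 20
```

Calling (max products) directly would not work as intended: with a single argument, max simply returns that argument, here the whole sequence.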

Summary

In this article, we have introduced the Clojure programming language and have benefited from the use of the Clojure plug-in for Eclipse. We took a brief look at some of its philosophies and features, but concentrated on code examples. In those simple examples, we have seen many of the core features of the language: functions, macros, bindings, recursion, lazy sequences, closures, comprehensions, and integration with Java technology. There are many more aspects to Clojure. Hopefully, the language has caught your attention, and you will take a look at some of the resources and learn more about it.

Related topics