For better or worse, we count lines of code. We (hopefully) feel a little dirty when we do it, because it's a flawed metric. But we do it. We do it on large codebases and small code snippets. Even badass hacker David Nolen does it.

I count lines because I try to write elegant code. To me, elegance is not merely a fun puzzle-solving indulgence. Rather, it's the most practical way to reduce complexity and reason about correctness.

A concise line count is not exactly the same as elegance, but it's correlated enough to make it tempting to count lines.

For a single function, it's obvious at a glance when it becomes more concise. No need to count lines. But recently I completed a 3-month refactor that touched hundreds of files. I was curious: on net, did I add or remove code? I wanted to count lines to know the answer.

I also count lines when choosing libraries. When you depend on a library, the library code becomes your code in a way. You might need to read it, fix it, or add to it. And I want as little code as possible. If two libraries do roughly the same thing, but one has far fewer lines, I will usually prefer the smaller library.

So if we're going to count lines, we might as well do it accurately.

Counting Lines Accurately

The cloc Unix tool is the obvious first thing to try, but it miscounts Clojure lines because Clojure has several different ways to write comments:

Multi-line (comment ...) forms.

Namespace and function docstrings.

"Ignore next form" reader macro #_.

Semicolons.

cloc detects Clojure comments only by checking for semicolons.
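
To make the four styles concrete, here is a small made-up source file. The example.core namespace and its functions are hypothetical, invented only for illustration. If comments are detected only by looking for semicolons, then just the ;; lines are recognized as comments, while the docstrings, the (comment ...) block, and the #_ form would presumably all be tallied as code.

```clojure
(ns example.core
  "Namespace docstring: documentation, not code,
  yet it contains no semicolon.")

(defn add
  "Function docstring: also documentation."
  [a b]
  ;; Semicolon comment: the one style a semicolon check catches.
  (+ a b))

;; The comment macro ignores its body and evaluates to nil.
(comment
  (add 1 2)
  (add 3 4))

;; The #_ reader macro discards the next form before it is ever compiled.
#_(defn unused [x]
    (* x x))
```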

To accurately count lines of Clojure, we should just parse the source files and look at what is actually code. I wrote a tool to do this:
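
As a rough illustration of the parsing approach, here is a minimal sketch that uses only the built-in Clojure reader. It is not lein-count's actual implementation, and read-forms, comment-form?, and code-forms are hypothetical names. The reader throws away semicolon comments and #_ forms on its own, while (comment ...) forms come back as ordinary lists and have to be filtered explicitly. Docstrings would need similar special handling, which this sketch leaves out.

```clojure
(import '(java.io PushbackReader StringReader))

(defn read-forms
  "Reads every top-level form from a string of Clojure source.
  Semicolon comments and #_ forms are discarded by the reader itself."
  [source]
  (let [rdr (PushbackReader. (StringReader. source))
        eof (Object.)]                       ; unique end-of-input marker
    (binding [*read-eval* false]             ; never evaluate #= forms
      (doall
        (take-while #(not (identical? eof %))
                    (repeatedly #(read {:eof eof} rdr)))))))

(defn comment-form?
  "True when form is a (comment ...) list."
  [form]
  (and (seq? form) (= 'comment (first form))))

(defn code-forms
  "Top-level forms that remain after dropping (comment ...) forms."
  [source]
  (remove comment-form? (read-forms source)))

(def sample
  (str "(defn add [a b]\n"
       "  ;; semicolon comment\n"
       "  (+ a b))\n"
       "\n"
       "(comment\n"
       "  (add 1 2))\n"
       "\n"
       "#_(defn unused [x] x)\n"))

(count (read-forms sample))   ;=> 2, the defn and the comment form
(count (code-forms sample))   ;=> 1, only the defn
```

A real line counter would also need to know which source lines each surviving form occupies, which takes a reader that records position information.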

lein-count (sic)

This lein plugin counts lines in your own project, or in any Clojure artifact.

For example, lein-count counting its own lines:

$ lein count :artifact lein-count 1.0.8
Found 4 source files.

|------+-------+---------------+-------|
| Ext  | Files | Lines of Code | Nodes |
|------+-------+---------------+-------|
| clj  |     4 |          1080 |  5623 |
| ____ | _____ | _____________ | _____ |
| SUM: |     4 |          1080 |  5623 |
|------+-------+---------------+-------|

Syntax Nodes

Even when counted accurately, lines are a crude metric for complexity. Obviously

(f a b c)

is the same "amount" of code as

(f a
   b
   c)

In Succinctness is Power, Paul Graham discusses another metric, the number of nodes in the syntax tree. I think this is a better metric, and for Clojure code it's easy to measure. So lein-count does it too.
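
To show why this is easy in Clojure, here is a minimal sketch of one plausible way to count nodes. It assumes that every collection and every atom counts as one node, which may not match the exact definition lein-count uses. The names children and count-nodes are hypothetical.

```clojure
(defn children
  "Child nodes of a collection. A map's children are its keys and values,
  so that the key/value pairs themselves are not counted as extra nodes."
  [form]
  (if (map? form)
    (apply concat form)
    (seq form)))

(defn count-nodes
  "Counts the nodes in the syntax tree of form: the form itself,
  plus, recursively, everything nested inside it."
  [form]
  (count (tree-seq coll? children form)))

;; Layout does not matter: both spellings read as the same tree.
(count-nodes '(f a b c))                         ;=> 5
(count-nodes (read-string "(f a\n   b\n   c)"))  ;=> 5

(count-nodes '(defn add [a b] (+ a b)))          ;=> 10
```

Because the count is taken after reading, it is unaffected by line breaks, indentation, and semicolon comments.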

When Paul discusses this metric, he is using it to compare the relative succinctness of different languages, not the relative succinctness of different programs written in the same language.

Nevertheless, I think node count is reasonable for comparing the succinctness of Clojure programs. Especially when the author is not trying to game the system.

I spend most of my programming time trying to write as little code as possible. I hope lein-count will be useful in measuring progress toward that end.
