The New Minimalism

You don't know minimalism until you've spent time in the Forth community. There are recurring debates about whether local variables should be part of the language. There are heated discussions about how scaled integer arithmetic is an alternative to the complexity of floating point math. I don't mean there were those debates back in the day; I mean they still crop up now and again. My history with Forth and stack machines explains the Forth mindset better than I can, but beware: it's a warning as much as a chronology.

Though my fascination with Forth is long behind me, I still tend toward minimalist programming, but not in the same, extreme, way. I've adopted a more modern approach to minimalism:

Use the highest-level language that's a viable option. Lean on the built-in features that do the most work. Write as little code as possible.

The "highest-level language" decision means you get as much as possible already done for you: arbitrary length integers, unicode, well-integrated data structures, etc. Even better are graphics and visualization capabilities, such as in R or Javascript.

"Lean on built-in features," means that when there's a choice, prefer the parts of the system that are both fast--written in C--and do the most work. In Perl, for example, you can split a multi-megabyte string into many pieces with one function call, and it's part of the C regular expression library. Ditto for doing substitutions in a large string. In Perl/Python/Ruby, lean on dictionaries, which are both flexible and heavily optimized. I've seen Python significantly outrun C, because the C program used an off-the-cuff hash table implementation.

I've been mostly talking about interpreted languages, and there are two ways to write fast interpreters. The first is to micro-optimize the instruction fetch/dispatch loop. There are a couple of usual steps for this, but there's only so far you can go. The second is to have each instruction do more, so there are fewer to fetch and dispatch. Rule #2 above is taking advantage of the latter.

Finally, "write as little code as possible." Usual mistakes here are building a wrapper object around an array or dictionary and representing simple types like a three-element vector as a dictionary with x, y, and z keys, or worse, as a class. You don't need a queue class; you've already got arrays with ways to add and remove elements. Keep things light and readable at a glance, where you don't have to trace into layers of functions to understand what's going on. Remember, you have lots of core language capabilities to lean on. Don't insist upon everything being part of an architecture or framework.

This last item, write less code, is the one that the other two are building toward. If you want people to be able to understand and modify your programs--which is the key to open source--then have less to figure out. That doesn't mean fewer characters or lines at all costs. If you need a thousand lines, then you need a thousand lines, but make those thousand lines matter. Make them be about the problem at hand and not filler. Don't take a thousand lines to write a 500 line program.

(If you liked this, you might enjoy The Software Developer's Sketchbook.)

permalink October 12, 2016

previously