I write a lot of little conversion programs. They take a command-line argument or two, loop over a series of files, read them, convert them, manipulate them, mangle them, then write them out elsewhere. It's the Unix filter pattern (and, one might argue, the functional programming pattern).

These programs tend to be, at most, 100 lines of code, with generous whitespace.

I often start by writing a few helper functions: one to find the names of all of the interesting files, one to read a file, one to process the input names into output names, and one to write a file. I should abstract this whole process into a reusable framework, but I haven't figured out the appropriate genericity yet.
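A minimal sketch of what four such helpers might look like, assuming the input files sit in a single directory and are small enough to read whole. Every name here (`find_input_files`, `read_file`, `output_name_for`, `write_file`) is hypothetical and chosen for illustration; none of them comes from the program discussed in this post.

```perl
use strict;
use warnings;

use File::Basename 'basename';
use File::Spec;

# Hypothetical helper: list the files in $dir whose names end in $ext.
sub find_input_files
{
    my ($dir, $ext) = @_;

    opendir my $dh, $dir or die "Cannot open directory '$dir': $!";
    my @names = grep { /\Q$ext\E\z/ } readdir $dh;
    closedir $dh;

    # Sort so that the processing order is predictable.
    return map { File::Spec->catfile( $dir, $_ ) } sort @names;
}

# Hypothetical helper: read a whole file into one string.
sub read_file
{
    my $path = shift;

    open my $fh, '<', $path or die "Cannot read '$path': $!";
    local $/;    # slurp mode: no input record separator
    my $text = <$fh>;
    close $fh;

    return $text;
}

# Hypothetical helper: map an input name to the matching output name.
sub output_name_for
{
    my ($input, $out_dir) = @_;
    return File::Spec->catfile( $out_dir, basename( $input ) );
}

# Hypothetical helper: write a string to a file, replacing old contents.
sub write_file
{
    my ($path, $text) = @_;

    open my $fh, '>', $path or die "Cannot write '$path': $!";
    print {$fh} $text;
    close $fh or die "Cannot close '$path': $!";
}
```

With helpers like these in place, the main loop only has to call them by name; the conversion step in the middle is the part that changes from program to program.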

The important point is that I start with names:

    my $scenes = get_scene_list();

    for my $chapter (get_chapter_list())
    {
        my $text = process_chapter( $chapter, $scenes );
        write_chapter( $chapter, $text );
    }

    die( "Scenes missing from chapters:", join "\n\t", '', keys %$scenes )
        if keys %$scenes;

    exit;

    sub get_chapter_list { ... }
    sub get_scene_list   { ... }
    sub process_chapter  { ... }
    sub read_scene       { ... }
    sub write_chapter    { ... }

This particular program is 88 lines of code, with copious whitespace and BSD-style brace placement. There's no functional reason to write it with functions instead of straight-line code that operates on global variables. There's only aesthetic practicality.

Names matter.

I don't have to show you the contents of any of these functions because their names describe what they do. They don't tell you how they do what they do, but you can get a sense of the organization and intent of the code by reading the simple control flow here.

Very little of the novice code I've seen attempts an organized, named structure. I can appreciate that design, abstraction, and factoring are skills learned through practice, just as programming effectively in a language is. Even so, these are important skills to learn.

I've heard a lot of people try to explain subroutines as "pieces of reusable code". That's wrong; I think the Forth and Lisp and domain-driven design people have it right here. A subroutine is a way to name a set of behaviors. It's an abstraction. Being able to identify and name individual sets of behavior is essential to being able to solve problems well.
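A small illustration of naming a behavior that is used only once, so reuse cannot be the motivation. The data and the name `is_meaningful_line` are made up for this sketch; the point is only that the second `grep` states its intent while the first leaves the reader to decode a regex.

```perl
use strict;
use warnings;

my @lines = ( "# a comment", "", "keep me", "   ", "also keep" );

# Inline: the reader has to work out what the pattern is for.
my @kept = grep { !/^\s*(?:#|\z)/ } @lines;

# Named: the same behavior, with its intent stated up front.
# A line is "meaningful" here if it is neither blank nor a comment.
sub is_meaningful_line
{
    my $line = shift;
    return $line !~ /^\s*(?:#|\z)/;
}

my @named = grep { is_meaningful_line( $_ ) } @lines;
```

Both lists end up holding the same two lines. The subroutine adds no capability, only a name.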

Thinking in terms of sets of behavior -- individual units of behavior -- is essential to programming well.