In Praise of Non-Alphanumeric Identifiers

Here's a common definition of what constitutes a valid identifier in many programming languages:

The first character must be any letter (A-Z, a-z) or an underscore. Subsequent characters, if any, must be a letter, digit (0-9), or an underscore.
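As a rough sketch, that rule fits in a one-line regular expression. The function name below is made up for illustration. The pattern covers only the ASCII rule quoted above, not the exact grammar of any particular language.

```python
import re

# The rule quoted above: a letter or underscore first,
# then any number of letters, digits, or underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_conventional_identifier(name: str) -> bool:
    """Return True if `name` fits the alphanumeric-plus-underscore rule."""
    return IDENTIFIER.fullmatch(name) is not None


# Names in the style of Scheme and Forth fail the check,
# as does anything starting with a digit.
for name in ["is_uppercase", "_tmp1", "uppercase?", "set!", "#strings", "2fast"]:
    print(f"{name!r}: {is_conventional_identifier(name)}")
```

Only the first two names pass.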

Simple enough. With minor variations, it applies to C, C++, ML, Python, most BASICs, and most custom scripting languages (e.g., Game Maker Language). But of course there's no reason for this convention other than that it's familiar and expected.

One of my favorite non-alphanumeric characters for function names is "?". Why say is_uppercase (or IsUppercase or isUppercase) when you can use the more straightforward Uppercase? instead? That's standard practice in Scheme and Forth, and I'm surprised it hasn't caught on in all new languages.

(As an aside, in Erlang you can use any atom as a function name. You can put non-alphanumeric characters in an atom if you remember to surround the entire name with single quotes. It really does work to have a function named 'uppercase?' though the quotes make it clunky.)

Scheme's "!" is another good example. It's not mnemonic, and it doesn't carry the same meaning as in English punctuation. Instead it was arbitrarily designated a visual tag for "this function destructively updates data": set!, vector-set!. That's more concise than any other notation I can think of ("-m" for "mutates"? Yuck).

Forth goes much further: not only does it allow any ASCII character in identifiers, it also has a long history of lexicographic conventions. The fetch and store words--"@" and "!"--are commonly appended to names, so color@ is read as "color fetch." That's a nice alternative to "get" and "set" prefixes. The Forthish #strings beats "numStrings" any day. Another Forth standard is including parentheses in a name, as in (open-file), to indicate that a word is low-level and for internal use only.
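Forth can be this permissive because its words are delimited by whitespace alone, so the parser never has to decide where a name ends and an operator begins. Here is a minimal Python sketch of that idea, not a real Forth. The run function, the words table, and the color variable are invented for illustration. Tokens are looked up as words first and treated as integers only if the lookup fails, which is why a name like 1+ works.

```python
def run(source, words, stack=None):
    """Execute whitespace-separated tokens against a table of words.

    A token found in `words` is executed; anything else is assumed to be
    an integer literal and pushed (a non-number raises ValueError).
    """
    stack = [] if stack is None else stack
    for token in source.split():
        if token in words:
            words[token](stack)
        else:
            stack.append(int(token))
    return stack


# Hypothetical storage for one variable, with Forth-style accessors.
variables = {"color": 0}

words = {
    "color!": lambda s: variables.__setitem__("color", s.pop()),  # store
    "color@": lambda s: s.append(variables["color"]),             # fetch
    "1+": lambda s: s.append(s.pop() + 1),                        # increment top of stack
}

result = run("7 color! color@ 1+", words)
print(result)  # [8]
```

Because the names are ordinary dictionary keys, nothing special is needed to support "!", "@", or a leading digit. The restriction in most languages comes from the grammar, not from anything inherent in naming.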

And then there are clever uses of characters in Forth that make related words look related, like this:

open{ write-byte write-string etc. }close

The brace is part of both open{ and }close. There's no reason the braces couldn't be dropped completely, but they provide a visual cue about scope.

February 25, 2008
