The Highest-Level Feature of C

At first blush this is going to sound ridiculous, but bear with me: the highest-level feature of C is the switch statement.

As any good low-level language should be, C is designed for transparent compilation. If you take a bit of C source, the corresponding object code emitted by the compiler--even a heavily optimizing compiler--roughly mimics the structure of the original text.

The switch statement is the only part of the language where you specify an intent and the choice of how to make that a reality is out of your hands. More than that, the resulting code can vary in algorithmic complexity.

Sure, there are other situations where the compiler can step in and reinterpret things. A for loop known to execute three times can be replaced by three instances of the loop body. In some circumstances, if you're careful not to trip over all the caveats, a loop can be vectorized so multiple elements can be processed in each iteration. None of these are fundamental changes. Your loop is still conceptually a loop, one way or another.
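To make the loop case concrete, here's a rough sketch of the three-iteration example. The function names are made up for illustration, and the second function is only what an unrolling compiler might effectively produce, not the output of any particular one.

```c
/* Hypothetical example: sum a fixed three-element vector. */

/* What you write: a loop with a trip count known at compile time. */
int sum3_loop(const int v[3])
{
    int total = 0;
    for (int i = 0; i < 3; i++)
        total += v[i];
    return total;
}

/* Roughly what unrolling gives you: three copies of the loop body,
   no counter, no branch. Still the same loop, conceptually. */
int sum3_unrolled(const int v[3])
{
    int total = 0;
    total += v[0];
    total += v[1];
    total += v[2];
    return total;
}
```

Both versions do the same work in the same order. The shape of the code changes, but not the algorithm.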

The possibilities when compiling a switch are much more varied. It can result in a trivial series of if..else statements. It can result in a binary search. Or, if the values are consecutive, a jump table. Or for a complex sequence, some combination of these techniques. If each case simply assigns a different value to the same variable, then it can be implemented as a range check and array lookup. The overall sweep of the solutions, from hundreds of sequential, mispredicted comparisons to a single memory read, is substantial.
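Here's a small sketch of two ends of that range, written out by hand. The day-of-week example and all the function names are hypothetical. The second and third functions are hand-written stand-ins for what a compiler might generate from the first, and which strategy a real compiler picks depends on the compiler, the target, and the optimization settings.

```c
/* Hypothetical example: map a day number (0 = Sunday .. 6 = Saturday)
   to the number of letters in the day's name. */

/* The spec: a list of values and actions. */
int name_length_switch(int day)
{
    int len;
    switch (day) {
    case 0:  len = 6; break;  /* Sunday */
    case 1:  len = 6; break;  /* Monday */
    case 2:  len = 7; break;  /* Tuesday */
    case 3:  len = 9; break;  /* Wednesday */
    case 4:  len = 8; break;  /* Thursday */
    case 5:  len = 6; break;  /* Friday */
    case 6:  len = 8; break;  /* Saturday */
    default: len = 0; break;
    }
    return len;
}

/* One possible result: a series of if..else tests, checked in order. */
int name_length_chain(int day)
{
    if (day == 0) return 6;
    else if (day == 1) return 6;
    else if (day == 2) return 7;
    else if (day == 3) return 9;
    else if (day == 4) return 8;
    else if (day == 5) return 6;
    else if (day == 6) return 8;
    else return 0;
}

/* Another possible result: since every case assigns to the same
   variable, a range check plus an array lookup is enough. */
int name_length_table(int day)
{
    static const int lengths[7] = {6, 6, 7, 9, 8, 6, 8};
    /* The unsigned comparison rejects negative values too. */
    if ((unsigned)day < 7u)
        return lengths[day];
    return 0;
}
```

All three return the same result for every input. The chain does up to seven comparisons. The table does one comparison and one memory read, no matter how many cases there are.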

The same principle is what makes pattern matching so useful in Erlang and Haskell. You provide this great, messy bunch of patterns containing a mix of numbers and lists and tuples and "don't care" values. At compile time the commonalities, exceptional cases, and opportunities for table lookups are sorted out, and fairly optimally, too.

In the compiled code for this bit of Erlang, the tuple size is used for dispatching to the correct line:

case Position of
    {X, Y, Dir} -> ...;
    {X, Y, Dir, _, _} -> ...;
    {X, Y, _, _} -> ...;
    {X, Y} -> ...
end

The switch statement in C is a signal that even though you could do it yourself, you'd prefer to have the compiler act as a robotic assistant who'll take your spec--a list of values and actions--and write the code for you.

(If you liked this, you might enjoy On Being Sufficiently Smart.)

February 14, 2013
