Last time on FAIC I described how language designers and compiler writers use “lowering” to transform a high-level program written in a particular language into a lower-level program written using simpler features of the same language. Did you notice what this technique presupposes? To illustrate, let’s take a look at a line of the C# specification:

the operation e as T produces the same result as e is T ? (T)(e) : (T)null except that e is only evaluated once.

This seems bizarre; why does the specification say that a higher-level operation is exactly the same as a lower-level operation, except that it isn’t? Because there is no way to express the exact semantics of as in legal C# in any lowered form! The feature “evaluate a subexpression to produce its side effects and value once and then use that value several times throughout its containing expression” does not exist in C#, so the specification is forced to describe the lowering in this imprecise and roundabout way.
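To make the "except that e is only evaluated once" caveat concrete, here is a minimal sketch that compares the real operator with the naive textual expansion. The class name, the Next method and the Evaluations counter are all hypothetical names invented for this illustration; they are not part of the specification.

```csharp
using System;

static class AsLoweringDemo
{
    // Counts how many times the operand expression has been evaluated.
    public static int Evaluations = 0;

    // Hypothetical operand with a visible side effect.
    public static object Next()
    {
        Evaluations++;
        return "hello";
    }

    // The real operator: the operand is evaluated exactly once.
    public static string UsingAs()
    {
        return Next() as string;
    }

    // The naive expansion "e is T ? (T)(e) : (T)null": the operand appears
    // twice, so it is evaluated twice whenever the type test succeeds.
    public static string UsingNaiveExpansion()
    {
        return Next() is string ? (string)(Next()) : (string)null;
    }
}
```

Both methods return the same string here, but UsingAs bumps the counter once and UsingNaiveExpansion bumps it twice. That difference is exactly what the specification has to wave away in prose.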

A nice principle of programming language design that C# does not adhere to is that all the higher-level features of the language can be expressed in terms of its lower-level features. Why is this nice to have? Well, not only is it a big convenience for writers of the specification, it is also a big convenience for writers of language analyzers. If you can programmatically transform a high-level program into an exactly equivalent program that uses far fewer complicated concepts, then your analyzer gets simpler.

For example, suppose you are writing code that wishes to determine if a particular line of code is reachable. There are two basic approaches. You could write an analyzer that understands the control flows associated with try, catch, finally, throw, goto, if, else, return, using, lock, for, foreach, do, while, switch, yield break, break and continue keywords. Or, you could write a lowering pass that turned using, lock, for, foreach, do, while, switch, yield break, break and continue into their lowered forms; all of these can be expressed by some combination of the others. Now your analyzer has fewer things to analyze.
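As a rough sketch of what such a lowering pass might produce, here is a while loop containing a break, followed by a hand-lowered version that uses only if and goto. The method and label names are made up for this example, and a real compiler's output would differ in its details.

```csharp
static class LoopLoweringDemo
{
    // High-level form: a while loop with a break.
    public static int SumUntilNegative(int[] items)
    {
        int sum = 0;
        int i = 0;
        while (i < items.Length)
        {
            if (items[i] < 0) break;
            sum += items[i];
            i++;
        }
        return sum;
    }

    // Hand-lowered form: the same control flow using only if and goto.
    public static int SumUntilNegativeLowered(int[] items)
    {
        int sum = 0;
        int i = 0;
    loopStart:
        if (!(i < items.Length)) goto loopEnd; // the loop condition
        if (items[i] < 0) goto loopEnd;        // was: break
        sum += items[i];
        i++;
        goto loopStart;                        // jump back to re-test the condition
    loopEnd:
        return sum;
    }
}
```

A reachability analyzer that is handed the second method only needs to understand if, goto, labels and return.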

This is why I am very pleased by the proposed addition of two kinda weird little features to C# 6. The first is introduction of new variables inside expressions, and the second is sequential composition of expressions.

Today in C# the only two ways to introduce a new local variable within the body of a method are to make a declaration statement:

int x = 123;

or to introduce a parameter to an anonymous function:

M( (int x) => x + 1 );

The former requires a statement; the latter brings the variable into scope only inside the anonymous function’s body. (A third technique is to use a let clause in a query expression, but this does not actually produce a variable as it cannot change.) By itself, the ability to declare a variable inside an expression is not particularly useful, but when combined with the sequential composition operator, it becomes very powerful. The semantics of the sequential composition operator are very straightforward: take two expressions, separated by a semicolon. The operator evaluates the left, discards its value if there was one, evaluates the right, and uses the value of the right as its result.

The sequential composition operator is known as the “comma operator” in C, C++ and JavaScript, but I’ve never liked that syntax as it is too easily confused with the comma used to separate arguments in a function invocation. (C# and Java can use a comma to separate loop variable declarations in a for loop, but this feature is not in widespread usage.) Since the semicolon is already used to mean sequential composition of statements, it makes sense to also use it as the sequential composition operator for expressions.

This means that the specification can now be reworded as

the operation e as T where e is of compile-time type V produces the same result as (V temp = e ; temp is T ? (T)(temp) : (T)null) .

There is no longer any need to qualify the statement; this is the lowering of the operator, end of story. Well, almost: of course temp is understood to be a unique compiler-generated name unused by the method.
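Without the proposed syntax, the same "evaluate once, then reuse" shape can be approximated with the anonymous-function trick mentioned earlier. The sketch below assumes a hypothetical Let helper, which is not part of any library, that binds its argument to a lambda parameter. It shows only the shape of the lowering and is not how the compiler implements as.

```csharp
using System;

static class SequenceDemo
{
    // Counts how many times the operand expression has been evaluated.
    public static int Evaluations = 0;

    // Hypothetical operand with a visible side effect.
    public static object Next()
    {
        Evaluations++;
        return "hello";
    }

    // Hypothetical helper: 'value' is evaluated once, as an argument, and is
    // then visible to 'body' as a parameter, much like "V temp = e ; ...".
    public static R Let<V, R>(V value, Func<V, R> body)
    {
        return body(value);
    }

    // Mirrors (V temp = e ; temp is T ? (T)(temp) : (T)null) with T = string.
    public static string AsViaLet()
    {
        return Let(Next(), temp => temp is string ? (string)temp : (string)null);
    }
}
```

Calling AsViaLet evaluates Next only once, which matches the behaviour of the real operator. The cost is a delegate invocation, which is one reason a dedicated expression form is the more attractive lowering target.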

I think it is a really nice property for a language to have, that its high-level features can be entirely specified in terms of its low-level features. The addition of these two new features to C# 6 doesn’t get it quite all the way there; we’d also need to add the ability to make a ref local variable in a variable declaration expression for some features, as well as a few other low-level abilities that are not exposed by the C# language. But this is definitely a step in a good direction.