Dangerous designs

Programming language specifications often start with a design philosophy. Of all those I have read, I like that of the Scheme language most. You can read it in the introduction to the Scheme standard, where it is stated in a single sentence:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

The argument is that, with a very small number of rules for forming expressions and a minimal syntax, it is possible to support all possible programming paradigms. For instance, if the language supports higher-order functions, closures and dynamic typing, we can implement object-oriented programming without special language-level syntax. Tail-call optimization eliminates the need for special looping constructs.

But this ideology has failed to capture the imagination of the majority of professional programmers. Instead the following thinking seems to have imbued their minds:

The more features are piled on top of an already bogged-down language, the more powerful it will become.

In this article I try to show that this argument is false and misleading. Adding more syntax and restrictive rules to a language that has a badly designed core will only make it weaker, and can even expose it to security risks. I will make my point with the help of a feature added to the Java programming language: inner classes.

Inner classes are Java’s answer to Smalltalk’s blocks and Scheme’s closures. Look at the following code snippet:

    public class OuterClass {
        // Inner class
        class AddN {
            AddN(int n) { _n = n; }
            int add(int v) { return _n + v; }
            private int _n;
        }

        public AddN createAddN(int var) {
            return new AddN(var);
        }
    }

The method createAddN() takes an integer ‘n’ and returns an object that adds ‘n’ to a value. That object is an instance of an inner class whose local state stores the value of ‘n’. Those who are familiar with Scheme or Common Lisp should be shouting by now: “Hey, this can be done more elegantly and compactly, like this:”

(define (addn n) (lambda (k) (+ n k)))

No new syntax to learn, no special rules to remember.

Agreed.

The problem is that Java does not have first-class functions and closures, and the JVM is not designed to support them. But they are really nice features and the natural solution to many programming problems. To get them, Java introduced a restrictive rule into the language. The following example will throw more light on this:

    public class OuterClass {
        public int add10(final int n) {
            class Add10 {
                int add() { return 10 + n; }
            }
            return new Add10().add();
        }
    }

Here we have a method add10() that contains a local inner class, which is used to add the constant value ’10’ to a given value. But why is the parameter to add10() declared final? The problem is that the JVM has no idea of inner classes, so the Java compiler generates a separate class file for the inner class. Now, how is a local variable declared in the OuterClass.add10() method passed to the Add10 class? The compiler does a trick here. You can find this out by decompiling the OuterClass$1Add10.class file. The compiler quietly adds a variable val$n to the Add10 class. When an instance of Add10 is created, the value of ‘n’ is copied to val$n. The compiler needs a guarantee that the original value of ‘n’ will not change after this copy is made, because the copy has no way of tracking such changes. Requiring the programmer to declare the variable final is the only way out of this problem.
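The copying trick can be written out by hand. The following is only a rough sketch of the generated code, not actual decompiler output: the class names DesugaredAdd10 and DesugaredOuter are made up for illustration, and the reference to the enclosing instance that the compiler typically also stores is left out.

```java
// Hand-written approximation of the class the compiler generates for the
// local class Add10. The real class is named OuterClass$1Add10; the names
// DesugaredAdd10 and DesugaredOuter are assumptions made for this sketch.
class DesugaredAdd10 {
    // Stands in for the compiler-generated field val$n.
    private final int val$n;

    DesugaredAdd10(int n) {
        // The value of n is copied exactly once, when the instance is created.
        this.val$n = n;
    }

    int add() {
        // The body reads the copy, never the original local variable.
        return 10 + val$n;
    }
}

class DesugaredOuter {
    int add10(final int n) {
        // The local variable travels in as a hidden constructor argument.
        return new DesugaredAdd10(n).add();
    }

    public static void main(String[] args) {
        System.out.println(new DesugaredOuter().add10(5)); // prints 15
    }
}
```

Because add() only ever sees the copy, any later assignment to ‘n’ in add10() would go unnoticed by the object, which is the situation the final rule forbids.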

An inner class should be able to read and write all variables of its enclosing class, regardless of their access modifiers. Otherwise, there is no point in making it an inner class in the first place! The following code demonstrates this. Here, the inner class copies the result of the computation to a private variable of the outer class:

    public class OuterClass {
        public void add10(final int n) {
            class Add10 {
                void add() { k = 10 + n; }
            }
            new Add10().add();
        }

        private int k = 0;
    }

Now we face a new dilemma. If, for the JVM, OuterClass and Add10 are two unrelated classes, how is an instance of Add10 able to modify a private variable declared in OuterClass? The answer can be found by decompiling both the OuterClass.class and OuterClass$1Add10.class files. We see that the compiler has secretly placed a new method with package-level access in the OuterClass.class file:

    static int access$002(OuterClass aOuterClass3, int int4) {
        aOuterClass3.k = int4;
        return int4;
    }

Using this method, not just Add10, but any class in the package can see and modify the private variable OuterClass.k! If you generate a class file with appropriate byte code and place it in the same package as OuterClass.class, you can read from and write to its internal state using these secret access methods!
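The hole can be illustrated from source with a hand-written stand-in. This is a sketch under stated assumptions: ExposedOuter, Intruder, readK() and AccessDemo are hypothetical names, and the accessor is typed in by hand because the compiler will not let source code call a genuinely synthetic method, which is why the attack described above needs a generated class file.

```java
// Hand-written stand-in for the compiled OuterClass. In the real class file
// access$002 is generated by the compiler; here it is written out so that
// the sketch compiles from source. All classes share one package.
class ExposedOuter {
    private int k = 0;

    // Package-level static accessor, mirroring the generated method.
    static int access$002(ExposedOuter outer, int value) {
        outer.k = value;
        return value;
    }

    // Hypothetical helper, present only so the demo can inspect k.
    int readK() {
        return k;
    }
}

// Hypothetical class: it is not nested in ExposedOuter, yet it can
// overwrite the private field through the package-level accessor.
class Intruder {
    static void overwrite(ExposedOuter target) {
        ExposedOuter.access$002(target, 42);
    }
}

class AccessDemo {
    public static void main(String[] args) {
        ExposedOuter target = new ExposedOuter();
        Intruder.overwrite(target);
        System.out.println(target.readK()); // prints 42
    }
}
```

Compilers targeting JDK 11 and later use nest-based access control rather than emitting such accessors, so the generated method described here is what you should expect to find only in class files built for older targets.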

By adding a new feature, Java has broken one of the key premises that identify it as an object-oriented language, i.e., retention and protection of local state. This may not be a security problem; no one should rely on object-oriented abstractions to secure their data anyway! But it might still cause problems for certain types of software, and it is certainly a hole in the language.

Higher-order functions and closures are desirable features in any modern programming language. Unfortunately, many ‘modern’ programming languages have so rigid a design that adding a new feature often breaks an existing, important one.

I think this is where comparatively simple languages like Common Lisp and Scheme shine. You can add new syntax, even whole new paradigms, without touching or spoiling the compiler or the runtime system. As an example, read about how the Common Lisp Object System (CLOS) is implemented.

Good languages let you write terse, clean code. Look at the Scheme code snippet I gave near the beginning of this article, and compare it with all the verbosity Java needs just to get the same result. Of course, you will get what you want in Java too, only with a fissure in your program!