A couple of weeks ago I spent a considerable amount of time chasing down bugs involving null in a large code base: null checks after a variable had already been dereferenced, nulls passed to methods that would immediately dereference them, equals() methods that didn’t check for null, and more. Using FindBugs, I identified literally hundreds of bugs involving null handling; and that got me thinking: Could we just eliminate null completely? Should we?

What follows is a thought experiment, not a serious proposal. Still, it might be informative to think about, and perhaps it will catch the eye of the designer of the next great language.



To explore this, let’s reverse perspective and think about primitive types for a bit. I’ve long advocated a completely object-based type system. The distinction between primitive and object types is a relic of days when 40 MHz was considered a fast CPU; and even then it reflected more the lack of experience with OO optimization than any real issue. Languages such as Eiffel perform quite well without separating out primitive types. But if we were to make int and double and boolean full object types, would it then be necessary to allow null to be assigned to these types as well? That is, would we have to accept declarations like these:

    int i = null;
    double x = null;
    boolean b = null;

I suppose we could just rule out assigning null to these types, but this would be an unnatural distinction between primitive and object types, which is precisely what we’re trying to avoid. Still, I’d hate to give up the default values for unassigned primitives: 0 for numbers, false for booleans. After all, it would be really annoying to have to write a method to add two numbers like this:

    public static double sum(double x, double y) {
        if (x == null) throw new NullPointerException();
        if (y == null) throw new NullPointerException();
        return x + y;
    }

One of the nicest characteristics of primitive data types is precisely that you don’t have to worry about this. You know they’re always initialized.
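Java's own wrapper types give a small preview of what losing that guarantee looks like. The sketch below is only an illustration: the class name BoxedNullDemo and the helper throwsOnNull are made up for this example. It uses the same sum method as above, but with the boxed Double type, where a null argument fails at the moment it is auto-unboxed.

```java
class BoxedNullDemo {
    // Same shape as sum() above, but with the boxed wrapper type.
    static double sum(Double x, Double y) {
        // Auto-unboxing calls doubleValue() on each argument,
        // so a null argument throws NullPointerException here.
        return x + y;
    }

    // Hypothetical helper: reports whether a null argument blows up.
    static boolean throwsOnNull() {
        try {
            sum(1.0, null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(1.0, 2.0));  // prints 3.0
        System.out.println(throwsOnNull()); // prints true
    }
}
```

With the primitive double, the second call would not even compile, which is exactly the guarantee worth keeping.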

So suppose we go the other way instead. Let’s allow or even require each class to define a default value object that will be used whenever a null variable of that type is dereferenced. For instance, for the String class the empty string is the obvious choice. Perhaps it could be defined by overloading the existing default keyword:

    public final class String {
        default = "";
        // rest of String class here...
    }

Or for a ComplexNumber class, the default might be 0+0i:

    public class ComplexNumber {
        private double realPart;
        private double imaginaryPart;

        default = new ComplexNumber(0, 0);
        // rest of class here...
    }

Then any dereference would simply use the default value instead of throwing a NullPointerException:

    public class ComplexValue {
        private ComplexNumber z;

        public void increment(ComplexNumber delta) {
            z = z.add(delta);
        }
    }

In the current regime, this throws a NullPointerException unless some other code has initialized z and delta is not null. But in this scheme it would simply treat whichever one is null as 0+0i.
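Today's Java can only approximate this behavior by hand. The following is a rough sketch, not the proposed language feature: the DEFAULT constant and the orDefault helper are names invented here to play the role of the default declaration, and every dereference that the proposal would handle implicitly has to be wrapped explicitly. It assumes Java 9 or later for Objects.requireNonNullElse.

```java
import java.util.Objects;

// Stand-in for the ComplexNumber class discussed above.
final class ComplexNumber {
    // Plays the role of the proposed `default = new ComplexNumber(0, 0);`
    static final ComplexNumber DEFAULT = new ComplexNumber(0, 0);

    final double realPart;
    final double imaginaryPart;

    ComplexNumber(double realPart, double imaginaryPart) {
        this.realPart = realPart;
        this.imaginaryPart = imaginaryPart;
    }

    // Hypothetical helper: substitutes the default object for null.
    static ComplexNumber orDefault(ComplexNumber value) {
        return Objects.requireNonNullElse(value, DEFAULT);
    }

    ComplexNumber add(ComplexNumber other) {
        ComplexNumber rhs = orDefault(other);
        return new ComplexNumber(realPart + rhs.realPart,
                                 imaginaryPart + rhs.imaginaryPart);
    }
}

final class ComplexValue {
    private ComplexNumber z; // never explicitly initialized, so it starts as null

    void increment(ComplexNumber delta) {
        // What the proposal would do implicitly on every dereference.
        z = ComplexNumber.orDefault(z).add(delta);
    }

    ComplexNumber get() {
        return ComplexNumber.orDefault(z);
    }
}

class DefaultValueDemo {
    public static void main(String[] args) {
        ComplexValue value = new ComplexValue();
        value.increment(null);                    // adds 0+0i
        value.increment(new ComplexNumber(1, 2)); // adds 1+2i
        ComplexNumber result = value.get();
        System.out.println(result.realPart + " + " + result.imaginaryPart + "i"); // prints 1.0 + 2.0i
    }
}
```

The amount of boilerplate here is the point: a language-level default would remove every orDefault call.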

Sometimes, though, you might really want to forbid uninitialized values. Or perhaps there is no sensible default value. For instance, what would be a sensible default value for a java.net.Socket? To indicate that no default is available, simply omit the default declaration from the class. There are two incompatible ways we could interpret this omission: either a null dereference throws a NullPointerException (the current behavior), or the compiler forbids any declaration of such a variable without an immediate assignment. We would have to pick one. I’m not sure which makes more sense, though I think I prefer the latter. The goal is no accidental NullPointerExceptions.
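Existing Java can get partway toward the second interpretation, at least for fields. In the sketch below, which is an approximation and not the proposed rule, the hypothetical Connection class stands in for a type with no sensible default. Marking the field final makes the compiler reject any constructor that leaves it unassigned, and Objects.requireNonNull rejects a null argument at construction time. An exception can still occur, but only at one deliberate checkpoint rather than at some later dereference.

```java
import java.util.Objects;

// Hypothetical class with no sensible default value.
final class Connection {
    // final: the compiler rejects any constructor that leaves this unassigned.
    private final String host;

    Connection(String host) {
        // Reject null at the boundary so later dereferences cannot fail.
        this.host = Objects.requireNonNull(host, "host must not be null");
    }

    int hostLength() {
        return host.length(); // safe: host is always assigned and never null
    }
}

class NoDefaultDemo {
    // Hypothetical helper: reports whether construction with null is refused.
    static boolean rejectsNull() {
        try {
            new Connection(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Connection("example.org").hostLength()); // prints 11
        System.out.println(rejectsNull());                              // prints true
    }
}
```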

Of course, sometimes three-valued logic is sensible. After all, very few things are really true or false. True, false, and unknown is a much more accurate division. Integers and floating point values can also benefit from an otherwise impossible value that can represent unknown, unset, or uncertain. To enable this, we can allow any value to be tested for equality with default like so:

    if (x == default) {
        // Special handling for the default case.
        // We could even throw an exception here,
        // but that would only be by deliberate choice,
        // not something that happens unexpectedly.
    }

This comparison would return true if and only if the variable x were set to the default object for its type. It would not return true if x merely had the same value. For example,

    String s1;      // assumes the default value
    String s2 = ""; // explicitly set to a value equal to the default
    if (s1 == default) {
        System.out.println("This branch is executed.");
    }
    if (s2 == default) {
        System.out.println("This branch is not executed.");
    }
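The distinction between being the default object and merely being equal to it is the same one Java already draws between == and equals(). As a sketch of how the test might behave, the example below uses a hypothetical sentinel constant, DEFAULT, and a hypothetical isDefault helper. The sentinel is created with new String("") so that it is a distinct object from the interned "" literal.

```java
class DefaultIdentityDemo {
    // Hypothetical sentinel playing the role of String's default object.
    // new String("") yields an object distinct from the interned "" literal.
    static final String DEFAULT = new String("");

    // Stands in for the proposed `x == default` test.
    static boolean isDefault(String s) {
        return s == DEFAULT; // reference comparison, on purpose
    }

    public static void main(String[] args) {
        String s1 = DEFAULT; // stands in for a variable that was never assigned
        String s2 = "";      // equal in value, but a different object
        System.out.println(isDefault(s1));  // prints true
        System.out.println(isDefault(s2));  // prints false
        System.out.println(s1.equals(s2));  // prints true
    }
}
```

Both strings behave identically everywhere else, so code that does not care about the unset state never has to ask.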

This proposal could even be added to the existing Java language. However, backwards compatibility would make it feasible only for new types. Unless we can apply this to existing types like String, Integer, and Document, I don’t think it’s a strong enough idea to carry its own weight. In a new language without any such constraints, however, it could dramatically increase program robustness by eliminating an entire class of common errors.

Cue John Lennon.