Jun 24, 2016

Introduction

The null pointer exception (NPE) goes by different names in different programming languages: null pointer dereference in C/C++, NullPointerException in Java, NullReferenceException in C# and .NET, and various other names in scripting languages, such as JavaScript’s “undefined is not a function”.

This error ranks among the top programming bugs of all time. It’s a plague that exists in each and every application because of how popular programming languages and programs work from the ground up. As a software developer you either have to deal with null pointer exceptions every day, or they lie like mines, silently waiting in the code for a user to step on them. The null pointer exception is a “billion dollar mistake”:

This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

Tony Hoare, from the talk “Null References: The Billion Dollar Mistake”

Examples

The history of the null pointer dereference started a long time ago, even before the UNIX/C era. To give a taste of null pointer exceptions, the examples below are written in the most widely used programming languages at the moment.

For the sake of simplicity the null (or undefined) variables are explicitly set. In real programs the null value will come as a result of an operation performed by another piece of code, such as another part of the program, a 3rd party library, a framework or an OS. To make it worse, the value can be null (or undefined) seemingly at random, depending on user input, database state, device state or other environment conditions.
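To make this concrete, here is a minimal Python sketch of a null that arrives from a library call rather than from an explicit assignment. The extract_year helper is hypothetical and exists only for illustration; re.search is the standard library function, which returns None when nothing matches.

```python
import re


def extract_year(text):
    """Return the first four-digit number found in text as an int."""
    match = re.search(r"\d{4}", text)
    # re.search returns None when nothing matches, so the next line
    # raises AttributeError for any input without a four-digit number.
    return int(match.group())


year = extract_year("Jun 24, 2016")  # works: the input contains a year

try:
    extract_year("no date here")  # same code, different input
    failure = None
except AttributeError as exc:
    failure = str(exc)
```

Whether the function works or crashes depends entirely on the input, which is exactly what makes this class of bug hard to spot in testing.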

C/C++

Consider this code:

QTimer *timer = nullptr;
timer->start(); // crash

Put this code into a console application to observe this:

Java

String name = null;
name.toLowerCase(); // crash

Inside a JSP page powered by Jetty:

C# .NET

string name = null;
name.ToLower(); // crash

Results in an infamous “yellow screen of death” in ASP.NET:

Null pointer error in dynamic programming languages

Popular dynamic scripting languages don’t have a concept of a pointer, but they have references. Dereferencing a null reference usually produces the same result - terminal failure. Moreover these programming languages have a concept of “undefined” (or “unset”) value. Technically speaking null and undefined are separate things, but in practice undefined values produce almost the same class of “null pointer exceptions” as true null values. Thus dynamic languages have even more null exceptions, because null and undefined are friends.
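A small Python sketch of the same point, under the assumption that a never-assigned attribute or a missing dictionary key plays the role of “undefined”. The Person class and the describe helper are made up for this illustration.

```python
class Person:
    """Hypothetical record type; its 'name' attribute is never assigned."""


def describe(action):
    """Run action() and return the name of the exception it raised, or 'ok'."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


person = Person()

null_result = describe(lambda: None.lower())            # a true null value
unset_result = describe(lambda: person.name.lower())    # attribute never set
missing_result = describe(lambda: {}["name"].lower())   # key never set
```

The first two cases raise the same AttributeError and the third raises a closely related KeyError; all of them stop the program unless something catches them.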

JavaScript

var element = null;
element.getAttribute("id"); // crash

var element = { getAttribute:null }
element.getAttribute("id"); // crash

Errors in a browser console:

PHP

$name = null;
$name->getMessage(); // crash

Python

name = None
name.lower() # crash

What a lovely Django web page:

Ruby

name = nil
name.downcase # crash

Testing in a Rails controller:

Perl

my $person;
print $person->name; # crash

Chart

Let’s summarize the results of those obviously wrong little programs presented so far in a table:

| Example | Compiles* | Crashes |
|---|---|---|
| C++ | YES | YES |
| Java | YES | YES |
| C# | YES | YES |
| JavaScript | YES | YES |
| PHP | YES | YES |
| Python | YES | YES |
| Ruby | YES | YES |
| Perl | YES | YES |

*For dynamic languages “compiles” means that the syntax checker (or a bytecode compiler) is not able to detect the problem ahead of time.

Solutions

Manual null-checks

As a workaround programmers insert null-checks into the programs like so:

if (name != null)
    name.toLowerCase();
else
    ... // handle an error case

Such ad-hoc checks are not a general solution to the null pointer exception problem. It’s plainly impossible to add such checks everywhere. They are a burden on the programmer, and remembering them is prone to human error.

Sometimes, overconfident that a null value is impossible in a particular place, you skip the check and much later get trapped by a null pointer dereference. On the flip side, after experiencing lots of null pointer errors, people fall into defensive programming and tend to add tons of checks whether they are needed or not.

Static analyzers

Static analysis with tools like Lint might help to find potential null pointer exceptions. Building such tools is a very complicated endeavour, because they must be smarter than the compiler (or parser) itself and infer missing information from the source code. Since languages allow null values everywhere, static analyzers tend to produce lots of false positives, which makes working through their reports time-consuming and daunting. In addition, a static analyzer, being a separate tool, does not prevent you from shipping bad code. Forcing warning-free passes upon developers can be a hard sell since the analysis is so fuzzy.

NonNull annotation

As an afterthought, some programming languages introduced additional syntax (annotations or attributes) that hints to the compiler whether something can be null or not. Java has @NotNull in various forms, C# has NotNullAttribute, Objective-C has nonnull.

Those annotations can help a bit, but they are not a silver bullet. Most existing code was written without NonNull, and it will take ages to standardize it and add it everywhere. If a programming language uses null as a default value, the annotation solution is an opt-out solution. It means that most cases are going to be left unpatched.
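Python’s type hints behave in much the same way and make for a compact illustration. In this sketch greet and find_user are hypothetical functions. The annotations are only hints, so nothing stops a None from reaching greet at runtime unless a separate checker is run and its warnings are acted upon.

```python
from typing import Optional


def greet(name: str) -> str:
    # The annotation promises a real string, but Python does not
    # enforce annotations when the program runs.
    return "hello, " + name.lower()


def find_user(user_id: int) -> Optional[str]:
    """Hypothetical lookup that returns None for unknown ids."""
    users = {1: "Alice"}
    return users.get(user_id)


try:
    # A type checker such as mypy is expected to flag this call,
    # yet the interpreter runs it without complaint until it fails.
    greet(find_user(2))
    outcome = "ok"
except AttributeError:
    outcome = "crash"
```

Any function left without annotations gets no protection at all, which is the opt-out problem in miniature.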

Option type

Option types are known as Nullable<T>, Optional<T>, Option<T> or Maybe T in different programming languages (where “T” denotes a data type). Such a type lets you express an explicit intent that a value can potentially be null, and requires an explicit cast if you want to get to the contained non-null value.

The option type protection works solidly in languages like Haskell or F#, where a cast to get the non-null value requires adding an “if-else” clause that deals with the null case. In the languages presented above, by contrast, it’s very easy to try getting the underlying value without a check and be trapped by a null pointer exception. In addition, for those languages the nullable type solution is still opt-out, as with the NonNull annotations.
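The difference between the two styles of access can be sketched with a hand-rolled option type in Python. The Maybe class below is an assumption made for illustration, not the API of any particular library: match stands in for the forced “if-else”, while unwrap stands in for the unchecked shortcut.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class Maybe(Generic[T]):
    """Hypothetical option type that holds either a value or nothing."""

    def __init__(self, value: Optional[T] = None) -> None:
        self._value = value

    def match(self, some: Callable[[T], R], none: Callable[[], R]) -> R:
        """Safe access: the caller must supply a handler for both cases."""
        if self._value is None:
            return none()
        return some(self._value)

    def unwrap(self) -> T:
        """Unsafe access: skips the check and fails on an empty value."""
        if self._value is None:
            raise ValueError("unwrap called on an empty Maybe")
        return self._value


present = Maybe("Alice")
absent: Maybe[str] = Maybe()

safe_present = present.match(some=str.lower, none=lambda: "<anonymous>")
safe_absent = absent.match(some=str.lower, none=lambda: "<anonymous>")

try:
    absent.unwrap().lower()  # the shortcut brings the crash back
    unsafe = "ok"
except ValueError:
    unsafe = "crash"
```

As long as unwrap exists and is the shorter thing to type, the protection depends on the programmer’s discipline rather than on the type.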

Ignoring null pointer exceptions

The programming language examples discussed above sit at one extreme: dereferencing a null pointer produces a runtime crash, and you have to opt out case by case to recover. At the other extreme is the idea of ignoring a null dereference and continuing execution. This is an opt-in solution where you have to raise an exception yourself if you need one.

For a programmer who is used to dealing with null pointer exceptions this might sound crazy, but the idea has proven itself to work well in production. The Objective-C runtime, and thus most Mac OS and iOS applications, runs this way. Several things are great about this strategy:

A program can recover from an unexpected error if higher-level code handles the null condition. In other cases an operation finishes as if nothing has happened, because later calls to null are ignored too.

Disgusting null pointer exception errors are not thrown at the user.

The app stays alive and might continue doing other useful stuff.

Null-checks can be avoided, producing more compact and readable code that is still correct.
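The strategy can be imitated in Python with a null object that swallows every call. This is only a rough sketch loosely modelled on the behaviour described above, not a reproduction of any real runtime; the Nil class and the or_nil helper are hypothetical names.

```python
class Nil:
    """Hypothetical null object: every method call is ignored and yields nil."""

    def __getattr__(self, name):
        # Called only for attributes that do not exist, which here is all
        # of them. Return a callable that ignores its arguments.
        return lambda *args, **kwargs: self

    def __bool__(self):
        # Lets higher-level code test for the null condition with 'if'.
        return False


nil = Nil()


def or_nil(value):
    """Replace None with the call-swallowing nil object."""
    return nil if value is None else value


name = or_nil(None)
result = name.lower().strip()  # both calls are ignored, no exception

if not result:
    # Opt-in handling: the caller decides whether a null matters here.
    status = "no name available"
else:
    status = result
```

The chain finishes quietly, and the decision about whether a missing name is an error is left to the code that actually cares.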

More examples

Objective-C

Objective-C doesn’t suffer from null pointer exceptions because it simply ignores a null dereference when it happens. For example:

NSString *name = nil;
[[name lowercaseString] characterAtIndex:0];

This chain first calls the method lowercaseString and then calls the method characterAtIndex on the return value. Since “name” is nil, the first call is ignored and produces nil. Then the second call is ignored as well.

Objective-C is not an angel. Firstly, it is built on the C language foundation, which abounds in null pointer dereferences. Secondly, collections like NSArray and NSDictionary can’t contain nil values; a nil that ends up there by accident produces an “Attempt to insert nil object” exception, which is pretty close to a null pointer exception:

NSString *name = nil;
NSArray *names = @[ name ]; // crash

Go

The Go language takes the approach of favouring value types (like structs) over pointer types. When value types are used, a null pointer exception is not possible, for example:

var moment time.Time
moment.Day()

The moment variable is never explicitly assigned, but it automatically gets a default (zero) value, so it’s possible to call a method on it. Unfortunately nothing checks whether this was intentional or not.

The Go language is still vulnerable to null pointer exception errors when using pointers. Unfortunately even simple API functions are unsafe:

var timer *time.Timer = nil
timer.Reset(10) // crash

var timer2 time.Timer
timer2.Reset(15) // crash

The first example is a true null pointer exception. The second example is able to call into the Reset function, but crashes on a check inside it, saying that the timer2 variable value is not initialized:

Swift

Just like Go, Swift favours value types. It is built on top of the Objective-C foundation, and the majority of APIs return nullable option types. Swift offers a safe, idiomatic way to cast them with an if let statement that forces an explicit check. In addition, the “?.” operator provides a shortcut for when you want to ignore the null case in the spirit of Objective-C:

let name : String? = nil
name?.lowercaseString.endIndex // ignored

On the downside, Swift has an exclamation mark operator “!” and a similar “!.” operator, which ruin the deal:

let name : String? = nil
name!.lowercaseString.endIndex // crash

This operator produces the very null pointer exception that Swift tries so hard to avoid:

Conclusion

All modern, popular, widely used programming languages are vulnerable to null pointer exception errors: C++, Java, JavaScript, C#, Python, Ruby, PHP, Perl, Go, Objective-C and Swift. Some languages try harder to avoid them than others, in particular Go, Objective-C and Swift. This is “null pointer exception hell”, a reality that programmers will have to deal with for years to come.

Bonus: exercise

Could you break SQL?

An error should not be detected until execution time and should be related to NULL or undefined values.

Try to do it without using stored procedures.

Any popular SQL implementation is fine.

Cover image by mikalsl under CC BY.