The How and the Why of Default Values

Experienced programmers have had it beaten into their heads to always initialize their variables, a habit that matters most to those who have used C or an equivalent systems-level language. In designing a compiler, one must be aware of the performance trade-off that a default value presents: if there are n uninitialized local variables, setting them all to a default value costs roughly n extra store instructions. For systems languages like C, this decision is easy: it’s far better to require the user to explicitly set the memory to what they want than to spend time storing a value that may never be used.

In Java, local variables are stored in what is called a stack frame, which for simplicity’s sake we will say looks roughly like this:

In this example, foo and bar each have their own stack frame. Bar’s return address is the instruction address at which foo called bar, and foo and bar each store their own local variables in the “local variables” portion of their frame.

To create this stack frame in our simplified model, you must obtain a sufficient amount of memory (ultimately from the operating system), and then begin filling in the data that you know: the return address, the current frame pointer, and so on. Each variable lives at some fixed position in the “local variables” section, and the bytes at that position hold the variable’s actual value.

When you request memory from the operating system, the block you get back may have been used before (for a stack, typically by an earlier function call), and no attempt has been made to clean it up. That is, if the last code to use that memory wrote the value 5 to the first byte, it will still hold the value 5 when you get it. This is where we run into issues with default values: the space where “local variables” go is occupied by whatever used to be in that memory. For the compiler to set each variable to a default value, it would need to emit code that walks the local-variable portion of every new stack frame and sets its bytes to zero.

In Java, arrays and Objects are both stored in variables as references (that is, the addresses at which their actual data lives). The data itself sits in a separately allocated block of the appropriate size, which is overwritten with zeros before you ever see it. In other words, it’s not up to the constructor to zero out the data in an Object; it’s done for you as soon as the memory is allocated. Why don’t we treat local variables the same way that we treat Objects and arrays, then?
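To make the zeroing concrete, here is a minimal sketch. The class and method names (DefaultValuesDemo, Point, and so on) are my own illustrations, not anything from a library. It never assigns the fields or the array elements, and relies only on the defaults Java guarantees for fields and array components.

```java
// Hypothetical demo class; all names here are illustrative assumptions.
class DefaultValuesDemo {
    // None of these fields is ever assigned explicitly.
    static class Point {
        int x;
        double y;
        boolean visited;
        String label;
    }

    // Builds a fresh Point and reports what its untouched fields contain.
    static String describeFreshPoint() {
        Point p = new Point();
        return p.x + " " + p.y + " " + p.visited + " " + p.label;
    }

    // Sums a freshly allocated int array without writing to it first.
    static int sumOfFreshArray(int length) {
        int[] values = new int[length];
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(describeFreshPoint()); // prints "0 0.0 false null"
        System.out.println(sumOfFreshArray(8));   // prints "0"
    }
}
```

Numeric fields come back as zero, booleans as false, and references as null, all without a single line in a constructor.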

I explored this design decision quite a bit, and came up with a lot of unsatisfying arguments, primarily centered around the idea that we can’t deduce when the various methods of a class will be called, and therefore can’t know whether a field will be read before it is first assigned (imagine calling a getter before a setter, or a setter before a getter). This leaves an unsatisfying taste in my mouth, though, because the same ambiguity arises within a single method.

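Suppose a method calls IO.readInt(), which reads an integer from the console and returns it, and uses the result to decide whether a local variable is written before it is read. In this case, we don’t know whether the read or the write will come first until runtime, which is exactly the situation we have with objects.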
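The situation described above can be sketched roughly as follows. This is my own illustration, not a listing from any library: the class and method names are hypothetical, and the parameter `input` stands in for whatever IO.readInt() would return so the example runs without console input. The field version compiles and quietly falls back to the zeroed default. The local version only compiles because every path assigns x before the read.

```java
// Hypothetical sketch; `input` stands in for the value IO.readInt() would return.
class ReadBeforeWriteDemo {
    static class Holder {
        private int value; // never assigned in a constructor, so it starts at 0

        int getValue() { return value; }
        void setValue(int v) { value = v; }
    }

    // Whether the setter runs before the getter depends on runtime input.
    static int fieldVersion(int input) {
        Holder h = new Holder();
        if (input > 0) {
            h.setValue(input); // write happens only on this path
        }
        return h.getValue();   // read; yields the default 0 if nothing was written
    }

    // The same shape with a local variable.
    static int localVersion(int input) {
        int x;
        if (input > 0) {
            x = input;
        } else {
            // Removing this assignment makes javac reject the method,
            // because x might not have been initialized at the return.
            x = 0;
        }
        return x;
    }
}
```

Both methods behave identically at runtime; the difference is only in who supplies the zero, the allocator or the programmer.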

The only other answer that I can think of is pure speed, but I’m not sure that’s an appropriate answer for languages like Java, either. It is important that we think about how the average programmer works with a language. In my experience reading other people’s code, I’ve noticed two things: 1) classes generally have many (read: too many) fields, and 2) people tend to enjoy creating many more objects (read: ButtonFactoryFactories, BananaFactorySingleton, the works) than they need. With this in mind, let’s consider that performance hit again: if the total number of fields across all of the objects created over the lifetime of the program is greater than or equal to the total number of local variables across every function call, you’ve already agreed to take a performance hit at least as large as the one that zeroing local variables would cost. Numerically, if there are c objects created, with an average of n fields each, and there are d function calls, with an average of m local variables each, we are comparing c*n and d*m variable slots that need to be zeroed out.
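As a back-of-the-envelope illustration of that comparison, here is a small sketch. The class name and the counts passed in main are invented purely for the arithmetic; they are not measurements of any real program. It counts variable slots rather than bytes, since the size of each slot depends on its type.

```java
// Hypothetical helper; the counts used in main are made-up numbers, not measurements.
class ZeroingCost {
    // Total slots to zero: how many things are created times their average size in slots.
    static long slotsToZero(long count, long averageSlots) {
        return count * averageSlots;
    }

    // True when zeroing object fields (c * n) costs at least as much as
    // zeroing local variables (d * m) would.
    static boolean objectsDominate(long c, long n, long d, long m) {
        return slotsToZero(c, n) >= slotsToZero(d, m);
    }

    public static void main(String[] args) {
        long c = 1_000; // objects created (assumed)
        long n = 12;    // average fields per object (assumed)
        long d = 5_000; // function calls (assumed)
        long m = 2;     // average locals per call (assumed)

        System.out.println(slotsToZero(c, n));           // prints "12000"
        System.out.println(slotsToZero(d, m));           // prints "10000"
        System.out.println(objectsDominate(c, n, d, m)); // prints "true"
    }
}
```

Which side wins depends entirely on the program; the point is only that the two costs are of the same kind and can be compared directly.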