
What is a memory leak?

When we talk to people about our solution for discovering memory leaks we immediately get positive feedback. But when we add Java into the equation, the initial excitement is often complemented with questions: “Are there memory leaks in Java? Isn’t Java a garbage-collected language?”

In this post I will explain why memory leaks are in fact a common problem for Java applications.


What is a “Memory leak” in Java?

Let us start by outlining the difference between memory management in Java and in a language such as C. When a C programmer needs dynamically allocated memory for a value, they have to manually allocate a region of memory where the value will reside. After the application finishes using that value, the region must be freed manually, i.e. the code freeing the memory has to be written by the developer. In Java, when a developer creates a new object, e.g. with new Integer(5), they do not have to allocate memory themselves: the Java Virtual Machine (JVM) takes care of it. During the life of the application, the JVM periodically checks which objects in memory are still being used and which are not. Unused objects can be discarded and their memory reclaimed and reused. This process is called garbage collection, and the corresponding part of the JVM is called the Garbage Collector, or GC.

Java’s automatic memory management relies on the GC, which periodically looks for unused objects and removes them. And here hides the dragon. Simplifying a bit, we can say that a memory leak in Java is a situation where some objects are no longer used by the application, but the GC fails to recognize them as unused. As a result, these objects remain in memory indefinitely, reducing the amount of memory available to the application.

Here I would like to stress one very important point: the notion of “object is not used by the application any more” is totally, absolutely, 100% application-specific! Apart from some specific cases where the lifespan of the object can be determined logically (such as a local variable that never, under any circumstances, escapes its method), object usage can be understood only by the application developer, taking into account all usage patterns of the application.

How can GC distinguish between the unused objects and the ones the application will use at some point in time in the future? The basic algorithm can be described as follows:

1. There are some objects which the GC considers “important”. These are called GC roots and are (almost) never discarded. They are, for example, the local variables and input parameters of currently executing methods, application threads, references from native code and similar “global” objects.
2. Any object referenced from those GC roots is assumed to be in use and is not discarded. One object can reference another in different ways in Java, the most common being when object A is stored in a field of object B. In that case, we say “B references A”.
3. The above process is repeated until all objects that can be transitively reached from the GC roots are visited and marked as “in use”.
4. Everything else is unused and can be thrown away.
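The marking steps above can be sketched as a toy graph traversal. This is only an illustration of reachability, not how any real JVM collector is implemented; the Node and ToyMarker classes and the mark method are hypothetical names invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object: a name plus outgoing references.
final class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

// Hypothetical toy "mark" phase; real collectors are far more involved.
final class ToyMarker {
    // Returns every node that can be transitively reached from the roots.
    static Set<Node> mark(List<Node> roots) {
        // Identity-based set: two nodes are the same only if they are the same object.
        Set<Node> inUse = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            // add() returns false for already visited nodes, so cycles terminate.
            if (inUse.add(current)) {
                pending.addAll(current.references);
            }
        }
        return inUse;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node orphan = new Node("orphan");
        root.references.add(a);   // root references a
        a.references.add(b);      // a references b
        b.references.add(a);      // a cycle does not confuse the traversal
        orphan.references.add(b); // nothing references orphan itself

        Set<Node> inUse = mark(Collections.singletonList(root));
        System.out.println("b in use: " + inUse.contains(b));
        System.out.println("orphan in use: " + inUse.contains(orphan));
    }
}
```

Note that the traversal only follows references outward from the roots. An object that points at live objects, like orphan here, is still unreachable if nothing points at it.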

Now, it is fairly easy to construct a Java program that satisfies the above definition of a memory leak:

    import java.util.HashMap;
    import java.util.Map;

    public class Calc {
        // Results are stored here but never read back.
        private Map<Integer, Integer> cache = new HashMap<>();

        public int square(int i) {
            int result = i * i;
            cache.put(i, result);
            return result;
        }

        public static void main(String[] args) throws Exception {
            Calc calc = new Calc();
            while (true) {
                System.out.println("Enter a number between 1 and 100");
                int i = readUserInput(); //not shown
                System.out.println("Answer " + calc.square(i));
            }
        }
    }

This program reads one number at a time from its user and calculates its square. The implementation uses a primitive “cache” for storing the results of the calculation. But since these results are never read from the cache, the code represents a memory leak according to our definition above: every entry is kept reachable through the map, yet the application never uses it again. If we let this program run long enough and users keep entering new numbers, the “cached” results consume more and more memory.
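One possible way to avoid this kind of leak is to actually read from the cache and to put an upper bound on its size. The sketch below is an assumption about how such a fix could look, not the only option: BoundedCalc, its maxEntries parameter and cacheSize are hypothetical names, and the eviction relies on the standard LinkedHashMap.removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical variant of Calc whose cache is read and cannot grow without bound.
final class BoundedCalc {
    private final Map<Integer, Integer> cache;

    BoundedCalc(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts in LRU order.
        this.cache = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > maxEntries;
            }
        };
    }

    int square(int i) {
        Integer cached = cache.get(i);
        if (cached != null) {
            return cached; // the cache is now actually used
        }
        int result = i * i;
        cache.put(i, result); // may evict the least recently used entry
        return result;
    }

    // Exposed only so the bound can be observed in this example.
    int cacheSize() {
        return cache.size();
    }
}
```

With new BoundedCalc(100), the map never holds more than 100 entries, no matter how many distinct numbers users enter.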


This brings us to another important aspect of memory leaks: how big should the leak be to justify the trouble of investigating and fixing it? Technically, whenever you leave an object that you don’t use anymore lying around, you create waste. Practically, a couple of kilobytes here and there don’t really constitute real problems for modern applications, especially the “enterprise” ones … But a leak is a leak, even if it’s just 200 bytes.

Which leads us to a simple corollary: a memory leak is like fine wine, it needs aging. If you want to demonstrate the leak or, more importantly, fix it, you really should let it grow. Tiny memory leaks get lost among all the objects that are present in an application at any given point in time. Regardless of the tool you use to identify memory leaks (be it a profiler, a memory dump analyzer, an APM, or a special-purpose leak finder like Plumbr), there should be a lot of objects that have outlived their usefulness. This means that your application should run for a significant period of time AND that as many different parts of it as possible should be executed. Otherwise you will be looking for a needle in a haystack.
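To get a feeling for how a leak “ages”, one can sample the heap usage reported by the JVM while the leak grows. The following is a rough sketch under stated assumptions, not how a profiler or Plumbr works: HeapSampler is a hypothetical class, the leak is simulated with a list that is only ever appended to, and the reported numbers depend on the JVM and on when garbage collection happens to run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that prints approximate heap usage while a simulated leak grows.
final class HeapSampler {
    // Approximate number of heap bytes currently in use, as reported by the JVM.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated leak: entries are added but never read or removed.
        List<int[]> leaked = new ArrayList<>();
        for (int sample = 1; sample <= 10; sample++) {
            for (int i = 0; i < 1_000; i++) {
                leaked.add(new int[256]); // roughly 1 KB per entry
            }
            System.out.printf("sample %d: %d entries retained, ~%d KB heap in use%n",
                    sample, leaked.size(), usedHeapBytes() / 1024);
            Thread.sleep(100);
        }
    }
}
```

A single reading says very little, because normal garbage makes the used-heap figure go up and down. Only a trend over many samples, taken while the application does real work, hints at retained objects.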

If you would like to know more about Java memory leaks, especially about the different ways to hunt them down and fix them in your applications, check out our series of blog posts titled “Solving OutOfMemoryError”. And stay tuned to our Twitter, @JavaPlumbr. Till next time!