Gergely Nemeth Co-Founder of RisingStack, EM at Uber

Finding a Node.js memory leak can be quite challenging - recently we had our fair share of it.

One of our client's microservices started to produce the following memory usage:

Memory usage grabbed with Trace

You may spend quite a few days on things like this: profiling the application and looking for the root cause. In this post, I would like to summarize what tools you can use and how, so you can learn from it.

UPDATE: This article mentions Trace, RisingStack’s Node.js Monitoring platform several times. On 2017 October, Trace has been merged with Keymetrics’s APM solution. Click here to give it a try!

The TL;DR version

In our particular case the service was running on a small instance, with only 512MB of memory. As it turned out, the application didn't leak any memory, simply the GC didn't start collecting unreferenced objects.

Why was it happening? As a default, Node.js will try to use about 1.5GBs of memory, which has to be capped when running on systems with less memory. This is the expected behaviour as garbage collection is a very costly operation.

The solution for it was adding an extra parameter to the Node.js process:

node --max_old_space_size=400 server.js --production

Still, if it is not this obvious, what are your options to find memory leaks?

Understanding V8's Memory Handling

Before diving into the technics that you can employ to find and fix memory leaks in Node.js applications, let's take a look at how memory is handled in V8.

Definitions

resident set size : is the portion of memory occupied by a process that is held in the RAM, this contains: the code itself the stack the heap

: is the portion of memory occupied by a process that is held in the RAM, this contains: stack : contains primitive types and references to objects

: contains primitive types and references to objects heap : stores reference types, like objects, strings or closures

: stores reference types, like objects, strings or closures shallow size of an object : the size of memory that is held by the object itself

: the size of memory that is held by the object itself retained size of an object: the size of the memory that is freed up once the object is deleted along with its' dependent objects

How The Garbage Collector Works

Garbage collection is the process of reclaiming the memory occupied by objects that are no longer in use by the application. Usually, memory allocation is cheap while it's expensive to collect when the memory pool is exhausted.

An object is a candidate for garbage collection when it is unreachable from the root node, so not referenced by the root object or any other active objects. Root objects can be global objects, DOM elements or local variables.

The heap has two main segments, the New Space and the Old Space. The New Space is where new allocations are happening; it is fast to collect garbage here and has a size of ~1-8MBs. Objects living in the New Space are called Young Generation. The Old Space where the objects that survived the collector in the New Space are promoted into - they are called the Old Generation. Allocation in the Old Space is fast, however collection is expensive so it is infrequently performed .

Why is garbage collection expensive? The V8 JavaScript engine employs a stop-the-world garbage collector mechanism. In practice, it means that the program stops execution while garbage collection is in progress.

Usually, ~20% of the Young Generation survives into the Old Generation. Collection in the Old Space will only commence once it is getting exhausted. To do so the V8 engine uses two different collection algorithms:

Scavenge collection, which is fast and runs on the Young Generation,

Mark-Sweep collection, which is slower and runs on the Old Generation.

For more information on how this works check out the A tour of V8: Garbage Collection article. For more information on general memory management, visit the Memory Management Reference.

The heapdump module

With the heapdump module, you can create a heap snapshot for later inspection. Adding it to your project is as easy as:

npm install heapdump --save

Then in your entry point just add:

var heapdump = require('heapdump');

Once you are done with it, you can start collecting heapdump with either using the $ kill -USR2 <pid> command or by calling:

heapdump.writeSnapshot(function(err, filename) { console.log('dump written to', filename); });

Once you have your snapshots, it's time to make sense of them. Make sure you capture multiple of them with some time difference so you can compare them.

First you have to load your memory snapshots into the Chrome profiler. To do so, open up Chrome DevTools, go to profiles and Load your heap snapshots.

Once you loaded them it should be something like this:

So far so good, but what can be seen exactly in this screenshot?

One of the most important things here to notice is the selected view: Comparison. This mode enables you to compare two (or more) heap snapshots taken at different times, so you can pinpoint exactly what objects were allocated and not freed up in the meantime.

The other important tab is the Retainers. It shows exactly why an object cannot be garbage collected, what is holding a reference to it. In this case the global variable called log is holding a reference to the object itself, preventing the garbage collector to free up space.

mdb

The mdb utility is an extensible utility for low-level debugging and editing of the live operating system, operating system crash dumps, user processes, user process core dumps, and object files.

gcore

Generate a core dump of a running program with process ID pid.

Putting it together

To investigate dumps, first we have to create one. You can easily do so with:

gcore `pgrep node`

After you have it, you can search for the all the JS Objects on the heap using:

> ::findjsobjects

Of course, you have to take successive core dumps so that you can compare different dumps.

Once you identified objects that look suspicious, you can analyze them using:

object_id::jsprint

Now all you have to do is find the retainer of the object (the root).

object_id::findjsobjects -r

This command will return with id of the retainer. Then you can use ::jsprint again to analyze the retainer.

For a detailed version check out Yunong Xiao's talk from Netflix on how to use it:

Recommended Reading

UPDATE: Read the story of how we found a memory leak in our blogging platform by comparing heapshots with Trace and Chrome's DevTools.