Peter Czibik Senior Engineer at RisingStack

There are times when the performance of JavaScript is not enough, so you have to depend more on native Node.js modules.

While native extensions are definitely not a beginner topic, I'd recommend this article for every Node.js developer to get a bit of knowledge on how they work.

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

Common use cases of native Node.js Modules

Knowledge of native modules comes in handy when you add a native extension as a dependency, which you may well have done already!

Just take a look at the list of a few popular modules using native extensions. You’re using at least one of them, right?

There are a few reasons why one would consider writing native Node.js modules. These include, but are not limited to:

Performance-critical applications: Let's be honest, Node.js is great for doing asynchronous I/O operations, but when it comes to real number-crunching, it's not that great of a choice.

Hooking into lower level (e.g.: operating-system) APIs

Creating a bridge between C or C++ libraries and Node.js

What are native modules?

Node.js Addons are dynamically-linked shared objects, written in C or C++, that can be loaded into Node.js using the require() function, and used just as if they were an ordinary Node.js module. - From the Node.js documentation

This means that (if done right) the quirks of C/C++ can be hidden from the module’s consumer. What they will see instead is that your module is a Node.js module - just like if you had written it in JavaScript.

As we've learned from previous blog posts, Node.js runs on the V8 JavaScript engine, which is a C++ program in its own right. We can write code that interacts directly with this C++ program in its own language, which is great because we can avoid a lot of expensive serialization and communication overhead.

Also, in a previous blog post we learned about the cost of the Node.js garbage collector. Garbage collection can be avoided for the memory you manage yourself (C and C++ have no built-in GC), but doing so makes it much easier to introduce memory issues.

Writing native extensions requires knowledge of one or more of the following topics:

All of those have excellent documentation. If you're getting into this field, I'd recommend reading them.

Without further ado, let's begin:

Prerequisites

Linux:

python (v2.7 was recommended and v3.x.x was not supported when this was written; current node-gyp releases require Python 3)

make

A proper C/C++ compiler toolchain, like GCC

Mac:

Xcode installed: make sure you not only install it, but you start it at least once and accept its terms and conditions - otherwise it won't work!

Windows:

Run cmd.exe as administrator and type npm install --global --production windows-build-tools - which will install everything for you.

OR

Install Visual Studio (it has all the C/C++ build tools preconfigured)

OR

Use the Windows Subsystem for Linux provided by recent Windows builds, and follow the Linux instructions above.

Creating our native Node.js extension

Let's create our first file for the native extension. We can use either the .cc extension (sometimes read as "C with classes") or the .cpp extension; both are common for C++ source files. The Google C++ Style Guide recommends .cc, so I'm going to stick with it.

First, let’s see the file in whole, and after that, I’m going to explain it to you line-by-line!

#include <node.h>

const int maxValue = 10;
int numberOfCalls = 0;

void WhoAmI(const v8::FunctionCallbackInfo<v8::Value>& args) {
  v8::Isolate* isolate = args.GetIsolate();
  auto message = v8::String::NewFromUtf8(isolate, "I'm a Node Hero!");
  args.GetReturnValue().Set(message);
}

void Increment(const v8::FunctionCallbackInfo<v8::Value>& args) {
  v8::Isolate* isolate = args.GetIsolate();

  if (!args[0]->IsNumber()) {
    isolate->ThrowException(v8::Exception::TypeError(
        v8::String::NewFromUtf8(isolate, "Argument must be a number")));
    return;
  }

  double argsValue = args[0]->NumberValue();
  if (numberOfCalls + argsValue > maxValue) {
    isolate->ThrowException(v8::Exception::Error(
        v8::String::NewFromUtf8(isolate, "Counter went through the roof!")));
    return;
  }

  numberOfCalls += argsValue;

  auto currentNumberOfCalls =
      v8::Number::New(isolate, static_cast<double>(numberOfCalls));

  args.GetReturnValue().Set(currentNumberOfCalls);
}

void Initialize(v8::Local<v8::Object> exports) {
  NODE_SET_METHOD(exports, "whoami", WhoAmI);
  NODE_SET_METHOD(exports, "increment", Increment);
}

NODE_MODULE(module_name, Initialize)

Now let's go through the file line-by-line!

#include <node.h>

#include in C++ is like require() in JavaScript. It pulls in everything from the given file, but instead of linking directly to the source, C++ has the concept of header files.

We can declare the exact interface in a header file without any implementation, and then use the implementation through that header. The C++ linker takes care of linking the two together. Think of a header as a documentation file that describes a library's contents and can be reused from your code.

void WhoAmI(const v8::FunctionCallbackInfo<v8::Value>& args) {
  v8::Isolate* isolate = args.GetIsolate();
  auto message = v8::String::NewFromUtf8(isolate, "I'm a Node Hero!");
  args.GetReturnValue().Set(message);
}

Because this is going to be a native extension, the v8 namespace is available to use. Note the v8:: notation, which is used to access v8's interface. If you don't want to write v8:: before every type that v8 provides, you can add using namespace v8; to the top of the file. Then you can omit all the v8:: namespace specifiers from your types, but this can introduce name collisions in the code, so be careful with it. To be 100% clear, I'm going to use the v8:: notation for all of the v8 types in my code.

In our example code, we have access to the arguments the function was called with (from JavaScript), via the args object that also provides us with all of the call related information.

With v8::Isolate* we get a reference to the V8 instance that runs our function, which in practice gives us access to the current JavaScript scope. Scopes work just like in JavaScript: we can assign variables and tie them to the lifetime of that specific code. We don't have to worry about deallocating these chunks of memory, because we allocate them as we would in JavaScript, and the garbage collector will automatically take care of them.

function () { var a = 1; } // SCOPE

Via args.GetReturnValue() we get access to the return value of our function. We can set it to anything we'd like, as long as it comes from the v8:: namespace.

C++ has built-in types for storing integers and strings, but JavaScript only understands its own v8:: type objects. As long as we stay in the C++ world, we are free to use the types built into C++, but when we're dealing with JavaScript objects and interoperability with JavaScript code, we have to transform C++ types into ones that the JavaScript context understands. These are the types exposed in the v8:: namespace, like v8::String or v8::Object.

Let’s look at the second method in our file that increments a counter by a provided argument until an upper cap of 10.

This function also accepts a parameter from JavaScript. When you’re accepting parameters from JavaScript, you have to be careful because they are loosely typed objects. (You’re probably already used to this in JavaScript.)

The arguments array contains v8::Value handles, so each entry can hold any JavaScript value. Be careful with these, because in this context we can never be sure what they might contain. We have to explicitly check the types of these values. Fortunately, there are helper methods on these classes to determine their type before typecasting.

To maintain compatibility with existing JavaScript code, we have to throw an error if an argument's type is wrong. To throw a type error, we have to create an Error object with the v8::Exception::TypeError() constructor. The following block will throw a TypeError if the first argument is not a number.

if (!args[0]->IsNumber()) {
  isolate->ThrowException(v8::Exception::TypeError(
      v8::String::NewFromUtf8(isolate, "Argument must be a number")));
  return;
}

In JavaScript that snippet would look like:

if (typeof arguments[0] !== 'number') {
  throw new TypeError('Argument must be a number')
}
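To get a feel for which values such a guard lets through, here is a small sketch. The helper name passesNumberCheck is made up for this illustration, and I'm assuming it behaves like the typeof guard above (which, as far as I know, matches IsNumber() for primitive values). Note that NaN counts as a number, while a numeric string or a Number wrapper object does not.

```javascript
// Hypothetical helper mirroring the guard: accepts only number primitives.
function passesNumberCheck(value) {
  return typeof value === 'number';
}

// A mix of values a loosely typed caller might pass in.
const samples = [42, 1.5, NaN, '42', null, undefined, [1], new Number(42)];

for (const value of samples) {
  console.log(String(value), typeof value, passesNumberCheck(value));
}
```

If NaN or Infinity would be a problem for your addon, you'd need an extra check on top of the type check.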

We also have to handle the case where our counter goes out of bounds. We can create a custom exception just like we would in JavaScript: new Error('error message'). In C++ with the v8 API it looks like v8::Exception::Error(v8::String::NewFromUtf8(isolate, "Counter went through the roof!")), where isolate is the current scope, which we first have to get a reference to via v8::Isolate* isolate = args.GetIsolate();.

double argsValue = args[0]->NumberValue();
if (numberOfCalls + argsValue > maxValue) {
  isolate->ThrowException(v8::Exception::Error(
      v8::String::NewFromUtf8(isolate, "Counter went through the roof!")));
  return;
}

After handling everything that could go wrong, we add the argument to the counter variable that's available in our C++ scope. That line looks just like JavaScript code. To return the new value to JavaScript, we have to convert the C++ integer into a v8::Number that JavaScript can access: we cast our integer to double with static_cast<double>() and pass the result to v8::Number::New().

auto currentNumberOfCalls = v8::Number::New(isolate, static_cast<double>(numberOfCalls));
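If it helps to see the whole function in familiar terms, here is a rough JavaScript model of what Increment does. This is only a sketch: createCounter and its closure are my own invention for illustration, not part of the addon, and Math.trunc is my stand-in for the way adding a double to the C++ int counter drops the fractional part (for finite values that fit in an int).

```javascript
// Hypothetical JavaScript model of the C++ Increment function.
function createCounter(maxValue = 10) {
  let numberOfCalls = 0; // plays the role of the C++ global `int numberOfCalls`

  return function increment(value) {
    if (typeof value !== 'number') {
      throw new TypeError('Argument must be a number');
    }
    if (numberOfCalls + value > maxValue) {
      throw new Error('Counter went through the roof!');
    }
    // The C++ counter is an int, so `numberOfCalls += argsValue` discards
    // any fractional part; Math.trunc imitates that conversion here.
    numberOfCalls = Math.trunc(numberOfCalls + value);
    return numberOfCalls;
  };
}

const increment = createCounter();
console.log(increment(3));   // 3
console.log(increment(1.5)); // 4, because 4.5 is truncated
```

The truncation is easy to miss when reading the C++ version, so it's worth keeping in mind if you ever pass fractional values.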

NODE_SET_METHOD is a macro that we use to assign a method on the exports object. This is the very same exports object that we're used to in JavaScript. That is the equivalent of:

exports.whoami = WhoAmI

In fact, all Node.js addons must export an initialization function following this pattern:

void Initialize(v8::Local<v8::Object> exports);
NODE_MODULE(module_name, Initialize)

All C++ modules have to register themselves into the node module system. Without these lines, you won’t be able to access your module from JavaScript. If you accidentally forget to register your module, it will still compile, but when you’re trying to access it from JavaScript you’ll get the following exception:

module.js:597
  return process.dlopen(module, path._makeLong(filename));
                 ^

Error: Module did not self-register.

From now on when you see this error you’ll know what to do.
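If you'd like your JavaScript code to react to this situation more gracefully, one option is to wrap the require call. The sketch below is just one way to do it: loadAddon is a hypothetical helper of mine, and returning null for an addon that hasn't been built yet is a design choice, not a convention.

```javascript
// Hypothetical helper: load a native addon, or explain why loading failed.
function loadAddon(addonPath) {
  try {
    return require(addonPath);
  } catch (err) {
    // A missing NODE_MODULE(...) registration surfaces with this message.
    if (/did not self-register/.test(err.message)) {
      throw new Error(`${addonPath} was compiled without a NODE_MODULE registration`);
    }
    // MODULE_NOT_FOUND usually means the addon has not been compiled yet.
    if (err.code === 'MODULE_NOT_FOUND') {
      return null;
    }
    throw err;
  }
}

// Usage (assumes the build output path used later in this post):
// const myAddon = loadAddon('./build/Release/addon');
```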

Compiling our native Node.js module

Now we have a skeleton of a C++ Node.js module ready, so let's compile it! The build tool we have to use is called node-gyp, and it comes bundled with npm. All we have to do is add a binding.gyp file, which looks like this:

{
  "targets": [
    {
      "target_name": "addon",
      "sources": [ "example.cc" ]
    }
  ]
}

npm install will take care of the rest. You can also use node-gyp on its own by installing it globally on your system with npm install node-gyp -g.

Now that we have the C++ part ready, the only thing remaining is to get it working from within our Node.js code. Calling these addons is seamless thanks to node-gyp. It's just a require away.

const myAddon = require('./build/Release/addon')
console.log(myAddon.whoami())

This approach works, but it can get a little bit tedious to specify paths every time, and we all know that relative paths are just hard to work with. There is a module to help us deal with this problem.

The bindings module is built to make require even less work for us. First, let's install the bindings module with npm install bindings --save, then make a small adjustment to the code snippet above. We can require the bindings module, and it will expose the .node native extension that we've specified as the target_name in the binding.gyp file.

const myAddon = require('bindings')('addon')
console.log(myAddon.whoami())

These two ways of loading the binding are equivalent.
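To give an idea of what such a lookup helper does for us, here is a heavily simplified sketch. It is not the real implementation of the bindings module: findAddon is a hypothetical function, and the two build directories it tries are an assumption on my part (the real module checks more locations).

```javascript
const fs = require('fs');
const path = require('path');

// Simplified, hypothetical stand-in for a lookup helper such as `bindings`:
// try a few build directories and return the first addon file that exists.
function findAddon(rootDir, name, buildDirs = ['build/Release', 'build/Debug']) {
  for (const dir of buildDirs) {
    const candidate = path.join(rootDir, dir, `${name}.node`);
    if (fs.existsSync(candidate)) {
      return candidate;
    }
  }
  throw new Error(`Could not locate ${name}.node under ${rootDir}`);
}

// Usage (assumes the addon has already been built in this project):
// const myAddon = require(findAddon(__dirname, 'addon'));
```

The benefit is the same as with bindings: the calling code names the addon once and doesn't care whether a release or debug build is on disk.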

This is how you create native bindings to Node.js and bridge it to JavaScript code. But there is one small problem: Node.js is constantly evolving, and the interface just tends to break a lot! This means that targeting a specific version might not be a good idea because your addon will go out of date fast.

Think ahead and use Native Abstractions for Node.js (NaN).

The NaN library started out as a third party module written by independent individuals, but from late 2015 it became an incubated project of the Node.js foundation.

NaN provides us a layer of abstraction on top of the Node.js API and creates a common interface on top of all versions. It’s considered a best practice to use NaN instead of the native Node.js interface, so you can always stay ahead of the curve.

To use NaN, we have to rewrite parts of our application, but first, let's install it with npm install nan --save. Next, we have to add the include_dirs entry shown below to the target in our binding.gyp. This makes it possible to include the NaN header file in our program and use NaN's functions.

{
  "targets": [
    {
      "include_dirs": [
        "<!(node -e \"require('nan')\")"
      ],
      "target_name": "addon",
      "sources": [ "example.cc" ]
    }
  ]
}

We can replace some of the v8’s types with NaN’s abstractions in our sample application. It provides us helper methods on the call arguments and makes working with v8 types a much better experience.

The first thing you'll probably notice is that we no longer need explicit access to the JavaScript scope via v8::Isolate* isolate = args.GetIsolate(); because NaN handles that for us automatically. Its types hide the bindings to the current scope, so we don't have to bother with them.

#include <nan.h>

const int maxValue = 10;
int numberOfCalls = 0;

void WhoAmI(const Nan::FunctionCallbackInfo<v8::Value>& args) {
  auto message = Nan::New<v8::String>("I'm a Node Hero!").ToLocalChecked();
  args.GetReturnValue().Set(message);
}

void Increment(const Nan::FunctionCallbackInfo<v8::Value>& args) {
  if (!args[0]->IsNumber()) {
    // Keep throwing a TypeError, as the plain v8 version does
    Nan::ThrowTypeError("Argument must be a number");
    return;
  }

  double argsValue = args[0]->NumberValue();
  if (numberOfCalls + argsValue > maxValue) {
    Nan::ThrowError("Counter went through the roof!");
    return;
  }

  numberOfCalls += argsValue;

  auto currentNumberOfCalls = Nan::New<v8::Number>(numberOfCalls);

  args.GetReturnValue().Set(currentNumberOfCalls);
}

void Initialize(v8::Local<v8::Object> exports) {
  exports->Set(Nan::New("whoami").ToLocalChecked(),
      Nan::New<v8::FunctionTemplate>(WhoAmI)->GetFunction());
  exports->Set(Nan::New("increment").ToLocalChecked(),
      Nan::New<v8::FunctionTemplate>(Increment)->GetFunction());
}

NODE_MODULE(addon, Initialize)

Now we have a working and also idiomatic example of what a Node.js native extension should look like.

First, we’ve learned about structuring the code, then about compilation processes, then went through the code itself line by line to understand every small piece of it. At the end, we looked at NaN’s provided abstractions over the v8 API.

There is one more small tweak we can make, and that is to use the provided macros of NaN.

Macros are snippets of code that the preprocessor expands before the code is compiled. More on macros can be found in this documentation. We have already been using one of these macros, NODE_MODULE, but NaN has a few others that we can use as well. These macros will save us a bit of time when creating our native extensions.

#include <nan.h>

const int maxValue = 10;
int numberOfCalls = 0;

NAN_METHOD(WhoAmI) {
  auto message = Nan::New<v8::String>("I'm a Node Hero!").ToLocalChecked();
  info.GetReturnValue().Set(message);
}

NAN_METHOD(Increment) {
  if (!info[0]->IsNumber()) {
    // Keep throwing a TypeError, as the plain v8 version does
    Nan::ThrowTypeError("Argument must be a number");
    return;
  }

  double infoValue = info[0]->NumberValue();
  if (numberOfCalls + infoValue > maxValue) {
    Nan::ThrowError("Counter went through the roof!");
    return;
  }

  numberOfCalls += infoValue;

  auto currentNumberOfCalls = Nan::New<v8::Number>(numberOfCalls);

  info.GetReturnValue().Set(currentNumberOfCalls);
}

NAN_MODULE_INIT(Initialize) {
  NAN_EXPORT(target, WhoAmI);
  NAN_EXPORT(target, Increment);
}

NODE_MODULE(addon, Initialize)

The first macro, NAN_METHOD, saves us the burden of typing the long method signature and inserts it for us when the macro is expanded. Take note that if you use macros, you'll have to use the naming provided by the macro itself. Instead of args, the arguments object is now called info, so we have to change that everywhere.

The next macro we used is NAN_MODULE_INIT, which provides the initialization function. Instead of exports, it names its argument target, so we have to change that one as well.

The last macro is NAN_EXPORT, which sets our module's interface. You can see that we cannot specify the object's keys in this macro; it assigns the functions under their own names.

That would look like this in modern JavaScript:

module.exports = { Increment, WhoAmI }

If you'd like to use this with our previous example, make sure you change the function names to their capitalized forms, like this:

'use strict'

const addon = require('./build/Release/addon.node')

console.log(`native addon whoami: ${addon.WhoAmI()}`)

for (let i = 0; i < 6; i++) {
  console.log(`native addon increment: ${addon.Increment(i)}`)
}
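One thing to be aware of when running a loop like this: the increments 0 + 1 + 2 + 3 + 4 bring the counter to exactly 10, so the last call with 5 goes over the cap and throws. A small wrapper can turn that into a result you can inspect. This is only a sketch: tryIncrement is a hypothetical helper, and standIn is a made-up object that imitates the addon's cap so the snippet runs without the compiled module.

```javascript
// Hypothetical wrapper: call addon.Increment and report failures as data.
function tryIncrement(addon, value) {
  try {
    return { ok: true, value: addon.Increment(value) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Made-up stand-in with the same cap of 10; pass the real addon instead.
const standIn = {
  total: 0,
  Increment(n) {
    if (this.total + n > 10) throw new Error('Counter went through the roof!');
    this.total += n;
    return this.total;
  }
};

for (let i = 0; i < 6; i++) {
  console.log(i, tryIncrement(standIn, i));
}
```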

For further documentation, refer to NaN's GitHub page.

Example Repository

I've created a repository with all the code included in this post. The repository is under Git version control and is available on GitHub via this link. Each of the steps has its own branch: master is the first example, nan is the second one, and the final step's branch is called macros.

Conclusion

I hope you had as much fun following along as I had writing about this topic. I'm not a C/C++ expert, but I've been doing Node.js long enough to be interested in writing my own super fast native addons and experimenting with a great language, namely C++.

I'd highly recommend getting into at least a bit of C/C++ to understand the lower levels of the platform itself. You'll surely find something that interests you. :)

As you see it’s not as scary as it looks at first glance, so go ahead and build something in C++, and tweet about it using @risingstack if you need help from us, or drop a comment below!

In the next part of the Node.js at Scale series, we'll take a look at Advanced Node.js Project Structuring.