April 22, 2019











Node.js and JavaScript are often associated with running just one task at a time, but this is not precisely the case. In this article, we go through the concept of multiple processes and learn how to spawn them using the child process module. Along the way, we find out how Node.js works under the hood in terms of concurrency, and answer the question of whether Node.js is single-threaded.

Is Node.js single-threaded?

The interesting thing is that even though Node.js is single-threaded by nature, most of the input/output operations run in separate threads. Don’t take my word for it – let’s check it!

To do it, we start by creating a big file. Since I run Linux, I use the dd command to copy the null character from the /dev/zero file:

```bash
dd if=/dev/zero of=file.txt count=400 bs=1048576
```

With the above command, we create 400 blocks of 1048576 bytes: 400*1048576 gives us 400MB of data.

```typescript
import { readFile } from 'fs';

readFile('./file.txt', () => {
  process.exit();
});

Reading this file takes my laptop a noticeable fraction of a second. Since it is quite an absorbing job, some might ask if the rest of our code has to wait for it.

The answer is: not quite!

```typescript
import { readFile } from 'fs';

console.time('reading file');
readFile('./file.txt', () => {
  console.timeEnd('reading file');
  process.exit();
});

let i = 0;
setInterval(() => {
  console.log(i++);
}, 10);
```

We dive deeper into the setInterval function, as well as setTimeout, setImmediate, and process.nextTick, in the previous part of the series: The Event Loop in Node.js

A few peculiar things happen in the code above. Even though we keep our process busy with reading the file, we also print a number every ten milliseconds. Let’s run it!

```
1
2
…
87
88
reading file: 907.472ms
```

If we take a look at the output, we can see that even though we are in the process of reading a file, the event loop still works and our callback is periodically invoked throughout the whole time.

The reading of the file is not interrupted to print out the numbers – it runs in a separate, parallel thread. The conclusion is that even though Node.js is single-threaded by nature, some actions take place in separate threads.

The library that handles asynchronous I/O in Node.js is libuv, and it has a defined set of threads called the Worker Pool. The default number of threads is 4, but you can change it by setting the UV_THREADPOOL_SIZE environment variable.
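As a quick sketch of my own (not from the text above), we can observe the Worker Pool limit by queuing more CPU-heavy crypto.pbkdf2 calls than there are threads: with the default pool of 4, the first four jobs tend to finish at roughly the same time, and the fifth one only starts once a thread frees up.

```typescript
import { pbkdf2 } from 'crypto';

const start = Date.now();

// pbkdf2 is one of the operations that libuv offloads to the Worker Pool.
// We queue five jobs – one more than the default UV_THREADPOOL_SIZE of 4.
for (let i = 1; i <= 5; i++) {
  pbkdf2('password', 'salt', 100000, 64, 'sha512', () => {
    console.log(`job ${i} done after ${Date.now() - start}ms`);
  });
}
```

With the default pool size, the logged timings typically show four jobs completing close together and the fifth noticeably later.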

Node.js has recently taken steps toward tools for multithreaded development called Worker Threads, but we will cover them in the upcoming parts of the series.

Child processes

Even if we assume that a Node.js process is single-threaded, we can still create more than one process. To make a new one, we can use the child process module: we can control its input stream and listen to its output stream.

The first function from the child process module that we cover here is called spawn. With it, we can execute any operating system command in a separate process.

To list all the contents of a directory in a bash shell (for example on Linux or Mac) you can use the ls command. To do a similar thing in the Command Prompt on Windows, use dir.

```typescript
import { spawn } from 'child_process';

const child = spawn('ls');
```

Standard input/output streams of a child process

By calling spawn, we create an instance of ChildProcess. Some of the things we get access to are the standard input/output streams: stdin, stdout, and stderr.

If you want to know more about stdio streams check out the 5th part of this series: Writable streams, pipes and the process streams

To read upcoming data, we can use the fact that all streams are event emitters and listen for the “data” event.

```typescript
import { spawn } from 'child_process';

const child = spawn('ls');

child.stdout.on('data', (data) => {
  console.log(data.toString());
});
```

```
file.txt
main.ts
node_modules
package.json
package-lock.json
tsconfig.json
```

For more information on how to read data from a stream, check out the 4th part of the series: Paused and flowing modes of a readable stream

To pass additional arguments to the command, we can give them as an array to the spawn function.

```typescript
import { spawn } from 'child_process';

const child = spawn('ls', ['./node_modules']);
```

Let’s say we want to count all the characters in the file that we created in the previous paragraph: it weighs 400MB, so it is quite a demanding task. To perform it, we run the wc -c command in a separate process and pipe the file into its stdin stream.

```typescript
import { spawn } from 'child_process';
import { createReadStream } from 'fs';

const readableStream = createReadStream('./file.txt');
const wc = spawn('wc', ['-c']);

readableStream.pipe(wc.stdin);

wc.stdout.on('data', (data) => {
  console.log(`Number of characters: ${data}`);
});
```

Number of characters: 419430400

exec() and execFile()

Besides spawn, the child process module also has the exec function. It serves a similar purpose but has some major differences. The first one is that it creates a shell, so you can type in a full command that you want to execute.

```typescript
import { spawn, exec } from 'child_process';

spawn('ls | grep .txt'); // throws an error

exec('ls | grep .txt', (error, response) => {
  console.log(response);
});
```

In the code above, you can see an attempt to pass ls | grep .txt to both spawn and exec, but the first one fails. This is because spawn does not give us a fully functional shell, so we can’t pipe the result of ls to grep. The second major difference that you can spot is that exec works with callbacks instead of streams.

If you want to use the shell but still work with streams, you can pass the shell: true option to the spawn function.

```typescript
import { spawn } from 'child_process';

const child = spawn('ls | grep .txt', { shell: true });

child.stdout.on('data', (result) => {
  console.log('Result:', result.toString());
});
```

Result: file.txt

The execFile function acts a bit like a combination of both spawn and exec: it does not create a shell but operates with callbacks instead of streams:

```typescript
import { execFile } from 'child_process';

execFile('ls', (error, result) => {
  console.log(result);
});
```

Don’t get deceived by its name: spawn and exec can execute files too. Just make sure that the file you want to run has the permission to execute. On Linux, you can grant it by using the chmod command:

```bash
chmod +x ./script.sh
```

```typescript
import { spawn, exec, execFile } from 'child_process';

spawn('./script.sh');
exec('./script.sh');
execFile('./execFile.sh');
```

Keep in mind that not using the shell is a bit better in terms of performance.

Child process events and communication

Aside from providing us with stdio streams, a child process also emits events. One of the events worth mentioning is the exit event: it is emitted when the child process ends. The disconnect event signals that the spawned process called the disconnect() function. The child process emits the error event when the process fails to spawn, cannot be killed, or when sending a message to it is unsuccessful.
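As a minimal sketch of listening to these events (my own example, assuming a Unix-like system where the ls command exists):

```typescript
import { spawn } from 'child_process';

const child = spawn('ls');

// Emitted when the child process ends; the first argument is its exit code.
child.on('exit', (code) => {
  console.log(`The child process exited with code ${code}`);
});

// Emitted e.g. when the process could not be spawned at all.
child.on('error', (error) => {
  console.error('Something went wrong:', error.message);
});
```

Since ls exists here, the error listener stays silent and the exit listener reports code 0 on success.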

Above, we mention sending a message to the process. This is possible because, besides the spawn function, we also have the fork function, which spawns new Node.js processes.

main.ts

```typescript
import { fork } from 'child_process';

const child = fork('./child.ts');
```

child.ts

```typescript
console.log('Hello world!');
```

After running the code above, we see Hello world! in the console. An important observation here is that the forked process inherits the process.execArgv property of its parent: if you use TypeScript, it contains the path to the ts-node executable. As a result, the forked processes run with TypeScript too.

main.ts

```typescript
import { fork } from 'child_process';

const child = fork('./child.ts');

console.log(process.execArgv);
```

child.ts

```typescript
console.log(process.execArgv);
```

```
[ '/home/marcin/Documents/projects/node-playground/node_modules/ts-node/dist/bin.js' ]
[ '/home/marcin/Documents/projects/node-playground/node_modules/ts-node/dist/bin.js' ]
```

To communicate with it, we can take advantage of the message event. The process emits it when we use the process.send() function to send a message.

Let’s say that we want to calculate the factorial of a number recursively. Depending on how big the number is, it might need some computing power.

child.ts

```typescript
function factorial(n: number): number {
  if (n === 1 || n === 0) {
    return 1;
  }
  return factorial(n - 1) * n;
}

process.on('message', (n: number) => {
  process.send(factorial(n));
  process.disconnect();
});
```

main.ts

```typescript
import { fork } from 'child_process';

const child = fork('./child.ts');

child.send(20);

child.on('message', (message: number) => {
  console.log('Result: ', message);
});
```

Result: 2432902008176640000

In the code above, you can see that as soon as we send back the result in the child process, we call process.disconnect(). Thanks to that, the main process can stop listening for events and come to an end. Otherwise, our primary process would endlessly wait for messages from the child.

We can make the code a bit more readable and create a function returning a promise.

main.ts

```typescript
import { fork } from 'child_process';

factorial(20)
  .then((result) => {
    console.log('Result: ', result);
  })
  .catch(() => {
    console.log('An error occurred');
  });

function factorial(n: number) {
  return new Promise((resolve, reject) => {
    const child = fork('./child.ts');
    child.send(n);
    child.on('message', (result: number) => {
      resolve(result);
    });
    child.on('error', () => {
      reject();
    });
  });
}
```

Thanks to handling the extensive task in a separate process, our event loop is not blocked.

Summary

Multiple processes can work in parallel, but this is not yet multithreaded programming in the conventional sense. While processes can communicate with each other, they can’t share the data that they work on: to do that, we need to use a new feature of Node.js: worker threads. We will explore this concept in the upcoming parts of the series. Meanwhile, I believe that knowing how child processes work can change your approach to programming in Node.js a bit.