In the last article of this series, Go Goroutines vs Node Cluster & Worker Threads — Part 1, I compared the performance of vanilla Golang (Go) and Node.js (Node) HTTP servers returning an “OK” string. In that benchmark, Go outperformed Node, handling 1.1X more requests per second.

In Part 2 of this series, as part of my evaluation criteria for migrating from Node to Go, I wanted to see by how much Go would outperform Node when performing CPU-intensive work. This is something the majority of articles claim Node is very bad at compared to Go. We shall see for ourselves.

To conduct this benchmark, we will use the same HTTP servers as in the last article. In addition, I created a function called bench, which executes a number of algorithms in the following sequence.

Execute bench and pass in 100 (bench(100)). Bench iterates 100 times, and for each iteration, the following steps are performed:

1. Calculate the Fibonacci sequence for the value 40 and put each number into a slice/array.
2. Create a new slice/array of only the prime numbers from the previous step, in reversed order.
3. Create a new slice/array that is 10 times the length of the previous step, using the values from the previous step while incrementing them.
4. Sort the result of the previous step using the bubble sort algorithm.
5. Create a new slice/array, converting all values to strings.
6. Add the last index of each slice/array to a new slice/array for each iteration and return that slice/array.
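The steps above can be sketched in JavaScript roughly as follows. This is my own condensed sketch, not the actual bench code (which is linked at the end of the article); details such as exactly how values are incremented in step 3 and which array's last index is collected in step 6 are assumptions.

```javascript
// Build the first n Fibonacci numbers iteratively.
function fibSequence(n) {
  const seq = [0, 1];
  for (let i = 2; i < n; i++) seq.push(seq[i - 1] + seq[i - 2]);
  return seq;
}

// Simple trial-division primality check.
function isPrime(n) {
  if (n < 2) return false;
  for (let i = 2; i * i <= n; i++) if (n % i === 0) return false;
  return true;
}

// Classic O(n^2) bubble sort, returning a sorted copy.
function bubbleSort(arr) {
  const a = arr.slice();
  for (let i = 0; i < a.length; i++)
    for (let j = 0; j < a.length - i - 1; j++)
      if (a[j] > a[j + 1]) [a[j], a[j + 1]] = [a[j + 1], a[j]];
  return a;
}

function bench(iterations, fibCount = 40) {
  const lasts = [];
  for (let i = 0; i < iterations; i++) {
    const fibs = fibSequence(fibCount);                  // step 1
    const primes = fibs.filter(isPrime).reverse();       // step 2
    const grown = [];                                    // step 3: 10x length,
    for (let j = 0; j < primes.length * 10; j++)         // reusing values while
      grown.push(primes[j % primes.length] + j);         // incrementing (assumed)
    const sorted = bubbleSort(grown);                    // step 4
    const strings = sorted.map(String);                  // step 5
    lasts.push(strings[strings.length - 1]);             // step 6 (assumed: last element)
  }
  return lasts;
}
```

Calling bench(100) runs the whole pipeline 100 times and returns the collected values, which is what keeps the CPU busy for the duration of each request.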

Links to the actual code can be found at the end of this article.

My benchmark environment is the same as my last article and consists of the following.

Development PC (HTTP Servers)

Ubuntu 19.04

Intel i7-5960X @ 3.00GHz

CPU cores 8

CPU threads 16

64 GB Corsair Vengeance RAM

Sapphire Vega 64 GPU

Benchmark PC (wrk)

Ubuntu 19.04

Intel i7-2600K Processor

16 GB RAM

Single CPU Thread

Before I get into the results of Go and Node maxing out all available CPU threads, here are the results of Go limited to a single CPU thread (runtime.GOMAXPROCS(1)) and Node running as a single process, the same as in the last article.

I’m using wrk on my benchmark PC, running three 5-minute benchmarks; my numbers are the averages of the three runs. This is the same setup and wrk command as the last article.

Go: 89,580 total requests over 5 min, 298.00 r/s

Node (Single Process): 132,580 total requests over 5 min, 441.80 r/s

Node outperformed Go by 1.48X

16 CPU Threads

In this benchmark, both Go and Node use all 16 available CPU threads. Go’s HTTP server spawns a Goroutine for each request by default; for Node, the cluster module was used for one benchmark and the worker threads module for the other.

I’m using the same wrk command as above.

I’d like to point out that a Goroutine can be created by simply adding go to the front of a function call.

go doThis()

Using the cluster module in Node requires fewer than 12 lines of code to implement. The worker threads module requires more effort; to use it effectively, a worker pool should be maintained. From the Node documentation:

The above example spawns a Worker thread for each parse() call. In actual practice, use a pool of Workers instead for these kinds of tasks. Otherwise, the overhead of creating Workers would likely exceed their benefit.

I’ve been working on my own worker pool module; however, for this benchmark, I used the node-worker-threads-pool library, which performed well.

Go: 732,204 total requests over 5 min, 2,439.9 r/s

Node (cluster module): 1,116,477 total requests over 5 min, 3,720.65 r/s

Node (worker threads): 1,105,513 total requests over 5 min, 3,683.90 r/s

Source Code

Go (Single & 16 CPU Threads): https://gist.github.com/danielcasler/b918e27099b2186bf5a83206bc838e94

Go Bench Package: https://gist.github.com/danielcasler/2dd7fde3876567f90fbc1420a94c5689

Node (single process): https://gist.github.com/danielcasler/cb9879f3a2149c2fbba1b6795f7ca401

Node (cluster module): https://gist.github.com/danielcasler/75154ee728c9068b06ad2460fb2661bd

Node (worker threads): https://gist.github.com/danielcasler/afac2d5a0a9b1ac260a0b62d48327c10 and https://gist.github.com/danielcasler/769f27865f2eaec48f79b173a4382f2d

Node Bench Module: https://gist.github.com/danielcasler/e01da97e09225546728182c491f99eff

Conclusion

I was very surprised by Node’s performance. I did not expect Node to outperform Go. If we compared Go using all 16 CPU threads to Node running as a single process, Go would have outperformed Node by 5.52 times; this is how most benchmarks showcase Go. However, with the worker threads module, Node outperformed Go by 1.5 times, and with the cluster module, Node outperformed Go by 1.52 times.

I’d like to point out that I am not a professional benchmarker, and this type of benchmark (combined with the others) was sufficient as part of my decision-making process when evaluating a migration to Go. For your use case, you may find this benchmark inadequate.

In the next part of this series, I will benchmark Go’s and Node’s respective crypto libraries. Following that, I will do an I/O benchmark to see how Go and Node perform when making network and database requests.