Some things in life take time

Not everything completes immediately. Some operations in software take a short while to finish — which presents a very interesting challenge to implement on systems that are designed for serial execution. If you need to access a server over the network, you’ll have to wait until it responds. Since CPUs are designed to run opcodes one after the other without waiting, what do they do in the meantime?

That’s the basis behind asynchronicity and concurrency.

Why not just block?

Suppose we could halt execution and block until the anticipated response arrives. This normally isn’t a good idea because our program will remain unresponsive to anything else going on. If we’re implementing a frontend application — what happens if the user tries to interact with it while we block? If we’re implementing a backend service — what happens if a new request suddenly comes in?

Let’s start pure, with minimal abstractions and low-level APIs like the immortal select function. If we don’t want to block, the alternative is returning immediately, or in other words, polling. This also feels wrong; busy-waiting never sounds like a good idea.
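To see why polling feels wrong, here is a minimal sketch of a busy-wait loop. The `busyWait` helper and its `isReady` check are hypothetical names made up for illustration; the point is only that the loop burns CPU cycles asking the same question over and over while nothing else gets to run.

```javascript
// Hypothetical busy-wait: keeps asking isReady() until it returns true.
// Nothing else can run on this thread while the loop spins.
function busyWait(isReady) {
  let wastedChecks = 0;
  while (!isReady()) {
    wastedChecks++;
  }
  return wastedChecks;
}

// Simulate a response that "arrives" 20 ms from now.
const readyAt = Date.now() + 20;
const wasted = busyWait(function () {
  return Date.now() >= readyAt;
});

console.log('checks that accomplished nothing:', wasted);
```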

We need something else. We need abstractions.

Why multi-threading is evil

The traditional solution to this problem has been an abstraction offered on the operating-system level — multi-threading. We want to block, but we don’t want to block the main execution context. So let’s create additional execution contexts that could run in parallel. But what if we only have a single CPU with a single core? That’s where the abstraction comes in — the OS will multiplex and transparently jump between multiple execution threads for us.

This approach is so popular, in fact, that the majority of web content on the Internet is served this way. Apache HTTP Server, the world’s most popular web server having over 40% market share, traditionally relies on a separate thread or process to handle every concurrent client.

The problem with relying on threads to magically solve the concurrency problem is that threads are generally expensive in terms of resources and also introduce significant additional complexity when used.

Let’s start with complexity. Threaded code may seem to be simpler because it can be synchronous and block until things are ready. The problem is that we usually have little control over when one thread stops running and another starts (a context switch). If we have a shared data structure that several threads rely on, we need to be very careful. If one thread starts updating data and is switched out before completing the update, another thread can pick up from an inconsistent state. This problem introduces synchronization mechanisms such as mutexes and semaphores that are never a delight to work with.

The second problem is cost, or more specifically the resource overhead that threads incur. The scheduler is the entity in the OS charged with orchestrating when threads run. The more threads you have, the more time the OS spends on deciding who should run instead of actually running them. Even more serious is the problem of memory. Every thread has an execution stack that usually reserves several MBs of memory, some of which even has to be non-paged (so virtual memory doesn’t necessarily help). This often becomes the bottleneck when running large numbers of threads.

These are not theoretical problems; they influence the world around us in some very practical ways. For starters, they contribute to the very poor standard of load that is considered acceptable on the Internet today. Ridiculous things like the Reddit hug of death constantly happen because many servers can’t handle more than a few thousand concurrent connections. This is known as the C10K problem. It’s ridiculous because with a slightly different architecture (not based on threads), these same servers could handle hundreds of thousands of concurrent connections with ease.

So threads are bad — now what?

It’s not really that threads are bad, it’s more that we shouldn’t rely on them as the only concurrency abstraction we have. We must develop abstractions that provide the same freedom of concurrency even on single-threaded systems.

This is why I love Node. JavaScript, due to an unrelated limitation, forces us to work with a single execution thread. This may seem at first like a big drawback of this ecosystem, but is actually a blessing in disguise. If we don’t have the luxury of threads, we must evolve powerful concurrency tools that are not dependent on them.

What if we have more than one CPU or more than one core? How can we make use of them if Node is single-threaded? In this case we can simply run multiple instances of Node on the same machine.

Let’s play with a real-life example

To keep the discussion grounded, let’s take a realistic scenario that we want to implement. Let’s build a service like Pingdom. Given an array of server URLs, our service will “ping” each one by issuing an HTTP request exactly 3 times in intervals of 10 seconds.

The service will return the list of servers that failed to reply and the number of times they didn’t respond properly. There’s no need to ping different servers in parallel, so we’ll process the list one by one. And lastly, while we wait for a server to respond, we won’t block the main execution thread.

We can summarize our entire service by implementing the following pingServers function:
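A rough sketch of how a caller might use it. The callback-style signature and the result shape (an object mapping each failing URL to its failure count) are assumptions made for illustration, and the function body is only a placeholder so the snippet runs on its own.

```javascript
// Placeholder body so the snippet runs on its own.
// Assumed contract: callback(failedServers), where failedServers maps
// each failing URL to the number of pings it failed.
function pingServers(servers, callback) {
  callback({});
}

const servers = ['http://example.com/', 'http://example.org/'];

pingServers(servers, function (failedServers) {
  for (const url in failedServers) {
    console.log(url + ' failed ' + failedServers[url] + ' times');
  }
});
```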

Pseudocode implementation with threads

If we were using multi-threading and allowed ourselves to block, pseudocode of the implementation would have been:
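A sketch of that blocking version, written in JavaScript syntax rather than true pseudocode so that it can actually run on Node. `blockingPing` is a hypothetical function standing in for an HTTP request that blocks until the server answers and returns `true` on a healthy response. The `intervalMs` parameter is also an assumption, added so the delay can be shortened. `Atomics.wait` is used here only as a way to put the calling thread to sleep.

```javascript
// Blocking sleep: parks the calling thread for up to `ms` milliseconds.
function sleepBlocking(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// blockingPing(url) is hypothetical: it blocks until the server answers
// and returns true if the response was healthy.
function pingServersBlocking(servers, blockingPing, intervalMs = 10000) {
  const failedServers = {};
  for (const url of servers) {
    let failures = 0;
    for (let i = 0; i < 3; i++) {
      if (!blockingPing(url)) {
        failures++;
      }
      sleepBlocking(intervalMs); // the whole thread stops here
    }
    if (failures > 0) {
      failedServers[url] = failures;
    }
  }
  return failedServers;
}
```

The control flow is plain nested loops, which is exactly what makes the blocking style attractive.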

To make sure we don’t accidentally rely on threading, in the next sections we’ll implement the service on Node, using asynchronous code.

First approach — callbacks

Node relies on the JavaScript event loop. Since it’s single-threaded, API calls normally don’t block execution. Instead, commands that aren’t completed immediately post an event when they do. We can specify a callback function to run when the event is posted, and place the remainder of our business logic there.
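A tiny illustration of that flow, using `setTimeout` as a stand-in for any operation that completes later. The `events` array is only there to record the order in which things happen.

```javascript
const events = [];

events.push('operation started');

// The callback runs later, once the timer's event is posted to the event loop.
setTimeout(function onDone() {
  events.push('callback ran');
  console.log(events.join(' -> '));
  // operation started -> main code kept running -> callback ran
}, 100);

// Execution continues immediately; nothing blocks while the timer is pending.
events.push('main code kept running');
```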

The standard complaint about callbacks is the famous pyramid of doom, where your code ends up looking like an indented mess. My biggest problem with callbacks is actually a different one: they don’t deal well with control flow.

What is control flow? It’s the for loops and if statements that you need to implement basic business logic rules like pinging every server exactly 3 times, and including this server in the result only if it failed. Try using a forEach and setTimeout to implement this logic and you’ll see that it simply doesn’t work as easily with callbacks as you’d think.
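To make the difficulty concrete, here is a sketch of the naive attempt. The `ping(url, callback)` helper is hypothetical (it reports `true` or `false` to its callback), and `intervalMs` is a parameter added for illustration. Since `forEach` does not wait for callbacks, every server gets pinged at the same time instead of one by one, and the function returns before a single result is known.

```javascript
function naivePingServers(servers, ping, intervalMs) {
  const failedServers = {};
  servers.forEach(function (url) {
    for (let i = 0; i < 3; i++) {
      // All timers are scheduled up front; forEach never waits for them.
      setTimeout(function () {
        ping(url, function (ok) {
          if (!ok) {
            failedServers[url] = (failedServers[url] || 0) + 1;
          }
        });
      }, i * intervalMs);
    }
  });
  // Reached before any ping has run, so the result is still empty here.
  return failedServers;
}

// Fake ping that always fails, so every attempt should end up counted.
function alwaysFails(url, callback) {
  setTimeout(function () { callback(false); }, 1);
}

const result = naivePingServers(['a', 'b'], alwaysFails, 10);
console.log(Object.keys(result).length); // 0: nothing has completed yet
```

The failures do get counted eventually, but the caller has no way of knowing when, and the servers were never processed in order.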

So what do we do instead? One of the more flexible ways I know to implement non-trivial control flow with callbacks is building a state machine:
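A minimal sketch of such a state machine, assuming a hypothetical `ping(url, callback)` helper that reports `true` or `false`, and an `intervalMs` parameter added so the delay is adjustable. The state is the current server index and attempt number; each callback advances the state and schedules the next step.

```javascript
function pingServers(servers, ping, intervalMs, done) {
  const failedServers = {};
  // The machine's entire state: which server, and which attempt on it.
  const state = { serverIndex: 0, attempt: 0 };

  function step() {
    if (state.serverIndex >= servers.length) {
      return done(failedServers); // terminal state
    }
    const url = servers[state.serverIndex];
    ping(url, function (ok) {
      if (!ok) {
        failedServers[url] = (failedServers[url] || 0) + 1;
      }
      state.attempt++;
      if (state.attempt === 3) {
        // Third attempt done: transition to the next server.
        state.serverIndex++;
        state.attempt = 0;
      }
      setTimeout(step, intervalMs);
    });
  }

  step();
}

// Demo with a fake ping (no network): 'bad' always fails.
function fakePing(url, callback) {
  setTimeout(function () { callback(url !== 'bad'); }, 1);
}

pingServers(['good', 'bad'], fakePing, 10, function (failedServers) {
  console.log(failedServers); // { bad: 3 }
});
```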

This works but isn’t as straightforward as I’d like. Let’s explore an alternative implementation using an additional dependency — a library dedicated for callback control flow called async:
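A sketch of what this could look like. To keep the snippet free of dependencies, `eachSeries` and `timesSeries` below are simplified hand-written stand-ins modeled on the async library’s functions of the same names; with the real library you would call `async.eachSeries` and `async.timesSeries` instead. As before, `ping(url, callback)` and `intervalMs` are hypothetical.

```javascript
// Simplified stand-in for async.timesSeries:
// runs iteratee(i, next) for i = 0..n-1, one at a time.
function timesSeries(n, iteratee, done) {
  let i = 0;
  (function run(err) {
    if (err || i >= n) {
      return done(err || null);
    }
    iteratee(i++, run);
  })();
}

// Simplified stand-in for async.eachSeries:
// runs iteratee(item, next) over the items, one at a time.
function eachSeries(items, iteratee, done) {
  timesSeries(items.length, function (i, next) {
    iteratee(items[i], next);
  }, done);
}

function pingServers(servers, ping, intervalMs, done) {
  const failedServers = {};
  eachSeries(servers, function (url, nextServer) {
    timesSeries(3, function (attempt, nextAttempt) {
      ping(url, function (ok) {
        if (!ok) {
          failedServers[url] = (failedServers[url] || 0) + 1;
        }
        // Wait out the interval before the next attempt.
        setTimeout(nextAttempt, intervalMs);
      });
    }, nextServer);
  }, function () {
    done(failedServers);
  });
}
```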

This is a little better and shorter. Is this straightforward and easy to understand at a quick glance? I think we can do better.

Second approach — promises

We’re not perfectly happy with the first approach and the way to improve is using a higher level of abstraction. Promises hold “future” values that haven’t necessarily been resolved yet. It’s a placeholder of sorts that is returned immediately, even if the asynchronous action that defines it hasn’t completed. The interesting thing about promises is that they allow us to start working with this future value immediately and keep chaining actions to it that will actually take place in the future when it’s finally resolved.
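A small sketch of that idea, using a timer to play the role of the asynchronous action. The variable names are made up for illustration.

```javascript
// A promise that resolves to 21 after 50 ms; we get the placeholder right away.
const futureValue = new Promise(function (resolve) {
  setTimeout(function () { resolve(21); }, 50);
});

// Chain work onto the value before it exists; then() returns a new promise.
const doubled = futureValue.then(function (value) {
  return value * 2;
});

doubled.then(function (value) {
  console.log(value); // 42
});
```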

We’ll change pingServers to return a promise, and alter its usage to:
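A sketch of the call site, assuming the promise resolves to an object that maps each failing URL to its failure count. The function body is only a placeholder so the snippet runs on its own.

```javascript
// Placeholder so the snippet runs on its own: resolves to an empty result.
function pingServers(servers) {
  return Promise.resolve({});
}

pingServers(['http://example.com/']).then(function (failedServers) {
  for (const url in failedServers) {
    console.log(url + ' failed ' + failedServers[url] + ' times');
  }
});
```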

Most modern asynchronous APIs favor promises to callbacks. In our case, we’ll base our HTTP requests on the Fetch API that is promise-based.

We still have the issue of control flow. How can our simple logic be achieved with promises? In my opinion, functional programming works best with promises, and in JavaScript this usually means pulling out lodash.

If we had wanted to ping the servers in parallel, things would have been quite easy and we could use an operation like map to transform our array of URLs into an array of promises that resolve to the number of failures in each URL. Since we want to ping the servers sequentially, things are a little trickier. Each promise needs to be chained to the then of the previous one, so we’ll need to pass data between the different iterations. This can be achieved with an accumulator in operations like reduce or transform:
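A sketch of the accumulator approach. It uses the native `Array.prototype.reduce` in place of lodash so that it has no dependencies. The `sleep` and `fetchPing` helpers, the injectable `ping` parameter and `intervalMs` are all assumptions added for illustration; `fetchPing` relies on the global `fetch` available in recent Node versions.

```javascript
// Promise-based delay helper (hypothetical name).
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Hypothetical default ping built on the Fetch API:
// resolves to true on a 2xx response, false otherwise.
function fetchPing(url) {
  return fetch(url).then(
    function (res) { return res.ok; },
    function () { return false; }
  );
}

function pingServers(servers, ping = fetchPing, intervalMs = 10000) {
  // Outer reduce: the accumulator is a promise for the result built so far.
  return servers.reduce(function (chain, url) {
    return chain.then(function (failedServers) {
      // Inner reduce: the accumulator is a promise for the failure count.
      return [0, 1, 2].reduce(function (attempts) {
        return attempts.then(function (failures) {
          return ping(url)
            .then(function (ok) { return ok ? failures : failures + 1; })
            .then(function (count) {
              return sleep(intervalMs).then(function () { return count; });
            });
        });
      }, Promise.resolve(0)).then(function (failures) {
        if (failures > 0) {
          failedServers[url] = failures;
        }
        return failedServers;
      });
    });
  }, Promise.resolve({}));
}
```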

Hmmm.. I have to say this isn’t easy on the eyes either. I actually have a hard time following what goes on in there 5 minutes after writing it. To help clarify this mess, I think it’s easier if we split the same exact implementation into two separate smaller functions:
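A sketch of that split, again with the native `reduce` in place of lodash. The `sleep` helper, the `ping(url)` parameter (assumed to return a promise for `true` or `false`) and `intervalMs` are hypothetical names added for illustration.

```javascript
// Promise-based delay helper (hypothetical name).
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Pings one server 3 times in sequence; resolves to its number of failures.
function pingServer(url, ping, intervalMs) {
  return [0, 1, 2].reduce(function (chain) {
    return chain.then(function (failures) {
      return ping(url).then(function (ok) {
        return sleep(intervalMs).then(function () {
          return ok ? failures : failures + 1;
        });
      });
    });
  }, Promise.resolve(0));
}

// Pings the servers one by one; resolves to a map of failing URL to failure count.
function pingServers(servers, ping, intervalMs = 10000) {
  return servers.reduce(function (chain, url) {
    return chain.then(function (failedServers) {
      return pingServer(url, ping, intervalMs).then(function (failures) {
        if (failures > 0) {
          failedServers[url] = failures;
        }
        return failedServers;
      });
    });
  }, Promise.resolve({}));
}
```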

This is a little clearer… but the accumulator still complicates the whole thing.

Third approach — async-await bliss

Come on, all we’re trying to do is ping a few servers in order. The previous two approaches gave us valid implementations, but they weren’t exactly trivial to follow. Why is that? Maybe it’s because we humans tend to find procedural thinking a little more intuitive for business logic.

I first met the async-await pattern while I was doing a side project on Microsoft Azure and learned a little C# and .NET by proxy. I was immediately blown away. This was the best of both worlds: straightforward procedural thinking without the thread block penalties. These guys did an awesome job!

I was delighted to see this pattern seeping into many other languages like JavaScript, Python, Scala, Swift and more.

I think the best introduction to async-await is to simply jump into the code and let it speak for itself:
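A sketch of the async-await version. The `sleep` and `fetchPing` helpers, the injectable `ping` parameter and `intervalMs` are assumptions added for illustration; `fetchPing` relies on the global `fetch` available in recent Node versions.

```javascript
// Promise-based delay helper (hypothetical name).
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Hypothetical default ping built on the Fetch API.
async function fetchPing(url) {
  try {
    const res = await fetch(url);
    return res.ok;
  } catch (err) {
    return false;
  }
}

async function pingServers(servers, ping = fetchPing, intervalMs = 10000) {
  const failedServers = {};
  for (const url of servers) {
    let failures = 0;
    for (let i = 0; i < 3; i++) {
      if (!(await ping(url))) {
        failures++;
      }
      await sleep(intervalMs);
    }
    if (failures > 0) {
      failedServers[url] = failures;
    }
  }
  return failedServers;
}

// Demo with a fake ping (no network): 'bad' always fails.
pingServers(['good', 'bad'], async function (url) {
  return url !== 'bad';
}, 10).then(function (failedServers) {
  console.log(failedServers); // { bad: 3 }
});
```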

Now we’re talking. Easy to write, easy to read. What the code is doing is finally obvious from a quick glance. And it’s completely asynchronous. Yay.

I can’t put it any better than the words of Jake Archibald:

They’re brilliant. They’re brilliant and I want laws changed so I can marry them.

Notice how the implementation resembles the synchronous flow we could previously only achieve using threads and blocking. How is it doing it without blocking? There’s a lot of magic happening behind the scenes. I won’t get into it, but the await keyword does not block, it yields execution to other things in the event loop. When the result being awaited on is ready, execution can resume from this point.
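A small sketch that makes the yielding visible. The `order` array and the `task` function are made up for illustration; they only record who runs when.

```javascript
const order = [];

async function task() {
  order.push('task: before await');
  await null; // yields here, even though the awaited value is already available
  order.push('task: after await');
}

task();
order.push('caller: continues without waiting');

setTimeout(function () {
  console.log(order.join(' | '));
  // task: before await | caller: continues without waiting | task: after await
}, 0);
```

The caller’s line runs before the function resumes, which shows that `await` handed control back instead of blocking.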

In addition, the way to call this version of pingServers is identical to the previous version with promises. An async function returns a promise, making integration with existing code as simple as possible.

Summary

We’ve severed our dependency on threads for concurrency and played with 3 different flavors of asynchronous code. Callbacks, promises and async-await are different abstractions designed for similar purposes. Which one is better? It’s a matter of personal taste.

It’s also nice to see how historically these 3 flavors signify 3 generations of JavaScript. Callbacks ruled in the early days until ES5. Promises became prominent in the ES6 era, where JavaScript as a whole took a big step towards modern syntax. And of course there is the subject of our celebration, async-await, which was long billed as an ES7 feature and was standardized in ES2017. It’s a marvelous tool, so use it!