Blocking I/O, Non-Blocking I/O

IO refers to interaction with the operating system's disk and network resources.

In blocking IO programming, a function call that accesses an IO resource in the OS blocks the execution of the thread and leaves system resources idle until the IO operation completes. Let's look at some blocking code:

Now, since most OSes are multithreaded, accessing IO can be made non-blocking. In a non-blocking IO operation, the call to the IO resource returns without waiting for it to complete. The program then constantly queries the kernel for the completion status of the resource, and whenever data is available, it returns.
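That constant querying can be sketched as a busy-wait loop. Everything here is illustrative: `makeFakeResource` stands in for a kernel resource, and `read()` for a hypothetical non-blocking read.

```javascript
// A fake resource that reports 'EAGAIN' (no data yet) until it is ready.
function makeFakeResource(readyAfter) {
  let attempts = 0;
  return {
    read() { // hypothetical non-blocking read
      attempts += 1;
      return attempts < readyAfter ? 'EAGAIN' : 'data';
    },
  };
}

// Busy-wait polling: keep querying the "kernel" until data is available.
function busyPollRead(resource) {
  let polls = 0;
  let result = resource.read();
  while (result === 'EAGAIN') {
    polls += 1; // each iteration burns CPU for nothing
    result = resource.read();
  }
  return { result, polls };
}

const { result, polls } = busyPollRead(makeFakeResource(5));
console.log(result); // 'data', after several wasted polls
```

This wasted spinning is exactly why plain non-blocking IO is not enough on its own.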

Event Demultiplexer

The above non-blocking IO implementation is not the ideal solution. Many modern OSes, like Windows, Unix, and BSD systems, have their own implementations of an event demultiplexer to handle concurrent, non-blocking resources efficiently.

The event demultiplexer is a synchronous, non-blocking notification system.
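Its interface can be sketched like this: one `watch()` call reports every ready resource. All names and the faked readiness here are illustrative; the real mechanisms are the OS APIs listed below.

```javascript
// A real demultiplexer blocks until at least one watched resource is ready,
// then returns the ready ones; here readiness is faked for demonstration.
function watch(watchedResources) {
  return watchedResources.filter((r) => r.ready);
}

const watched = [
  { name: 'socketA', ready: false },
  { name: 'socketB', ready: true },
];

for (const resource of watch(watched)) {
  console.log(`${resource.name} is ready`); // handle it without blocking
}
```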

Event Demultiplexer APIs in different Operating Systems:

kqueue (BSD, OSX)

epoll (Linux)

event ports (Solaris, SunOS)

IOCP GetQueuedCompletionStatusEx (Windows)

The Reactor Pattern

The Reactor pattern is a pattern of synchronous demultiplexing and invocation of event handlers in the order events arrive.

This is what happens in an application that uses the reactor pattern:

The application makes an IO request. This request is delegated to the Event Demultiplexer, which sends it to the appropriate hardware. When the IO operation completes, the Event Demultiplexer pushes the associated callback to the Event Queue. The event loop iterates through the queue and executes the callbacks in the order they were inserted, until the queue is exhausted. When the event queue is empty, the loop blocks on the Event Demultiplexer, which triggers another cycle. If both the event queue and the Event Demultiplexer's pending-requests queue are empty, the program exits.
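The flow above can be sketched as a toy reactor. All names are illustrative, and `setImmediate` merely stands in for the OS completing the IO:

```javascript
const eventQueue = [];             // callbacks waiting to be executed
const pendingRequests = new Set(); // requests the "demultiplexer" is watching

// The application delegates an IO request to the event demultiplexer,
// which pushes the associated callback onto the event queue on completion.
function demuxRequest(resource, callback) {
  pendingRequests.add(resource);
  setImmediate(() => {
    pendingRequests.delete(resource);
    eventQueue.push(() => callback(`data from ${resource}`));
  });
}

// The event loop: execute callbacks in insertion order until the queue is
// exhausted; if requests are still pending, cycle again; otherwise exit.
function runLoop() {
  while (eventQueue.length > 0) {
    const cb = eventQueue.shift();
    cb();
  }
  if (pendingRequests.size > 0) setImmediate(runLoop);
}

demuxRequest('file.txt', (data) => console.log(data)); // "data from file.txt"
runLoop();
```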

Event Queue

The event queue is a data structure where callbacks associated with async IO operations are enqueued, to be executed sequentially by the Event Loop until the queue is empty.

The event queue is composed of different queues, each associated with a type of IO event. In Node.js, there are:

timers queue: Expired timers and due interval callbacks are enqueued here.

poll queue: Completed IO events are enqueued here.

check queue: setImmediate's callbacks are enqueued here.

pending queue: Completed/Errored IO events are enqueued here.

Besides these four queues, there are two other queues in Node.js:

nextTick queue: queued by process.nextTick().

microtask queue: queued by resolved Promises.

We will get into these queues in later sections.

libuv: Non-Blocking IO Engine of Node.js

Concurrency is a way to structure a thing so that you can, maybe, use parallelism to do a better job. But parallelism is not the goal of concurrency; concurrency's goal is a good structure. — Rob Pike, co-creator of Go

We saw earlier the different APIs used for the Event Demultiplexer in different Operating Systems. To support the Event Demultiplexer across systems, the Node.js team developed a library to provide a high-level abstraction and make Node.js compatible with all the major platforms. The library is called libuv.

libuv is cross-platform support library which was originally written for NodeJS. It’s designed around the event-driven asynchronous I/O model. The library provides much more than a simple abstraction over different I/O polling mechanisms: handles and streams provide a high-level abstraction for sockets and other entities; cross-platform file I/O and threading functionality is also provided, amongst other things.

We now see that the reactor pattern is the building block of Node.js. Gathering it all together, Node.js becomes a collection of utilities:

v8: A powerful and high-performance JavaScript engine from Google.

libuv: Library for async IO operations.

zlib: for compression; http_parser: for parsing HTTP requests and responses; openssl: for HTTPS security.

From what we stated above, libuv provides the event loop for Node.js. Let's look at the Event Loop more deeply.

What is Event Loop?

The Event Loop is the system in Node.js that allows it to perform non-blocking IO operations. What does non-blocking IO mean? The answer lies at the heart of Node.js, the Event Loop. Since most OSes are multithreaded, they can handle multiple operations running in parallel. Node.js delegates IO operations to the OS, waits for the completion status of each operation, and executes its callback when due.

When does Event Loop run?

When a script is run in Node.js, like this:

node script.js

Node.js initializes the Event Loop and processes the provided script, in our case script.js. This script may make async calls via async APIs like:

fs.readFile

setTimeout, setInterval

setImmediate, etc

These APIs enqueue tasks in the event queue, so after processing the script, Node.js calls the Event Loop.

Let’s look at the phases in the Event Loop.

Each box represents a phase in the Event loop. We have the timers phase, pending callbacks phase, idle, prepare phase, poll phase, check phase and close phase.

These phases have a FIFO policy. The first callback queued is first dequeued and executed.

When the event loop enters a phase, it iterates through that phase's queue and executes the callbacks until the queue is exhausted or the max limit for the phase is reached. When either occurs, the event loop proceeds to the next phase and does the same thing all over again.
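The per-phase draining described above can be sketched as follows. `maxCallbacks` models the per-phase limit; the name is illustrative, not libuv's:

```javascript
// Drain one phase's FIFO queue until it is empty or the limit is reached.
function runPhase(queue, maxCallbacks) {
  let executed = 0;
  while (queue.length > 0 && executed < maxCallbacks) {
    const cb = queue.shift(); // first queued, first executed (FIFO)
    cb();
    executed += 1;
  }
  return executed; // the event loop then proceeds to the next phase
}

const ran = [];
const queue = [() => ran.push(1), () => ran.push(2), () => ran.push(3)];
runPhase(queue, 2);
console.log(ran); // [ 1, 2 ] — the limit was reached, one callback remains
```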

The Event Loop exits when all the queues in the phases are all empty.

timers: This is scheduled by setTimeout, setInterval.

pending callbacks: Any completed/errored IO operation is queued here.

idle, prepare: These are used for internal operations of Node.

poll: Node will block here for a specified amount of time to wait for any completed IO operation.

check: It executes callbacks scheduled by setImmediate.

close: It runs all callbacks queued by any closed IO operation, e.g. *.on('close', callback), socket.destroy().

Let's get into more detail about the timers and poll phases and see their implementation in libuv. These two are the most important phases in the event loop, and critical decisions are made in them.

Timers

This is the first phase in the event loop. This phase executes all callbacks scheduled by setInterval or setTimeout APIs. The timers provide a threshold after which the callback is executed.
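The snippet this section discusses wasn't preserved, so here is a sketch of what it likely looked like (`start` and `fired` are illustrative additions, not part of the original):

```javascript
const start = Date.now();
let fired = false;

setTimeout(() => {
  fired = true;
  console.log('setTimeout'); // printed once the 100 ms threshold has elapsed
}, 100);
```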

Here, the callback function is executed after 100ms has elapsed. When the program is run with node script.js, Node bootstraps itself and executes our provided script.js. This runs setTimeout, which enqueues the callback. After executing the script, Node runs the event loop, which executes the callback function, and we see setTimeout on the terminal.

Though, the callback is not run immediately. The event loop checks whether the set threshold has elapsed; if not, it runs the other phases and cycles back for another pass, a tick.

Let’s demonstrate with this example:
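The original snippet wasn't preserved here, so below is a sketch reconstructing the simulation the following paragraphs describe. The names mirror libuv's (uv_run_timers, uv_loop_alive, uv_run), but this is a simplified JavaScript model, not libuv itself; `setTimeoutSim` stands in for the monkey-patched setTimeout.

```javascript
// The loop object: everything our miniature event loop needs. In libuv the
// equivalent is the central uv_loop_t struct, which contains all the phases.
const loop = {
  time: Date.now(), // current loop time, updated at the start of each cycle
  timer_heap: [],   // a sorted array standing in for libuv's timer heap

  // Push a timer and keep the array sorted, lowest threshold first.
  enqueue(timer) {
    this.timer_heap.push(timer);
    this.timer_heap.sort((a, b) => a.timeout - b.timeout);
  },

  // The nearest timer is always at index 0 because the heap is sorted.
  min() {
    return this.timer_heap.length > 0 ? this.timer_heap[0] : null;
  },
};

// uv_timer_stop dequeues the nearest timer.
function uv_timer_stop(loop) {
  loop.timer_heap.shift();
}

// uv_update_time refreshes loop.time with the current time.
function uv_update_time(loop) {
  loop.time = Date.now();
}

// The timers phase: run every timer whose threshold has elapsed.
function uv_run_timers(loop) {
  for (;;) {
    const timer = loop.min();
    if (timer === null) break;            // no timers left
    if (timer.timeout > loop.time) break; // nearest timer not yet due
    uv_timer_stop(loop);
    timer.cb();
  }
}

// The loop is alive while work is pending (only timers exist in this model).
function uv_loop_alive(loop) {
  return loop.timer_heap.length > 0;
}

// The event loop itself; this sketch spins, which real libuv avoids by
// blocking in the poll phase for a computed timeout.
function uv_run(loop) {
  while (uv_loop_alive(loop)) {
    uv_update_time(loop);
    uv_run_timers(loop);
  }
}

// Our setTimeout: enqueue the callback cb with its threshold ms.
// (Named setTimeoutSim here to avoid clobbering the real global.)
function setTimeoutSim(cb, ms) {
  loop.enqueue({ cb, timeout: Date.now() + ms });
}

setTimeoutSim(() => console.log('timer 1'), 100);
setTimeoutSim(() => console.log('timer 2'), 50);
uv_run(loop); // prints "timer 2" then "timer 1"
```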

We kinda simulated the timers phase. I created a loop object, which contains everything needed for the event loop. The same thing is done in libuv: there is a central uv_loop_t struct that contains all the phases.

Our loop has a time property which holds the current time; this is updated at the start of each cycle. There is the timer_heap, which holds the setTimeout callbacks in an array. Though, this is not how it is implemented in libuv. In libuv, timers are structured as linked heap_node structs:

left points to one neighbouring heap_node, and right points to the next node in line. There is no pushing and popping of an array.

But our implementation provides the same results as above. Next, we have the enqueue property, which pushes a timer into the timer_heap array and sorts the timers with the lowest first. min follows, which is used to get the timer with the nearest threshold; since timer_heap is already sorted, it just returns the timer at index 0.

Next, we implement some of the functions we would find in libuv.

uv_timer_stop dequeues the nearest timer; uv_update_time updates loop.time with the current time.

uv_run_timers is the function that runs the timers phase. It begins with an infinite for-loop. Inside the loop, it gets the nearest timer; if there is no timer in the timer_heap, it breaks out of the loop. If there is, it checks whether the timer has expired: if yes, it dequeues the timer and runs its callback; if not, it breaks out.

Let’s look at what libuv has:

Our code and implementation are similar.

The uv_loop_alive function simply checks if the timer_heap is empty. There is more to this in libuv; we only checked the timer_heap because it is the only phase we have.

The uv_run function is the Event Loop. It runs all the phases and decides when to exit. In our implementation, it only runs the timers phase. It runs all the phases in a while-loop; on each iteration, it runs all the phases and calls uv_loop_alive to know whether to break off and exit. In libuv, it's quite a bit bigger. We'll come to it later.

Next, we monkey-patched the setTimeout function with our own implementation. It calls the loop's enqueue function on execution, to push its callback function cb and threshold ms to the timer_heap. The real setTimeout does the same thing, though in a different way, with the same result as ours.

We call the setTimeout function with different callbacks and timeouts and kick off the event loop with uv_run .

So we have simulated the timer phase in JavaScript. If we run our script, it will yield this:

Same as it would, with the real setTimeout.

Calculation of time and threshold expiration varies with hardware and OS, so this timers phase can be slightly imprecise, you could say.

Now, let’s poll a bit :).

Poll

Polling watches for new connections, requests, data, etc. Generally, that’s its job.

Looking at the Node.js article about the poll phase, they wrote this:

But looking at the libuv sources, the implementation doesn't perfectly correspond to what is described above.

The thing is that polling happens differently on different OSes. As we listed in the reactor section,

kqueue (BSD, OSX)

epoll (Linux)

event ports (Solaris, SunOS)

IOCP GetQueuedCompletionStatusEx (Windows)

Linux, BSD, OSX, Solaris, and SunOS, unlike Windows, are built on Unix conventions. They behave alike but have different implementations. In libuv, the polling mechanism is carried out by the uv__poll function. libuv has two folders, win and unix; win contains the implementation for Windows systems and unix for Unix-based OSes. libuv detects which platform or OS it is currently running on to know which folder to include.

Let’s look at the Windows and Unix implementations:

Windows

The above code is the event loop implementation for Windows. You can see it clearly mirrors the event loop diagram we saw earlier.

uv__run_timers: runs the timers phase; checks for expired timers.

uv_process_reqs: processes pending requests.

uv_idle_invoke: runs the idle phase.

uv_prepare_invoke: runs the prepare phase.

uv__poll: polls for completed IO operations.

uv_check_invoke: runs the callbacks set by setImmediate.

uv_process_endgames: runs the close phase.

In unix, it is different: uv_process_reqs is absent and is replaced by uv__run_pending(loop). On Windows, there is no pending queue; the pending requests from uv__poll are processed by uv_process_reqs.

Let's look at the uv__poll implementation:

You see here that the function enters an infinite for-loop. It calls the GetQueuedCompletionStatusEx(...) Windows API (because it is running in a Windows environment; it won't be available on Unix systems).

GetQueuedCompletionStatusEx attempts to dequeue IO completion packets from the IO completion port. If no completion packet is found, it waits for some time for a pending IO packet associated with the IO port to complete.

CompletionPort: the port where the IO packet should appear upon completion. To create a port, use the CreateIoCompletionPort API.

lpNumberOfBytes: holds the number of bytes transferred by the completed IO operation.

lpCompletionKey: holds the completion key of the completed IO packet.

lpOverlapped: holds the address of the OVERLAPPED structure that was specified when the completed IO operation was started.

dwMilliseconds: the amount of time to wait for a completed IO packet to appear at the completion port. If the specified time expires without a packet appearing, the function returns false. If the time is INFINITE (-1), the function blocks indefinitely. If the time is zero and there is no packet at the port, the function returns immediately.

loop->iocp was passed as the CompletionPort argument. The iocp was created when libuv initialized, in uv_loop_init:

Next, the number of bytes and the completion key were passed, along with the pointer to where the data from the completion packets is to be stored.

count receives the number of I/O completion packets that were dequeued.

timeout is the amount of time GetQueuedCompletionStatusEx polls for completed data.

The boolean success variable holds the result of the operation. If the function returns true, some IO packets were dequeued. The code then loops count times (count holds the number of packets dequeued into the overlappeds array) and, for each dequeued packet, converts it to a uv_req_t struct and inserts it into the pending requests queue. These will be called in the next event loop cycle, in uv_process_reqs(...).

This is the most important part of uv__poll; the rest of the code is quite straightforward. It updates the loop time and breaks off.

I know this might be too much, but we have to dig into the sources to understand this clearly. The best bet for understanding a concept is to look into the sources. Just tag along; it will all be over soon.

Unix

Let's look at the implementation in libuv:

It's similar to what is in Windows. Here, the poll phase has its own queue, and the pending phase also has its own queue. Remember, in Windows, the IO operations polled in the poll phase are executed in the pending phase. Let's look at the uv__io_poll function.

Linux

Linux uses epoll_wait() to wait for events on an epoll instance. This function blocks until any of the descriptors being monitored becomes ready for I/O.

The signature of epoll_wait is:

int epoll_wait(int epfd,
               struct epoll_event *events,
               int maxevents,
               int timeout);

epfd: the file descriptor of the epoll instance whose events are to be polled.

events: this is an array of epoll_event structures; it holds the events that are in the ready state.

maxevents : The maximum number of events to poll. It must be greater than 0.

timeout : The amount of time to poll for completed events.

If timeout is -1, epoll_wait will block indefinitely; but when any of the events on the file descriptor epfd becomes ready, it will break off and return.

If timeout is 0, epoll_wait will check for any ready event and return immediately.

The return value of epoll_wait is an integer:

If an error occurred, it returns -1. If the time expires and there are no ready events, it returns 0. If there are ready events on the file descriptor epfd, it returns the number of ready events.

Now, we have seen epoll_wait in its entirety. Let's look at the uv__io_poll .

It loops through a watcher_queue, which contains a list of all file descriptors with events that need to be listened for.

It adds each file descriptor to the backend_fd using the epoll_ctl function.

Next, it calls epoll_wait with the supplied timeout. The integer variable nfds holds the result of the operation. It loops through the completed events and calls their callbacks.

BSDs, MacOSX

These OSes use the kqueue function. It has nearly the same implementation as on Linux systems.

The difference is that where Linux uses epoll_wait, kevent is used instead.

On initialization, the uv__kqueue_init function creates a new kernel event queue and stores its descriptor in loop->backend_fd by calling the kqueue function.

// unix/kqueue.c
int uv__kqueue_init(uv_loop_t* loop) {
  loop->backend_fd = kqueue();
  //...
}

Now with this event queue, the kevent function is called to register events with the queue, and return any pending events to the user.

So, in uv__io_poll, libuv calls kevent

//...
nfds = kevent(loop->backend_fd,
              events,
              nevents,
              events,
              ARRAY_SIZE(events),
              timeout == -1 ? NULL : &spec);
//...

to return the number of completed events registered on the file descriptor backend_fd 's event queue and run their callbacks.

Let's look at their signatures:

int kqueue(void);

int kevent(int kq,
           const struct kevent *changelist,
           int nchanges,
           struct kevent *eventlist,
           int nevents,
           const struct timespec *timeout);

kqueue takes no parameters. Its return value is an integer: a positive integer if it successfully created the kevent queue, or -1 if the creation failed.

kevent registers events with the new queue and returns pending events.

kq: the file descriptor of the event queue.

changelist : pointer to an array of kevent structures.

nchanges : the size of changelist.

eventlist : pointer to an array of kevent structures.

nevents : determines the size of eventlist.

timeout : The time kevent will poll for pending events on kq.

Now, we have seen different implementations of uv__io_poll for different OSes. uv__io_poll takes a second parameter, timeout, after the uv_loop_t structure. This is the time each OS's poll function should block for IO. If the time is zero, the poll function returns immediately. How this timeout is calculated is an interesting part.

The timeout is calculated by uv_backend_timeout .

int uv_backend_timeout(const uv_loop_t* loop) {
  if (loop->stop_flag != 0)
    return 0;

  if (!uv__has_active_handles(loop) && !uv__has_active_reqs(loop))
    return 0;

  if (!QUEUE_EMPTY(&loop->idle_handles))
    return 0;

  if (!QUEUE_EMPTY(&loop->pending_queue))
    return 0;

  if (loop->closing_handles)
    return 0;

  return uv__next_timeout(loop);
}

If the stop_flag is set, the timeout equals 0; therefore, poll exits.

If there are no active handles and no active requests, then poll shouldn't wait for anything; it just exits.

If the idle queue is not empty, IO poll should not wait.

If the pending requests queue is not empty, poll should not wait.

If there are closing handles, poll should not wait.

If none of the conditions is met, then uv__next_timeout is called to determine the timeout value. On to its implementation:

int uv__next_timeout(const uv_loop_t* loop) {
  const struct heap_node* heap_node;
  const uv_timer_t* handle;
  uint64_t diff;

  heap_node = heap_min(timer_heap(loop));
  if (heap_node == NULL)
    return -1; /* block indefinitely */

  handle = container_of(heap_node, uv_timer_t, heap_node);
  if (handle->timeout <= loop->time)
    return 0;

  diff = handle->timeout - loop->time;
  if (diff > INT_MAX)
    diff = INT_MAX;

  return diff;
}

We learned earlier that timers are sorted with the nearest-to-expiration first. So here, it gets the minimum timer from the timer_heap; if the heap is empty, it returns -1, meaning block indefinitely.

If the timer has expired, it returns 0 .

If the timer is not yet due, i.e., still in the future, it returns the difference between the timer and the loop time (the current time). If the timer is due at 500ms and the loop time is 100ms, IO should poll for 400ms, because by the time polling finishes, the timer will be due.
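That computation can be modeled in a few lines. The logic follows libuv's uv__next_timeout, but this is an illustrative JavaScript sketch, not libuv itself:

```javascript
function nextTimeout(loop) {
  const timer = loop.timer_heap[0];         // nearest timer; heap is sorted
  if (timer === undefined) return -1;       // no timers: block indefinitely
  if (timer.timeout <= loop.time) return 0; // already expired: don't block
  return timer.timeout - loop.time;         // poll until the timer is due
}

const loop = { time: 100, timer_heap: [{ timeout: 500 }] };
console.log(nextTimeout(loop)); // 400: poll for 400ms, then the timer is due
```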

Microtask queue and nextTicks queue

We said in the Event Queue section that we will come to this. Finally, we are here.

These queues are run immediately after each phase in the event loop.

    ┌───────────────────────┐
┌──>│        timers         │ microtask/nextTick
│   └──────────┬────────────┘
│   ┌──────────┴────────────┐
│   │ pending I/O callbacks │ microtask/nextTick
│   └──────────┬────────────┘
│   ┌──────────┴────────────┐
│   │     idle, prepare     │ microtask/nextTick
│   └──────────┬────────────┘
│   ┌──────────┴────────────┐
│   │         poll          │ microtask/nextTick
│   └──────────┬────────────┘
│   ┌──────────┴────────────┐
│   │         check         │ microtask/nextTick
│   └──────────┬────────────┘
│   ┌──────────┴────────────┐
└───┤    close callbacks    │ microtask/nextTick
    └───────────────────────┘

nextTicks have a higher priority than microtasks. This means that callbacks scheduled by nextTick run before callbacks scheduled as microtasks.

nextTicks are scheduled by process.nextTick(() => {...}), etc.; microtasks by Promises, etc.

Common Misconceptions

There are many misconceptions about Node.js and Event Loop.

A few of the misconceptions are:

1. The event loop is inside the JS engine.
2. There is a single stack or queue.
3. The event loop runs in a separate thread.
4. Some async OS API is involved in setTimeout.

1. The event loop is not inside the JS engine. The event loop is an entity that runs after the provided script has been compiled and run by the v8 engine.

2. The concept is more of a linked list, not a stack or queue.

3. Event Loop runs in the main thread. There is no separate thread created for the event loop.

Node.js compiles and runs the script. Then, it processes the Event Loop, iterating through it till it is drained. There is no separate thread created for the event loop cycles. The script compilation and execution by v8 must finish before the event loop runs.

4. There is no OS API for setTimeout. Everything is implemented in Node.js.

Conclusion

I described the Node.js event loop system on a scale hitherto undreamt of. I also dove into the Node.js and libuv sources for better clarity and explanation of the concept.

I think with these you can confidently use Node.js, because you now know its nuts and bolts and how it works. In the next post in this series, we will look at the best practices you need in your arsenal as a Node.js developer.

Feel free to ask me anything, if you have any questions concerning this post or if there is anything to be corrected. Feel free to comment! :)

Appreciation

Much thanks to: