Above picture was shamelessly stolen from: http://computer-history.info/Page4.dir/pages/IBM.7030.Stretch.dir/

In this blog post I’m going to follow suit on my threading models post (here) and talk about different types of I/O, how they work, and when you might want to consider using them. Much like with threading models, I/O models have terminology which can be confusing. The confusion leads to misconceptions which will hopefully be cleared up here.

Let’s start first by going over some operating system basics.

System Calls

A system call is a common interface which allows user applications and the operating system kernel to interact with one another. Some familiar functions which are system calls: open(), read(), and write(). These are system calls which ask the kernel to do I/O on behalf of the user process.

There is a cost associated with making system calls. In Linux, system calls are implemented via a software interrupt which causes a privilege level change in the processor – this switch from user to kernel mode is commonly called a context-switch.

User applications typically execute at the most restricted privilege level available where interaction with I/O devices (and other stuff) is not allowed. As a result user applications use system calls to get the kernel to complete privileged I/O (and other) operations.

Synchronous blocking I/O

This is the most familiar and most common type of I/O out there. When an I/O operation is initiated in this model (maybe by calling a system call such as read(), write(), ioctl(), …), the user application making the system call is put into a waiting state by the kernel. The application sleeps until the I/O operation has completed (or has generated an error) at which point it is scheduled to run again. Data is transferred from the device to memory and possibly into another buffer for the user-land application.

Pros:

Easy to use and well understood

Ubiquitous

Cons:

Does not maximize I/O throughput

Causes all threads in a process to block if that process uses green threads

This method of I/O is very straight forward and simple to use, but it has many downsides. In a previous post about threading models, I mentioned that doing blocking I/O in a green thread causes all green threads to stop executing until the I/O operation has completed.

This happens because there is only one kernel context which can scheduled, so that context is put into a waiting state in the kernel until the I/O has been copied to the user buffer and the process can run again.

Synchronous non-blocking I/O

This model of I/O is not very well known compared to other models. This is good because this model isn’t very useful.

In this model, a file descriptor is created via open(), but a flag is passed in (O_NONBLOCK on most Linux kernels) to tell the kernel: If data is not available immediately, do not put me to sleep. Instead let me know so I can go on with my life. I’ll try back later.

Pros:

If no I/O is available other work can be completed in the meantime

When I/O is available, is does not block the thread (even models with green threads)

Cons:

Does not maximize I/O throughput for the application

Lots of system call overhead – constantly making system calls to see if I/O is ready

Can be high latency if I/O arrives and a system call is not made for a while

This model of I/O is typically very inefficient because the I/O system call made by the application may return EAGAIN or EWOULDBLOCK repeatedly. The application can either:

wait around for the data to finish (repeatedly calling its I/O system call over and over) — or

try to do other work for a bit, and retry the I/O system call later

At some point the I/O will either return an error or it will be able to complete.

If this type of I/O is used in a system with green threads, the entire process is not blocked but the efficiency is very poor due to the constant polling with system calls from user-land. Each time a system call is invoked a privelege level change occurs on the processor and the execution state of the application has to be saved out to memory (or disk!) so that the kernel can execute.

Asynchronous blocking I/O

This model of I/O is much more well known. In fact, this is how Ruby implements I/O for its green threads.

In this model, non-blocking file descriptors are created (similar to the previous model) and they monitored by calling either select() or poll(). The system call to select()/poll() blocks the process (the process is put into a sleeping state in the kernel) and the system call returns when either an error has occurred or when the file descriptors are ready to be read from or written to.

Pros:

When I/O is available is does not block

Lots of I/O can be issued to execute in parallel

Notifications occur when one or more file descriptors are ready (helps to improve I/O throughput)

Cons:

Calling select(), poll(), or epoll_wait() blocks the calling thread (entire application if using green threads)

Lots of file descriptors for I/O means lots that have to be checked (can be avoided with epoll)

What is important to note here is that more than one file descriptor can be monitored and when select/poll returns, more than one of the file descriptors may be able to do non-blocking I/O. This is great because it increases the application’s I/O throughput by allowing many I/O operations to occur in parallel.

Of course there are two main drawbacks of using this model:

select()/poll() block – so if they are used in a system with green threads, all the threads are put to sleep while these system calls are executing.

You must check the entire set of file descriptors to determine which are ready. This can be bad if you have a lot of file descriptors, because you can potentially spend a lot of time checking file descriptors which aren’t ready (epoll() fixes this problem).

This model is important for all you Ruby programmers out there — this is the type of I/O that Ruby uses internally. The calls to select cause Ruby to block while they are being executed.

There are some work-arounds though:

Timeouts – select() and poll() let you set timeouts so your app doesn’t have to sleep endlessly if there is no I/O to process – it can continue executing other code in the meantime. This what Ruby does.

epoll() (or kqueue on bsd)- epoll() allows you to register a set of file descriptors you are interested in. You then make blocking epoll_wait calls (they accept timeouts) which will return only the file descriptors which are ready for I/O. This allows you to avoid searching through all your file descriptors every time.

At the very least you should set a timeout so that you can do other work if no I/O is ready. If possible though, use epoll().

Asynchronous non-blocking I/O

This is probably the least widely known model of I/O out there. This model of io is implemented via the libaio library in Linux.

In this I/O model, you can initiate I/O using aio_read(), aio_write(), and a few others. Before using these functions, you must set up a struct aiocb including fields which indicate how you’d like to get notifications and where the data can be read from or written to. Notifications can be delivered in a couple different ways:

Signal – a SIGIO is delivered to the process when the I/O has completed

Callback – a callback function is called when the I/O has completed

Pros:

Helps maximize I/O throughput by allowing lots of I/O to issued in parallel

Allows application to continue processing while I/O is executing, callback or POSIX signal when done

Cons:

Wrapper for libaio may not exist for your programming environment

Network I/O may not be supported

This method of I/O is really awesome because it does not block the calling application and allows multiple I/O operations to executed in parallel which increases the I/O throughput of the application.

The downsides to using libaio are:

Wrapper may not exist for your favorite programming language.

Unclear whether libaio supports network I/O on all systems — may only support disk I/O. When this happens, the library falls back to using normal synchronous blocking I/O.

You should try out this I/O model if your programming environment has support for it and it either has support for network I/O or you don’t need it.

Conclusion

In conclusion, you should use synchronous blocking I/O when you are writing small apps which won’t see much traffic. For more intense applications, you should definitely use one of the two asynchronous models. If possible, avoid synchronous non-blocking I/O at all costs.

Remember that the goal is to increase I/O throughput to scale your application to withstand thousands of requests per second. Doing any sort of blocking I/O in your application can (depending on threading model) cause your entire application to block, increasing latency and slowing the user experience to a crawl.