IncludeOS: a unikernel for C++ applications

Is it truly an efficient use of cloud computing resources to run traditional operating systems inside virtual machines? In many cases, it isn't. An interesting alternative is to bundle a program into a unikernel, which is a single-tasking library operating system made specifically for running a single application in the cloud. A unikernel packs everything needed to run an application into a tiny bundle and, in theory, this approach would save disk space, memory, and processor time compared to running a full traditional operating system. IncludeOS is such a unikernel; it was created to support C++ applications. Like other unikernels, it is designed for resource-efficiency on shared infrastructure, and is primarily meant to run on a hypervisor.

Frequently, virtual machines end up running a full server operating system, though the entire instance is devoted to running only a few applications or even just one. As a result, every instance running on a physical machine carries its own full set of services and binaries, which are needlessly replicated. Unikernel developers take the opportunity to aggressively pare down the operating system to a bare minimum. Unikernels are at the extreme end of the possible answers to the question "how small can you make an operating system?" A unikernel is an instance of a single program "baked together" with a small library that provides the operating system and acts as an interface to the (virtual) hardware.

A history of unikernels

The idea of shrinking the operating system has its roots in microkernel research, which was spurred by monolithic kernels that were growing in size and complexity to unwieldy levels. A microkernel implements only a tiny amount of necessary functionality in privileged mode (such as interrupt handling, low-level memory management, and scheduling), with the rest being implemented as servers in user space. Exokernels, which were proposed by systems researchers at MIT in the 1990s, take the concept further by implementing most of the operating system as custom libraries linked to applications. This concept of library operating systems proved popular, and a number of projects were created around it, such as Nemesis from the University of Cambridge, and Drawbridge from Microsoft Research.

The term unikernel was proposed by a group of operating systems researchers in a paper [PDF] from 2013 that described their MirageOS project. While earlier library operating system projects included various drivers to support a multitude of hardware, much like a traditional operating system, unikernels were designed to run primarily on virtual hardware, so they do not need as much driver support. A unikernel is also compiled with just enough of the library to support the application contained within it, and nothing more. The idea is that unikernels could be deployed side by side on a hypervisor, much like regular programs are run on a traditional operating system.

Unikernels address the use case of needing strong isolation for a user's application on shared infrastructure. Multi-tenancy on clouds means that every user's application must be completely separated from those of others, but requiring each user to run a full operating system is wasteful. Unlike Linux containers, which share a single instance of the kernel that partitions users' applications using namespaces, control groups, and security policies, unikernels benefit from the stronger resource isolation of hypervisors. They get that isolation while being nearly as lightweight as a container. The drawback of unikernels is that users are constrained by the operating system interfaces that the unikernel library provides.

The choice of programming language for a unikernel application also depends on the underlying library support for it. IncludeOS supports C++, while MirageOS uses OCaml as its target programming language; other unikernel projects have been created that support languages like Haskell (HaLVM) and Erlang (LING). A collection of links to active unikernel projects is also available online.

IncludeOS

IncludeOS is a project to create a C++ API for the development of unikernel-based applications. When an application is built using IncludeOS, the development toolchain will link in the parts of the IncludeOS library required to run it and create a disk image with a bootloader attached. An IncludeOS image can be hundreds of times smaller than the Ubuntu system image for running an equivalent program. Start times for the images run in the hundreds of milliseconds, making it possible to spin up many such virtual machine images quickly.

When an IncludeOS image boots, it initializes the operating system by setting up memory, running global constructors, and registering drivers and interrupt handlers. In an IncludeOS unikernel, virtual memory is not enabled, and a single address space is used by both the application and the unikernel library. Therefore there is no concept of system calls or user space; all operating system services are called with a simple function call to the library and all run in privileged mode.

The unikernel is also single-threaded, and there is no preemption. Interrupts are deferred when they happen, and attended to at every iteration of the event loop. The design suggests user programs also be written to follow the asynchronous programming model, with callbacks installed to respond to operating system events. For example, a TCP socket can be set up in a user program and a callback inside the application handles the connection when a third party attempts to connect.
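The deferred-interrupt model described above can be illustrated with a minimal sketch in standard C++. It is not the IncludeOS API: the EventLoop class, its subscribe(), raise(), and poll() methods, and the interrupt numbers are all hypothetical names chosen for this example. The point is only that the "interrupt" path records an event, while the registered callbacks run later, from the loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the IncludeOS API.
class EventLoop {
public:
    static constexpr std::size_t kMaxIrq = 8;
    using Handler = std::function<void()>;

    // Install the callback that runs when interrupt `irq` is serviced.
    void subscribe(std::size_t irq, Handler handler) {
        assert(irq < kMaxIrq);
        handlers_[irq] = std::move(handler);
    }

    // What a low-level interrupt stub would do: record the event and return.
    void raise(std::size_t irq) {
        assert(irq < kMaxIrq);
        pending_[irq] = true;
    }

    // One iteration of the loop: service every interrupt recorded so far.
    // Returns the number of callbacks that ran.
    std::size_t poll() {
        std::size_t serviced = 0;
        for (std::size_t irq = 0; irq < kMaxIrq; ++irq) {
            if (!pending_[irq]) continue;
            pending_[irq] = false;
            if (handlers_[irq]) {
                handlers_[irq]();
                ++serviced;
            }
        }
        return serviced;
    }

private:
    std::array<Handler, kMaxIrq> handlers_{};
    std::array<bool, kMaxIrq> pending_{};
};

// Raises two interrupts, then polls once, logging the order of events.
std::vector<std::string> demo_deferred_interrupts() {
    std::vector<std::string> log;
    EventLoop loop;
    loop.subscribe(1, [&log] { log.push_back("timer"); });
    loop.subscribe(3, [&log] { log.push_back("network"); });

    loop.raise(3);            // recorded only; no callback runs yet
    loop.raise(1);
    log.push_back("raised");
    loop.poll();              // callbacks run here, in interrupt-number order
    return log;
}
```

Calling demo_deferred_interrupts() from a program's main() yields a log in which "raised" appears before either callback's entry, since nothing runs until poll() is called. A real unikernel would halt the CPU between iterations rather than return to a caller; that part is omitted here.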

An advantage of IncludeOS's minimalist design is the reduction of the attack surface for the application. With a self-contained application appliance, there are no shells or other tools that would be helpful to an attacker if they manage to compromise the application. Additionally, the stack and heap locations are randomized to discourage attackers.

IncludeOS does not implement all of POSIX. The developers intend to implement parts of POSIX only as needs arise, and it is unlikely that they will ever pursue full POSIX compliance as a goal. Currently, there are no blocking calls implemented in IncludeOS, as the event-loop model is the favored way to use it. IncludeOS also lacks a writable filesystem at this point.

There are plans in the pipeline to implement threads as fibers, which are a cooperative form of threading. Since there is no preemption in IncludeOS, fibers yield voluntarily to give other fibers a chance to run. Apart from some standard C++ library calls, a special IncludeOS API is used to help construct applications as unikernels.
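To make the cooperative model concrete, here is a small sketch in standard C++. It does not show how IncludeOS plans to implement fibers, and it substitutes a simpler technique for real stack-switching fibers: each "fiber" is a step function that runs until its next voluntary yield point and reports whether it has more work to do. CooperativeScheduler, spawn(), and run() are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names; not the IncludeOS API. A "fiber" is modelled as a
// step function: one call runs it up to its next voluntary yield and
// returns true if it still has work left.
class CooperativeScheduler {
public:
    using Step = std::function<bool()>;

    void spawn(Step step) {
        assert(step && "a fiber must be callable");
        ready_.push_back(std::move(step));
    }

    // Round-robin until every fiber has finished. Nothing is preempted:
    // a step that never returns would starve all the other fibers.
    void run() {
        while (!ready_.empty()) {
            Step step = std::move(ready_.front());
            ready_.pop_front();
            if (step()) ready_.push_back(std::move(step));
        }
    }

private:
    std::deque<Step> ready_;
};

// Runs two fibers that each record their name, then yield.
std::vector<std::string> demo_fibers() {
    std::vector<std::string> trace;
    CooperativeScheduler sched;

    // Builds a fiber that records `name` `count` times, yielding after each.
    auto counter = [&trace](std::string name, int count) {
        return [&trace, name, remaining = count]() mutable {
            trace.push_back(name);
            return --remaining > 0;
        };
    };

    sched.spawn(counter("A", 2));
    sched.spawn(counter("B", 3));
    sched.run();
    return trace;
}
```

In demo_fibers(), the two fibers take turns until A finishes, after which B runs alone. The trade-off is the one the text describes: with no preemption, fairness depends entirely on each fiber yielding promptly.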

Business model

IncludeOS started off as a university research project at Oslo and Akershus University College of Applied Sciences; it was developed by Alfred Bratterud and his associates. The project spun off into a startup, founded by Bratterud together with Per Buer. IncludeOS is distributed under the Apache 2.0 license, with the code available on GitHub. Outside of the company, there is a small community of around a dozen volunteer contributors. Although most contributions from volunteers are small bug fixes, there have been some considerable contributions by IBM, which added support for running IncludeOS on ukvm.

As a company, IncludeOS is still in the early stages. According to Buer, most of the funding it has received is in the form of grants from the Norwegian government. The code for the IncludeOS unikernel is open source, but there is a plan to create proprietary enterprise management tools for running unikernels in large deployments in data centers and in the cloud. The company has acquired a customer that it is adding features for, such as network load balancing, a firewall, and additional hardening of the codebase. Other missing features will be added as needed, which will primarily be driven by the business needs of customers.

Trying it out

Currently, there are no IncludeOS packages for Linux, but there are instructions on how to create a unikernel from the source code. IncludeOS works on KVM/QEMU and VirtualBox; in theory it could also boot on bare-metal hardware, but this has not been verified.

Since the code is not yet meant for production, the results of following the instructions may vary. I tried multiple installations in different versions of Ubuntu and got as far as compiling a unikernel image and running the sample application, which is an HTTP server. However, the network bridging between the unikernel and its host was not set up right, and thus I could not connect to it from a web browser. Despite the helpful support from members of the developer community in the IncludeOS development chat room, something in my setup caused problems that could not be reproduced. The compilation and installation scripts are rough around the edges, so any user trying them out may also face problems. Ubuntu users will need at least version 16.04 to build the latest version of IncludeOS.

Conclusion

Despite the popularity of cloud computing and virtualization, we are still trying to figure out the best ways to take advantage of the technology. Containers grew out of the desire for lightweight partitioning of guest applications, but unikernels appear to provide an even better option with stronger isolation. The downside of a completely new operating system and programming paradigm is that most legacy software will not work on it without significant modification. However, lightweight, virtualized, and isolated software appliances are a logical way to run applications in the cloud; as IncludeOS and other unikernels become more sophisticated, such appliances may become the primary method of deploying cloud services. With several different competing unikernel projects taking off, it will be interesting to see how IncludeOS (and the unikernel paradigm itself) fares against more traditional operating systems. Unikernels are highly specialized, and it remains to be seen if the lightweight virtualization aspect of deployment is enough of an incentive for developers to invest time and resources into building applications in this manner.

[I would like to thank Per Buer and the rest of the IncludeOS development community for their feedback when writing this article.]