24 Jan 2016

A while back I tried installing NixOS on my laptop. After a few weeks of using it I ended up wiping it and moving back to Arch Linux. NixOS was intriguing to me because it meshes well with my philosophy towards system administration: Your hard drive is object code; you had better be able to build it from source.

That philosophy is increasingly common in industry, and there are a number of tools out there designed to support it. Puppet, Ansible, and Chef are some popular examples. All of them share a general approach:

Write a specification of the system in the form that the tool expects. For Chef, this is just Ruby, though they provide some libraries. Puppet has its own custom DSL for this. Ansible also has its own custom DSL, but they pretend it’s just simple YAML config files.

Have a tool which will bring the system in line with the specification. Importantly, this operation is idempotent, i.e. applying it multiple times will have the same effect as doing so once. This means you don’t have to worry about whether something has already been done; you just hit apply and it does the right thing.
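To make the idempotence property concrete, here is a minimal Python sketch. It is not taken from any of these tools; the helper name `ensure_line` and the file contents are made up for illustration. It brings a file in line with a one-line "spec" and reports whether it had to change anything, so a second run is a no-op.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` appears as a line of the file at `path`.

    Returns True if the file was changed, False if it already complied.
    (Hypothetical helper for illustration; not part of any real tool.)
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if line in content.splitlines():
        return False  # already in the desired state; do nothing
    with open(path, "a") as f:
        # Avoid gluing the new line onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True


def demo():
    """Apply the same spec twice; return (first_changed, second_changed, contents)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hosts")
        first = ensure_line(path, "127.0.0.1 localhost")
        second = ensure_line(path, "127.0.0.1 localhost")
        with open(path) as f:
            return first, second, f.read()


if __name__ == "__main__":
    print(demo())
```

The function checks the current state before acting, which is what lets you "just hit apply" without tracking what has already been done.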

The tools have some differences, but they aren’t very interesting for the purpose of this post.

There are some deficiencies common to all of them as well. Most importantly, by default they don’t “own” the system; they’ll do what their spec says, but they’ll leave everything else alone. This is a problem because it makes it easy for differences between the running system and the spec to “sneak in.” If this happens, the administrator may be in for an unpleasant surprise when they try to rebuild the system from scratch: it turns out their source code doesn’t work.

Puppet has a purge option which lets you specify that any resources not explicitly managed by Puppet should be removed. This sounds like a solution, but unfortunately it’s not very smart. Given this manifest:

    # demo.pp

    # All files in /usr/bin which are not explicitly managed by puppet
    # should be removed. We probably want this for most of the system,
    # excluding a few directories like /home, /var/lib, /tmp...
    #
    # I tried to just do /usr, but puppet is choking on a socket that's
    # in there somewhere. /usr/bin will do for demo purposes.
    file { '/usr/bin':
      ensure  => directory,
      recurse => true,
      purge   => true,
    }

    # Specify that all of the packages on my system are supposed to be
    # there. What we'd like to see is that this prevents their files from
    # being purged, but...
    package { [
        # List of all of the packages on my system goes here.
      ]:
      ensure => present,
    }

Puppet will do this:

    # puppet apply --noop demo.pp
    Notice: Compiled catalog for vulcan in environment production in 0.49 seconds
    Notice: /Stage[main]/Main/File[/usr/bin/2csv]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/2html]/ensure: current_value link, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/2to3]/ensure: current_value link, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/2to3-2.7]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/2to3-3.5]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/2xml]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/3dsp]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/7z]/ensure: current_value file, should be absent (noop)
    Notice: /Stage[main]/Main/File[/usr/bin/7za]/ensure: current_value file, should be absent (noop)
    ...

Puppet is unaware that those files are owned by packages that it’s managing (and thus should be considered to be managed by Puppet), and so it blindly plows ahead deleting gobs of files critical for my system to function at all. It’s a good thing I used --noop, which tells it to just print out what it would do! This is obviously not what I want.
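The behaviour I would want instead can be sketched in a few lines of Python. This is not how Puppet works; the function and the sample data (package names and file lists) are made up for illustration. The idea is that a file counts as managed if it is either declared directly or owned by a package that the spec declares, and only the remainder is a candidate for purging.

```python
def files_to_purge(on_disk, declared_files, declared_packages, package_files):
    """Return the sorted list of files that nothing in the spec accounts for.

    on_disk           -- set of paths found in the purged directory
    declared_files    -- set of paths the spec manages directly
    declared_packages -- set of package names the spec says should be present
    package_files     -- dict mapping package name -> set of paths it owns
                         (on a real system this would come from the package
                         manager's database; here it is hard-coded)
    """
    managed = set(declared_files)
    for pkg in declared_packages:
        managed |= package_files.get(pkg, set())
    return sorted(on_disk - managed)


# Hypothetical sample data, loosely modelled on the output above.
ON_DISK = {"/usr/bin/7z", "/usr/bin/7za", "/usr/bin/2to3", "/usr/bin/my-old-script"}
PACKAGE_FILES = {
    "p7zip": {"/usr/bin/7z", "/usr/bin/7za"},
    "python": {"/usr/bin/2to3"},
}

if __name__ == "__main__":
    # Only the file that no declared package owns is reported.
    print(files_to_purge(ON_DISK, set(), {"p7zip", "python"}, PACKAGE_FILES))
```

With both packages declared, only the stray script is left over; with no packages declared, everything in the directory is, which is roughly what the purge above amounts to.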

Tools like this also aren’t a complete solution, because they don’t deal with initial installation at all. Often the companies that build these tools have proprietary products that do that sort of thing. That’s not a situation I’m thrilled with, but there are FOSS tools like Foreman that work similarly, and some simpler tools like Kickstart in the Red Hat world, and preseed files for Debian.

NixOS is more aggressive than this. Rather than writing an extended description of what it’s all about, I’ll just refer you to their own site. The package manager can be used in a stand-alone fashion (on systems other than NixOS), but it’s playing a different role in that case. I’ll talk mostly about NixOS, rather than other uses of Nix and Nixpkgs.

NixOS makes it very easy to have the spec “own the system.” I have great confidence that plopping my configuration.nix on a new machine and using the same Nixpkgs tree will do the right thing. However, the system has some other quirks that make it unattractive to me.

NixOS is very aggressive about making sure undeclared dependencies don’t leak into your system, particularly when packages are being built. Dependencies of a package aren’t necessarily exposed on the final system. For example, if you install virtualenv, the python executable won’t necessarily be in your $PATH; it will be tucked away in some place like /nix/store/fhqwhgadsfhqwhgadsfhqwhgadsfhqwh-python-2.7.11...
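The effect is easy to reproduce in miniature. The following Python sketch builds a fake store and a fake profile directory in a temporary location; all directory and file names are made up, it assumes a Unix-like system, and it is not how Nix itself is implemented. The dependency exists on disk, but a lookup through the search path does not find it.

```python
import os
import shutil
import stat
import tempfile


def make_executable(path):
    """Create a stub script at `path` and mark it executable."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


def demo():
    """Return (lookup result for python, virtualenv found?, store copy exists?)."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: the dependency lives only in the store...
        store_python = os.path.join(root, "store", "abc123-python", "bin", "python")
        make_executable(store_python)
        # ...while only the tool the user asked for is exposed in the profile.
        profile_bin = os.path.join(root, "profile", "bin")
        make_executable(os.path.join(profile_bin, "virtualenv"))

        # Search only the profile directory, as a shell with that PATH would.
        found_python = shutil.which("python", path=profile_bin)
        found_venv = shutil.which("virtualenv", path=profile_bin)
        return found_python, found_venv is not None, os.path.exists(store_python)


if __name__ == "__main__":
    print(demo())
```

Anything that wants the hidden dependency has to be told its absolute location, which is where the extra work for packagers comes from.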

The trouble with this is that it often requires packagers to do a lot of mucking with upstream software. Getting some packages to look for binaries, headers, shared libraries and so forth in unusual places can be finicky. This adds opportunities for bugs, and I’ve hit a few of them.

It also makes using language package managers inside the equivalent of virtualenv, cabal sandbox and friends… less than trivial. I’ve had trouble installing Python C extensions in a virtualenv. As an example, with libxml2 installed on the host system, installing the Python binding in a virtual environment didn’t work because the headers aren’t in the expected place; the bindings look in /usr/include, and they aren’t there. If the distro is packaging the bindings, it can patch them to make this work, but it’s not clear there’s a good way to get this to work with stock sources. I’ve also run into a few things that aren’t packaged, and packaging them turned out to be much more work than on most distros. The main difficulty was getting them to find their dependencies in /nix/store at runtime.

NixOS also delivers some features not relevant to the goal of “system as object-code,” and they make things yet more complex, and again, introduce opportunities for bugs. NixOS allows unprivileged users to customize their environment, which is a neat feature, but not something I need. One problem this feature raises for the implementation is how to deal with setuid binaries. NixOS has a solution for this, but it introduces some pitfalls. Here’s a handful of related bugs:

Doing nonstandard things always results in a certain amount of friction of this sort, but NixOS has a lot of it due to trying to do several different ambitious things. Most of these I don’t care about; in particular, the ability to do controlled per-user changes in an ad-hoc fashion is something that, for my own systems, I actively do not want. To me, the entire appeal of NixOS is the ability to enforce discipline; I want to know that I haven’t shot myself in the foot by forgetting to put some experimental change in the system config file.

After bumping up against a number of these issues, I got tired of fighting with the system and went back to Arch. One of the great virtues of Arch Linux, in my experience, is that it doesn’t change much from upstream, and very frequently things that need silly configuration tweaks or are just plain broken on other distros simply work. This is particularly true from the perspective of a package maintainer.

I have a few thoughts on how I might go about building something that does what I want from NixOS, but is more respectful of upstream software’s expectations (and thus easier to build and maintain):

Enough of this symlink nonsense. Use a filesystem that supports snapshots (e.g. zfs or btrfs) and whatever other functionality you need.

Do one thing and do it well. I want to be able to reliably rebuild my system from source. All of these other ambitious goals get in the way of that.

Don’t try to hide runtime dependencies. If virtualenv depends on python, then python needs to be on the system for it to work. NixOS does some trickery to keep programs that don’t depend on it directly from finding it. While there’s an argument for that, it strikes me as more trouble than it’s worth.

NixOS is a great idea, but it doesn’t quite cut it for me. For now, it’s back to managing my systems with Ansible. Maybe someday…