We often hear that Project FiFo uses an unusual technology stack. Looking back over the past few years of development, the stack has served us well and continues to prove itself the right choice. What follows is the rationale behind each component, along with some of the experiences we have had along the way.

The OS

We choose to run FiFo and all related services on SmartOS, and the reason is simple: since FiFo manages SmartOS clouds, it would be silly not to “eat our own dog food” and to run FiFo on something else instead.

We use vanilla SmartOS, so your running VMs have no dependency on FiFo. You could switch FiFo off and all your VMs would simply keep working. SmartOS also comes with a number of great advantages:

In our opinion, ZFS is simply the only file system that should ever be used, period.

Compression, ARC and ZIL work incredibly well, especially for DalmatinerDB, which achieves amazing throughput partly because it is purpose-built to take advantage of these features.

ZFS checksumming gets rid of a lot of potential headaches.

Zones are a wonderful way of achieving isolation. They allow FiFo to be deployed on the same infrastructure it manages, which removes the need for the extra dedicated servers that other cloud systems require and lets FiFo manage itself.

DTrace has helped a lot with debugging, especially for DalmatinerDB, where performance matters a lot more than for the other components; this has been a big win. Custom DTrace probes sprinkled throughout the system allow for accurate tracking and insight into what is going on. DTrace has helped us find some very interesting bugs that might not otherwise have been discovered.

Low-level code

For obvious reasons, our low-level code is written in C so that it can interface efficiently with C libraries such as DTrace, kstat and libzdoor. There is not much of a way around this, and it has been one of our biggest pain points. A good part of that pain is probably due to our lack of experience with C and the lack of good tooling, but, good grief, multi-threaded C code can be a real pain! On the upside, Erlang’s NIFs make integrating with C code rather easy.

Control plane (backend)

Erlang was designed for control plane applications, so it is a perfect fit. The whole design of the language suits the task wonderfully, and its building blocks make it a lot easier than in other languages to build reliable code. The failure characteristics are fantastic and the visibility is nearly unmatched. The ability to cluster multiple VMs has been a major factor in our design. Libraries like riak_core are the cornerstones of the highly available design that FiFo offers. All this has made it possible to build a masterless system that can keep operating in the face of node failures or network partitions.

Some people find the syntax odd, yet in our experience it matches the task rather well: it makes it easy to express the kind of problems we’re dealing with and to catch problems before they can cause waves. The failure recovery that comes with the FiFo design is outstanding. More than once, a small problem that might have crashed a less capable system has been prevented from escalating by a supervisor restarting a process, giving us time to analyze and fix the issue without any impact.
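
To make the idea of a supervisor restarting a failed process concrete, here is a minimal, language-agnostic sketch written in Python rather than Erlang. It is not FiFo code and not the OTP API: the names supervise, make_flaky_worker and RestartLimitExceeded are hypothetical, and the restart limit is a plain counter, whereas a real OTP supervisor counts restarts within a time window. It only illustrates the principle that a crashed worker is started again from a clean state instead of taking the whole system down.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class RestartLimitExceeded(RuntimeError):
    """Raised when the worker keeps crashing past the allowed restart count."""


def supervise(worker: Callable[[], T], max_restarts: int = 3) -> T:
    """Run `worker`, starting it again whenever it raises.

    Gives up once the worker has already been restarted `max_restarts`
    times and crashes again (a simplified stand-in for a restart limit).
    """
    restarts = 0
    while True:
        try:
            return worker()
        except Exception as crash:
            if restarts >= max_restarts:
                # Too many crashes: escalate instead of looping forever.
                raise RestartLimitExceeded(
                    f"worker crashed {restarts + 1} times"
                ) from crash
            restarts += 1
            print(f"worker crashed ({crash!r}); restart {restarts}/{max_restarts}")


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    remaining = {"count": failures}

    def worker() -> str:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ConnectionError("transient fault")
        return "ok"

    return worker


if __name__ == "__main__":
    # Two transient faults are absorbed by restarts; the caller only sees "ok".
    print(supervise(make_flaky_worker(failures=2)))
```

The point of the sketch is the shape of the recovery: the transient fault is logged and absorbed, and only a fault that keeps recurring is escalated. In Erlang this role is played by OTP supervisors, which restart child processes according to a configured strategy.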

Given that FiFo is a distributed application, we deal a lot with sockets. HTTP, while nice on the edge, simply isn’t a good format for internal communications. Erlang’s handling of sockets makes using TCP easy. We built on that by adding a library for building client and server applications around mDNS discovery, and we use it to extend automatic failure recovery to the edge of services.

The UI (frontend)

After some rather unpleasant experiences with angular.js, we ended up rewriting the entire UI in ClojureScript / Om. Both are extraordinary systems: Clojure is a joy to write, and Om makes web development at least halfway fun. The expressiveness of Clojure is stunning, and as one can guess from the fact that we use Erlang in the backend, we do have a thing for functional programming. There is also om-bootstrap, which provides a convenient wrapper around many Bootstrap components. While still a bit rough around the edges, it is very nice for getting something up quickly and makes responsive design a lot easier than starting from scratch yourself.