First, a small introduction or, more accurately, a bit of background about me. My formal education is not in computer science; my academic studies focused on philosophy, cinematography and literature. As such, I usually approach technology in a more “philosophical” or abstract manner than your typical engineer, with the concrete manner following as a close second. As far as programming languages go, I usually ask one primary question:

“How does this language change the way I perceive the task I’m about to tackle?”

That being the case, I tend to strongly emphasize the following:

In the microservice world, each service usually deals with a specific Problem Space that we want to solve.

Some languages solve certain aspects of specific Problem Spaces inherently.

Some languages introduce new issues and further complicate a specific Problem Space.

I won’t go into too much of the philosophical theory behind these three considerations. If anyone wants to delve deeper, I encourage you to read a bit about what Jacques Derrida calls the “blind spot” of a text and its effect on the reader. In relation to our world, our code is the text, and it has inherent “blind spots” introduced by both the writer and the actual language used!

So let’s get back to our original question: Why Clojure? (Clearly we get this question often).

To answer that question, it’s worth mentioning a bit about what AppsFlyer actually does. In essence, AppsFlyer is a mobile attribution and analytics platform. In human-parsable language, this means that marketers use us to ascertain how well their mobile applications are performing on the marketing front. To do that, we ingest a lot of advertising traffic and a lot of mobile traffic and perform our own magic/voodoo to generate those insights (and by a lot, I mean 55+ billion HTTP requests a day).

So what do we have here:

A heavy amount of traffic flowing through the system.

A complex system that knows how to synthesize actual attribution and analytics based on that traffic.

Further sub-systems that need to take care of the after-effects of this attribution: anything from postbacks to visualization.

OK, we now kind of know the actual Problem Space. If you were in our shoes seven years ago and had to write that kind of system, what programming language would you choose? The simplest, and most correct, answer would probably be “the language I’m most familiar with”.

Why is that the best answer? Because a startup can’t deal with the abstract concept of “ideal solution” — a startup needs to deliver as fast as possible, and you’ll probably do it the fastest with what you already know.

When HTTP requests grow from hundreds of millions to billions, you need to rethink your base assumptions.

But what happens if, a year later, your best dream comes true and your startup is now 10X or 100X bigger than it was last year? Well, you can go down one of two routes: continue your course and hope the tech you chose will serve you best in the years to come, or stop and try to make an educated and hard decision.

So, why did we want to swap out our Python backend? Well, we had various reasons, but the main one was scale — we had a hard time dealing with the incoming traffic and we wanted to use a language where it’s easier to do parallel and concurrent work.

Clojure is a Lisp dialect that runs on the JVM. It brings the following concepts to the forefront:

Functional Programming: This revolves mainly around the notion of modeling. Instead of trying to model the Problem Space into classes that encapsulate their internal functions, the functional paradigm promotes the idea of functions as first-class citizens that are composable and enable data to flow through them. The actual classes (if they exist) are used mainly as containers of data, not as black boxes of logic.
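To make "composable functions with data flowing through them" a bit more concrete, here is a minimal sketch. The event maps and their keys (:app, :type, :cost) are invented for illustration and are not AppsFlyer's actual schema.

```clojure
;; Hypothetical events: plain maps, used only as containers of data.
(def events
  [{:app "game-a" :type :install :cost 2.0}
   {:app "game-a" :type :click   :cost 0.5}
   {:app "game-b" :type :install :cost 3.0}])

(defn install?
  "True when the event map describes an install."
  [event]
  (= :install (:type event)))

;; comp composes right to left: keep installs, pull out :cost, then sum.
(def total-install-cost
  (comp (partial reduce + 0.0)
        (partial map :cost)
        (partial filter install?)))

;; The same pipeline written with the thread-last macro,
;; which reads top to bottom in the order the data flows.
(defn total-install-cost-threaded [evts]
  (->> evts
       (filter install?)
       (map :cost)
       (reduce + 0.0)))

(println (total-install-cost events))
(println (total-install-cost-threaded events))
```

Both versions are built from the same small functions; none of them knows or cares where the data came from, which is what makes them easy to recombine.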

Immutability: which is, for me, the crux of functional programming. Immutability means that you can send your data into two different functions, and you’re guaranteed that your original data won’t change. Why is that so good? Well, with immutability you don’t have to fear the following question: “will this function change my data? If so, do I need to synchronize access to it so that another function won’t also try to change it at the same time?” The concept of computational safety when working in parallel or concurrently is essential to the ease of scaling big systems. The only sane way of doing it is via immutability. There’s a lot to be said about mutable vs. immutable, so we’ll be sure to cover this more deeply in future posts.
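As a small illustration of that guarantee, the sketch below passes one map to two different functions. The map and the function names (add-click, anonymize) are made up for the example.

```clojure
;; A hypothetical record, modeled as an ordinary immutable map.
(def original {:user "alice" :clicks 3})

(defn add-click
  "Returns a NEW map with :clicks incremented; the argument is untouched."
  [m]
  (update m :clicks inc))

(defn anonymize
  "Returns a NEW map with :user replaced; the argument is untouched."
  [m]
  (assoc m :user "anonymous"))

(def clicked    (add-click original))
(def anonymized (anonymize original))

;; Each function saw the same input and produced its own result.
(println original)    ; unchanged
(println clicked)
(println anonymized)
```

Because neither function can alter `original`, the two calls could just as well run on different threads with no locking around the shared value.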

Lisp: Clojure is a Lisp and, as such, it comes with two important attributes:

Simplicity: Lisp, once you “grok” it, is actually pretty simple. In Clojure you essentially have only four basic collections, and a wide range of functions that work on all of them. Collections nest easily inside one another, and calling (first xs) always does the same thing, returning the first element of xs, regardless of whether xs is a list, vector, map or set (for a map that element is a key/value entry, and maps and sets make no promise about order). Also, the “dreaded” parentheses are simply the best indicator of your lexical scope: if it’s in the parentheses, then it’s in scope, simple as that.

Data as a first-class citizen: If functional programming promotes functions as first-class citizens, Lisp makes data a first-class citizen as well. Data is no longer “hidden” behind often obscure classes; it is at the forefront and is modeled very easily using native data structures, usually a map. Data is such a prominent “citizen” of Clojure that maps are themselves functions of their keys, so you can invoke something like (a-map :key) to fetch the value of :key from a-map. Besides that, Lisps are homoiconic (data is code and code is data), and idiomatic Clojure leans heavily on pure, referentially transparent functions.
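The points above can be tried directly at a REPL. This is only a sketch; the names a-map and form are placeholders chosen for the example.

```clojure
;; `first` works on the sequence view of every core collection.
(println (first '(1 2 3)))   ; list
(println (first [1 2 3]))    ; vector
(println (first {:a 1}))     ; map: the first element is a key/value entry
(println (first #{42}))      ; set (a single element, so order is not an issue)

;; Maps are functions of their keys, and keywords are functions of maps.
(def a-map {:key "value"})
(println (a-map :key))
(println (:key a-map))
(println (a-map :missing "default"))  ; optional not-found value

;; Homoiconicity: a quoted form is just a list that can be inspected
;; like any other data, and then evaluated as code.
(def form '(+ 1 2))
(println (first form))  ; the symbol +
(println (eval form))
```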

Concurrent/Parallel Paradigms: Clojure has native, easy-to-use support for concurrency and parallelism. We have futures, promises, pmap, atoms and agents, as well as refs, which are coordinated by Software Transactional Memory (STM). Besides the native support that resides in the clojure.core namespace, we also have the core.async library (namespace clojure.core.async), the Clojure way of doing CSP (communicating sequential processes), only in Clojure we don’t have to be sequential. It lets us model systems using channels and buffers, and this notion had a far-reaching impact on AppsFlyer’s overall system design.
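Here is a minimal sketch of a few of those clojure.core primitives working together. It deliberately leaves out core.async, which is a separate dependency, and the workload (ten workers incrementing a counter) is invented purely for illustration.

```clojure
;; An atom holds shared state; swap! applies a function to it atomically.
(def counter (atom 0))

;; Ten futures, each incrementing the counter 1000 times on its own thread.
;; doall forces the lazy `for` so all futures start immediately.
(def workers
  (doall
    (for [_ (range 10)]
      (future (dotimes [_ 1000] (swap! counter inc))))))

;; deref blocks until a future finishes; wait for all of them.
(run! deref workers)
(println @counter)

;; pmap is a parallel version of map; doall realizes the result here.
(def squares (doall (pmap #(* % %) (range 5))))
(println squares)

;; A promise is a one-time hand-off between threads.
(def answer (promise))
(future (deliver answer 42))
(println @answer)

;; In a standalone script, release the thread pool that backs futures and
;; pmap so the JVM can exit promptly. Skip this line in a long-lived REPL.
(shutdown-agents)
```

No locks appear anywhere: the only mutable thing is the atom, and every update to it goes through `swap!`.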

So, again we ask the question: why Clojure? Sure, a lot of functional programming languages share some of the principles described above, but none of them is both a Lisp (which is ideal for processing an endless stream of data, which is how we model ourselves at AppsFlyer) and a language that runs on the JVM. Yes, there are really good VMs out there (the BEAM is amazing, for example), but the JVM simply has one of the world’s largest ecosystems of open source libraries, all of them easily invoked from Clojure.
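To give a feel for what invoking Java from Clojure looks like, the sketch below uses only classes that ship with the JDK; third-party libraries are called the same way once they are on the classpath. The name cache is a placeholder for the example.

```clojure
(import '(java.util.concurrent ConcurrentHashMap)
        '(java.time Duration))

;; (ClassName.) calls a constructor; (.method obj args) calls an instance method.
(def cache (ConcurrentHashMap.))
(.put cache "requests" 55)
(println (.get cache "requests"))

;; ClassName/method calls a static method.
(println (.toMinutes (Duration/ofHours 2)))
(println (Math/max 3 7))
```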

To return to the start of this post for a moment, I just want to reiterate that we modeled AppsFlyer as an endless stream of incoming and outgoing data. Once we did that, the Lisp “philosophy” lent itself perfectly to solving this specific Problem Space. Actually, it fits together so perfectly that I have a hard time deciding whether we modeled AppsFlyer as an endless stream of data because of Clojure or after we started using it. The fact remains that, for us, it’s the perfect tool for the job.