First thoughts on Liberator

January 26, 2014

I’ve been using Liberator to build a backend API for my Sokoban game.

Liberator, a Clojure library, offers a truly different and mind-bending approach to web apps. You define a resource:

(defresource hello-resource
  :available-media-types ["text/plain"]
  :allowed-methods [:get]
  :handle-ok (fn [_] "Hello, world!"))

And add it to a Ring handler, most likely using Compojure:

(defroutes app
  (ANY "/hello" [] hello-resource))

We can see a small quirk of the Compojure/Liberator pairing: normally Compojure is in charge of which methods are allowed, and the ANY in that route would instead be a GET (or POST, PUT, etc.).

But Liberator handles stuff like that. If you post to the endpoint, it produces a semantically-correct “405 Method Not Allowed” response and adds an Allow: GET header, as the HTTP spec requires. In contrast, with Compojure, attempting to post to a resource that only has a GET route defined yields a not-quite-right 404.
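To make the difference concrete, here is a toy Ring-style wrapper in plain Clojure that mimics the 405 behavior. It is only an illustration, not how Liberator implements it; method-guard and hello are names invented for this sketch.

```clojure
(ns example.method-guard
  (:require [clojure.string :as str]))

(defn method-guard
  "Wraps a Ring-style handler so that a request whose method is not in
  `allowed-methods` (a set of keywords) gets a 405 with an Allow header.
  Hypothetical helper, not part of Liberator or Compojure."
  [allowed-methods handler]
  (fn [request]
    (if (contains? allowed-methods (:request-method request))
      (handler request)
      {:status 405
       ;; List the permitted methods, e.g. "GET, PUT".
       :headers {"Allow" (->> allowed-methods
                              (map (comp str/upper-case name))
                              sort
                              (str/join ", "))}
       :body "Method not allowed."})))

(def hello
  (method-guard #{:get}
                (fn [_] {:status 200 :headers {} :body "Hello, world!"})))
```

Calling (hello {:request-method :post}) yields a 405 response map, whereas a router that only knows a GET route would have nothing to match and would fall through to a 404.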

That’s Liberator’s big idea: automatic implementations of correct HTTP behavior. It deals with all the nitty-gritty of negotiating content types, character sets and languages. You just tell it what it needs to know to make those decisions.

Example: a blog post

Here’s a longer example, a post on a blogging site. Anyone can read a blog post (with a GET), but only the user who wrote it can edit it (with a PUT) or DELETE it.

(defresource blog-post-resource [id]
  :allowed-methods [:get :put :delete]
  ;; Return 503 Service Unavailable if DB is down.
  :service-available? (fn [_] (is-the-database-up?))
  ;; Return 401 Unauthorized if method is not GET and a correct
  ;; authorization header isn't included.
  :authorized? (fn [ctx]
                 (if (= (get-in ctx [:request :request-method]) :get)
                   true ; Anyone can simply read the post, no auth required.
                   ;; If we're editing or deleting, then first decode the HTTP
                   ;; Authorization header, parsing the username and
                   ;; password...
                   (when-let [[user pass] (some-> ctx
                                                  (get-in [:request :headers "authorization"])
                                                  parse-basic-auth)]
                     ;; Then check that they're correct.
                     (when (password-correct? user pass)
                       {:logged-in-user user}))))
  ;; Return 403 Forbidden if method is not GET and the user who logged
  ;; in didn't write this blog post.
  :allowed? (fn [ctx]
              (or (= (get-in ctx [:request :request-method]) :get)
                  (= (:author (lookup-blog-post id))
                     (:logged-in-user ctx))))
  :handle-ok (fn [ctx] "render the blog post here")
  :put! (fn [ctx] "update the blog post with edited content")
  :delete! (fn [ctx] "delete the blog post"))

That’s a lot at once. I’ll go through bit by bit.

First, service-available? is one of a number of decision functions Liberator will use if you define them. Whether it returns a truthy or a falsy value determines which path Liberator will take through its HTTP state machine. If you look at the transition graph (huge image — requires panning and zooming), you’ll see that service-available? is the first decision point. If false, Liberator immediately returns a “503 Service Unavailable” error and won’t call our other functions.

Next we check authorized?, then allowed?. These are easily confused. Basically, “authorized” means the server knows who you are, and “allowed” means you’re permitted to do the thing you’re trying to do.

If it’s a GET request, we figure the post is public and anyone can read it, so we simply return true in both functions.

Otherwise, authorized? looks for “Authorization” in the request headers, and upon finding it, tries to parse it as HTTP basic authentication, using a utility function parse-basic-auth that I omit here. That produces a username and password. We send them to a hypothetical password-correct? function.
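For reference, here is one way such a helper could look. This is only a sketch of the omitted parse-basic-auth, written under the assumption that it receives the raw header value and returns a [user pass] vector, or nil when the value can't be parsed.

```clojure
(ns example.basic-auth
  (:require [clojure.string :as str])
  (:import (java.util Base64)
           (java.nio.charset StandardCharsets)))

(defn parse-basic-auth
  "Returns [user pass] for a well-formed Basic authorization header
  value, or nil if the header is missing or malformed."
  [header]
  (when-let [[_ encoded] (and (string? header)
                              (re-matches #"(?i)Basic\s+(\S+)" header))]
    (try
      (let [decoded (String. (.decode (Base64/getDecoder) ^String encoded)
                             StandardCharsets/UTF_8)
            ;; Split on the first colon only; passwords may contain colons.
            [user pass] (str/split decoded #":" 2)]
        (when pass [user pass]))
      ;; The decoder throws this when the input isn't valid base64.
      (catch IllegalArgumentException _ nil))))
```

Returning nil for every failure keeps the helper easy to use inside when-let and some->, at the cost of not distinguishing a missing header from a malformed one.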

Recall that when returns nil if its condition is untrue, so if the auth header is missing, or the password isn’t correct, we return nil and the request is considered not authorized.

In case the user/password combination is correct, we don’t return true, but rather a map:

{:logged-in-user "username"}

This is key to understanding Liberator. I fibbed when I said decision functions return truthy or falsy values. They can also return maps. If so, it’s treated as a truthy value and the map is merged into the request context (the ctx argument passed into each function).

So in all future decision, action and handler functions that execute after authorized?, including allowed?, we can refer to (:logged-in-user ctx) and this will evaluate to the username we’re returning.
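The context-threading idea can be modeled in a few lines of plain Clojure. This is a toy model, not Liberator's implementation: apply-decision and run-checks are invented names, the real library has many more decision points, and it may combine context maps differently than a plain merge.

```clojure
(ns example.decisions)

(defn apply-decision
  "Toy model of interpreting a decision function's return value.
  Returns [passed? new-ctx]."
  [ctx result]
  (cond
    ;; A map counts as truthy and is merged into the context.
    (map? result)    [true (merge ctx result)]
    ;; A [boolean map] vector sets the outcome and updates the context.
    (vector? result) (let [[passed? extra] result]
                       [(boolean passed?) (merge ctx extra)])
    ;; Anything else is just truthy or falsy.
    :else            [(boolean result) ctx]))

(defn run-checks
  "Runs [decision-fn failure-status] pairs in order, threading the
  context through, and stops at the first failing decision."
  [ctx checks]
  (loop [ctx ctx
         remaining checks]
    (if-let [[decide failure-status] (first remaining)]
      (let [[passed? new-ctx] (apply-decision ctx (decide ctx))]
        (if passed?
          (recur new-ctx (rest remaining))
          {:status failure-status :ctx new-ctx}))
      {:status 200 :ctx ctx})))

(def checks
  [[(fn authorized? [ctx]
      (when (= "s3cret" (:password ctx))
        {:logged-in-user (:user ctx)}))
    401]
   [(fn allowed? [ctx]
      (= (:author ctx) (:logged-in-user ctx)))
    403]])
```

Here the second check only works because the first one ran earlier and put :logged-in-user into the context.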

Much further on in the state graph, after a bunch of decisions I didn’t implement (which will therefore use the default behavior), we eventually get to the side-effecting function put! or delete! which actually makes a change. Or, if it was a GET, we proceed to handle-ok which renders the blog post. In my example code, these are obviously stubs.

The limits of state machines

I basically like the idea of Liberator (props to David Y. Kay for praising it last year and piquing my interest). If nothing else, it’s already changed how I think about HTTP and APIs.

But I’ve grown skeptical of Liberator’s state-machine model. I’ll go over some practical issues I see. Liberator is still new to me, and I undoubtedly have more to learn about using it effectively, so take my critique with a grain of salt.

Conceptually imperative

Liberator’s model is imperative in nature. Webmachine, the Erlang library on which Liberator is based, touts the referential transparency of its functions. If we disregard database reads, then sure, the decision functions are referentially transparent — all the state is encapsulated in a context variable.

But you still have state! A decision function that returns “true, and also here’s a context value to pass to the next function” isn’t really just a decision function. That style complects the computation of data with the evaluation of decisions based on that data.

You end up having to think about sequencing: by the time you get to :foo?, will :bar? have run? If not, you won’t have the :sputz value.

This is the kind of mental overhead declarative programming is supposed to help us avoid. It also makes our code more brittle. For example:

Handling malformed Authorization headers

Under HTTP basic authentication, the Authorization header has to have a certain format: the scheme name Basic followed by “user:pass”, base64-encoded. If we just base64-encode “foo”, or pass something that isn’t valid base64, we’ve violated the format. The code above will simply respond “401 Unauthorized” in that case, as if the Authorization header were missing entirely. But suppose we want to respond with a more correct “400 Bad Request”.

We thus add a malformed? decision function:

:malformed? (fn [ctx]
              (when-let [auth-header (get-in ctx [:request :headers "authorization"])]
                (if (parse-basic-auth auth-header) false true)))

If there’s an auth header, and it fails to parse, we return true.

Thing is, what we’ve written has many similarities to what :authorized? does. You may recall this part:

(when-let [[user pass] (some-> ctx (get-in [:request :headers "authorization"]) parse-basic-auth)]

Having two different bits of code that both look up [:request :headers "authorization"] in the context and then pass it to parse-basic-auth is a clear violation of DRY.

To remove the redundancy, we have to save user and pass on the context during the malformed? step and look them up in the context in authorized?.

:malformed? (fn [ctx]
              (when-let [auth-header (get-in ctx [:request :headers "authorization"])]
                (if-let [[user pass] (parse-basic-auth auth-header)]
                  ;; Next line is how you return false while also adding keys to
                  ;; the context map in Liberator.
                  [false {:unchecked-credentials [user pass]}]
                  true)))
:authorized? (fn [ctx]
               (let [[user pass] (:unchecked-credentials ctx)
                     logged-in-user (when (password-correct? user pass) user)]
                 (if (and (not= (get-in ctx [:request :request-method]) :get)
                          (not logged-in-user))
                   false
                   {:logged-in-user logged-in-user})))

A simple change has produced unnecessary code churn. We still parse the authorization header the same way, but because we now need it sooner, existing code has to be rearranged.

Returning JSON errors

Here’s another ugly consequence of the state-machine nature of Liberator. Suppose you want your error responses to include a JSON blob describing the error.

This is a bit of a stretch in the context of the blog post example, but humor me, because I did run into this with my Sokoban API.

Normally, Liberator will render plain hashmaps for you as plain text, JSON, CSV, HTML, EDN or the format of your choice, based on the client’s Accept header. For instance, if we don’t mind returning our blog post in a very raw form, we can write handle-ok like this:

:handle-ok (fn [ctx]
             (let [blog-post (lookup-blog-post id)]
               {:author (:author blog-post)
                :title (:title blog-post)
                :date (format-date (:created-date blog-post))
                :body (render-markdown (:body blog-post))}))

Then, an API consumer who sends Accept: application/json will get this:

{"author": "Scott Feeney", "title": "First thoughts on Liberator", "date": "January 26th, 2014", "body": "<p>I've been using Liberator to build a backend API for..." }

And this just works! No need to explicitly call a JSON lib or check if the media type is JSON.

So you would think we can do the following, too:

:malformed? (fn [ctx]
              (when-let [auth-header (get-in ctx [:request :headers "authorization"])]
                (if (parse-basic-auth auth-header)
                  false
                  ;; Return a map instead of just `true`.
                  {:reason "Couldn't parse authorization header"})))
:handle-malformed (fn [ctx]
                    ;; Provide a JSON blob (or whatever) with a "reason" why the request
                    ;; was malformed.
                    {:reason (:reason ctx)})

Unfortunately, this crashes. Hard.

To see why, we have to look at the decision graph again. If malformed? returns true, Liberator short-circuits execution of media-type-available?. And the decisions table tells us media-type-available? has the side effect of setting [:representation :media-type] in the context. With that value unset, Liberator doesn’t know how to render our map.
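The failure mode is easy to reproduce with a toy renderer in plain Clojure. This is a simplified model of dispatching on the negotiated media type, not Liberator's actual rendering code; render-map and its two methods are invented for the illustration.

```clojure
(ns example.render
  (:require [clojure.string :as str]))

(defmulti render-map
  "Toy renderer: picks an implementation based on the media type that
  an earlier negotiation step stored in the context."
  (fn [_data ctx] (get-in ctx [:representation :media-type])))

(defmethod render-map "application/edn" [data _ctx]
  (pr-str data))

(defmethod render-map "text/plain" [data _ctx]
  (str/join "\n" (for [[k v] data] (str (name k) "=" v))))

;; With the media type present, rendering works:
;; (render-map {:reason "bad header"}
;;             {:representation {:media-type "text/plain"}})

;; If negotiation never ran, the dispatch value is nil, no method
;; matches, and the call throws IllegalArgumentException:
;; (render-map {:reason "bad header"} {})
```

In this model the handler itself is fine; it fails only because a step it silently depends on was skipped.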

It turns out this flaw was already reported on Liberator’s issue tracker. A workaround is to explicitly set the media type:

:malformed? (fn [ctx]
              (when-let [auth-header (get-in ctx [:request :headers "authorization"])]
                (if (parse-basic-auth auth-header)
                  false
                  {:reason "Couldn't parse authorization header"
                   :representation {:media-type "application/json"}})))

By doing so you’re giving up Liberator’s automatic content type negotiation. The client always gets an application/json error, even if they asked for text/html. You also have to add this :representation {:media-type "..."} boilerplate to your allowed?, authorized? and service-available? checks: anything that happens before media-type-available? and wants to return an error representation.

Philipp Meier, Liberator’s main author and maintainer, says he’s working on a fix. I look forward to seeing his plan, though I can’t help feeling this problem follows naturally from the design.

Out of order errors

The state machine only provides one opportunity to throw each type of error. If you don’t respond “503 Service Unavailable” at the very beginning, then you can’t respond 503 later.

Real application errors aren’t that orderly, and 503s are a great example. To edit a blog post, we have to call out to our database server, which could go down at any moment. I alluded to that in my example code above:

;; Return 503 Service Unavailable if DB is down.
:service-available? (fn [_] (is-the-database-up?))

Unfortunately, we have a race condition here. In between the call to service-available? and the call to put!, the database can go down. We really want put! to attempt to connect and throw back a 503 if it fails.

Liberator doesn’t give us tools to do that. I think it may be possible if we write a handle-no-content function (the handler called after a successful PUT) that checks if the database connection failed, and if so, explicitly overrides the status code with a 503. Even if this does work, it’s a hack that throws away some of the benefits of using Liberator in the first place.

Reusability could be better

Many of the decisions and handlers in our resources are likely to be general enough they apply to a whole category of resources, or even our whole API.

For example, errors. We may want all our error responses’ bodies to follow a consistent format: a JSON blob with "success": false and "reason" containing an English-language description of the problem.

;; Assume for simplicity that Liberator has fixed the media-type
;; bug with error response maps.
(defn error-response [ctx]
  {:success false
   :reason (get ctx :reason "Unknown error")})
(defresource foo-resource
  ;; ...
  :handle-malformed error-response
  :handle-unauthorized error-response
  :handle-forbidden error-response)

Authentication is another example. The process by which we parse the “Authorization” header, extract the supplied username and password and check their correctness, is the same for every authenticated resource in the API.

Both situations are just crying out for OO-style mixins.

(def authenticated
  {:authorized? (fn [ctx] ... )})
(def provides-error-responses
  {:handle-malformed error-response
   :handle-unauthorized error-response
   :handle-forbidden error-response})

But defresource doesn’t take a map, so we can’t just merge these in. To use mixins like these, we would need to abandon the defresource sugar and resort to the underlying run-resource function, which is undocumented.
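As a rough illustration of what mixin-style composition would involve, the sketch below merges several maps and splices the result into a function that takes alternating keys and values. make-resource is a hypothetical stand-in that just builds a map; it is not a Liberator function, and the authorized? body is a placeholder.

```clojure
(ns example.mixins)

(defn make-resource
  "Hypothetical stand-in for a resource constructor that accepts
  alternating keys and values, like the body of defresource."
  [& kvs]
  (apply hash-map kvs))

(defn error-response [ctx]
  {:success false
   :reason (get ctx :reason "Unknown error")})

(def provides-error-responses
  {:handle-malformed    error-response
   :handle-unauthorized error-response
   :handle-forbidden    error-response})

(def authenticated
  ;; Placeholder check; a real one would parse and verify credentials.
  {:authorized? (fn [ctx] (boolean (:logged-in-user ctx)))})

(defn resource-from
  "Merges mixin maps left to right (later maps win), then passes the
  merged entries to make-resource as key/value arguments."
  [& mixins]
  (apply make-resource (apply concat (apply merge mixins))))

(def foo-resource
  (resource-from authenticated
                 provides-error-responses
                 {:handle-ok (fn [_] "ok")}))
```

Because later maps win in merge, a resource can still override any single handler that a mixin supplies.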

A recent pull request, not yet merged, proposes to make this better.

No PATCH support

I was disappointed to find the highly useful HTTP PATCH method unsupported. Adding support wouldn’t be a huge task, but its absence seems like a bad sign. I’m not sure how you would design a simple, efficient, semantic API without PATCH.

What I want to write

The way I see it, Liberator suffers from two fundamental flaws. Its imperative style complects computation and lookup of intermediate results with the making of decisions. And its use of a DAG as the sole arbiter of control flow doesn’t properly address all situations that come up when serving HTTP.

To address the first problem, I’d like to write resources in a dataflow style, along the lines of Prismatic’s Graph library.

I’m less sure how to handle the second problem, but I think allowing thrown exceptions for HTTP errors may help. Some errors make sense to handle in a specific place in the DAG — for example, a “406 Not Acceptable” will either occur during content type negotiation or not at all. Other errors, like a 400, 500 or 503, can happen anywhere. For these, exceptions are a better fit.

Just for fun, here’s how that might look:

(defn blog-post-resource [request id]
  {:authorized? (fnk [logged-in-user request]
                  (or logged-in-user
                      (= (:request-method request) :get)))
   :allowed? (fnk [logged-in-user author request]
               (or (= author logged-in-user)
                   (= (:request-method request) :get)))
   :author (fnk [id] (:author (lookup-blog-post id)))
   :logged-in-user (fnk [specified-credentials]
                     (let [[user pass] specified-credentials]
                       (when (password-correct? user pass) user)))
   :specified-credentials (fnk [request]
                            (when-let [auth-header (get-in request [:headers "authorization"])]
                              (if-let [creds (parse-basic-auth auth-header)]
                                creds
                                (throw (http-error 400 "Couldn't parse authorization header")))))
   ...})

(What’s fnk ? Briefly, it’s a function that supplies the value named in its own key, given the values named in its argument list. You can think of all the intermediate values as living in a map. See Graph’s readme for more.)
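To show the dataflow idea without pulling in a library, here is a toy resolver in plain Clojure. It is not Graph's API: Graph infers dependencies from fnk argument names, whereas this sketch spells them out in an invented {:deps [...] :f ...} node shape, and compute is a made-up name.

```clojure
(ns example.dataflow)

(defn compute
  "Resolves key `k` from `graph`, a map of key -> {:deps [...] :f fn},
  starting from the `inputs` map. Each node is computed at most once,
  and only if something asks for it."
  [graph inputs k]
  (let [cache (atom inputs)]
    (letfn [(value-of [k]
              (if (contains? @cache k)
                (get @cache k)
                (let [{:keys [deps f]} (or (get graph k)
                                           (throw (ex-info "Unknown key" {:key k})))
                      v (apply f (mapv value-of deps))]
                  (swap! cache assoc k v)
                  v)))]
      (value-of k))))

(def blog-post-graph
  {:author         {:deps [:id]
                    ;; Stand-in for a database lookup.
                    :f (fn [id] (get {1 "alice"} id))}
   :logged-in-user {:deps [:credentials]
                    ;; Stand-in for a real password check.
                    :f (fn [[user pass]] (when (= pass "s3cret") user))}
   :allowed?       {:deps [:logged-in-user :author :method]
                    :f (fn [user author method]
                         (or (= method :get) (= user author)))}})
```

Asking for :allowed? pulls in :logged-in-user and :author automatically, so the order in which the nodes are written no longer matters.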

Now the decision functions authorized? and allowed? are truly just decision functions. No imperative code has snuck into them. The code is simple.

Also notice we don’t have a malformed? function. Instead, if we detect a malformed value while computing :specified-credentials, we throw a “400 Bad Request” error on the spot, right where we try to read the potentially malformed value.
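The http-error call in the sketch is hypothetical. One plausible way to back it, assuming a Ring-style handler that is just a function from request map to response map, is an ex-info carrying the status plus a small middleware that turns it into a response. Both names below, and the :http-status key, are assumptions made for this illustration.

```clojure
(ns example.http-errors)

(defn http-error
  "Builds an exception that carries an HTTP status code."
  [status message]
  (ex-info message {:http-status status}))

(defn wrap-http-errors
  "Ring-style middleware: converts exceptions created by http-error
  into response maps and rethrows everything else."
  [handler]
  (fn [request]
    (try
      (handler request)
      (catch clojure.lang.ExceptionInfo e
        (if-let [status (:http-status (ex-data e))]
          {:status status
           :headers {"Content-Type" "text/plain"}
           :body (.getMessage e)}
          (throw e))))))

(def app
  (wrap-http-errors
   (fn [request]
     ;; Stand-in for a PUT that discovers mid-request that the DB is gone.
     (if (:db-up? request)
       {:status 204 :headers {} :body nil}
       (throw (http-error 503 "Database unavailable"))))))
```

With this shape, a failure inside the write itself can surface as a 503 at the moment it happens, rather than only at a fixed point early in the request.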

This is obviously just a sketch. I haven’t written a library that works this way, and I’m sure there would be challenges in doing so. But this is how I would someday like to write the Liberator resource that began this post.