12 Aug 2013

Lately, I’ve been having a lot of fun using Datomic on a couple of projects. One of them is something we’re calling torch, a source code analysis and visualisation tool built on top of codeq. It’s mostly an experiment which will hopefully help us understand and visualise some aspects of our entire Clojure code-base at uSwitch. If you happen to be attending FP Days in Cambridge, you might get to see more of torch.

torch is currently a single-page browser application built in ClojureScript, and in this post, I’ll show how we leverage the power of Datomic from the browser using core.async and the Datomic REST API.

Going Hypertext

A big part of torch is just querying Datomic for facts, possibly deriving some new information, in order to then visualise it.

Our initial prototype was a Java application but, for a couple of reasons which are not important for this post, we have since moved to a browser-based ClojureScript app.

One thing that was obviously really easy in the Java application was talking to Datomic: we could just use the standard Clojure API. Querying Datomic is most commonly done using its dialect of Datalog, and the process of connecting and performing a query looks something like this:

    ;; create a connection to the codeq database on the Datomic
    ;; system running on localhost:4334
    (def conn (datomic.api/connect "datomic:free://localhost:4334/codeq"))

    ;; capture the most recent value of the database known by this peer
    (def db (datomic.api/db conn))

    ;; perform a query
    (let [res (datomic.api/q '[:find ?e ?v :where [?e :db/doc ?v]] db)]
      ...)

The query finds all entities with a :db/doc attribute, and returns the id of each entity and the documentation string.

It does this by calling the q function in the datomic.api namespace, passing in a query expressed using a data structure, and the database value that should be used as the basis for the query.

So, moving to a browser-based app, how do we query Datomic?

Datomic REST API

Fortunately, Datomic provides a REST API out of the box. If you have a Datomic free system running on localhost:4334, you can start an HTTP server exposing a REST interface to that system on port 9000 like this:

> bin/rest -o "*" -p 9000 free datomic:free://localhost:4334/

The -o option specifies the allowed origins; we’re being overly liberal in this example. With this process running, we can now fire off queries over good old HTTP.

The query resource, exposed at /api/query, requires two parameters:

q: the query

args: a vector of arguments to the query

Each argument should be the URL-encoded EDN representation of its data.

args is a little bit special in that it always needs to include at least one element, {:db/alias "your-storage/your-db"} , identifying which database to query. In our example, it would be {:db/alias "free/codeq"} .
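To make the encoding step less opaque, here is a minimal sketch you can paste into a JVM Clojure REPL. It prints both parameters as EDN and URL-encodes them with java.net.URLEncoder. The helper name url-encode-edn is hypothetical and only used for illustration, and the query is written with an explicit :in $ clause so that it lines up with the single database entry in args.

```clojure
(import 'java.net.URLEncoder)

(defn url-encode-edn
  "Hypothetical helper: print `data` as EDN, then URL-encode the string."
  [data]
  (URLEncoder/encode (pr-str data) "UTF-8"))

;; The query, with an explicit :in clause naming the database input.
(def query '[:find ?e ?v :in $ :where [?e :db/doc ?v]])

;; The args vector: here only the mandatory database alias map.
(def args [{:db/alias "free/codeq"}])

(println (str "q=" (url-encode-edn query)))
;; prints q=%5B%3Afind+%3Fe+%3Fv+%3Ain+%24+%3Awhere+%5B%3Fe+%3Adb%2Fdoc+%3Fv%5D%5D

(println (str "args=" (url-encode-edn args)))
;; prints args=%5B%7B%3Adb%2Falias+%22free%2Fcodeq%22%7D%5D
```

Joining the two with & after /api/query? gives the query string that the HTTP request needs.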

Compared to 99.999% of all HTTP APIs on the web, Datomic also makes the very unorthodox choice of using a sane format for data interchange, EDN. This means we need to indicate that we accept content of type application/edn .

So how does the same query look using this approach?

    curl -H "Accept: application/edn" "http://localhost:9000/api/query?q=%5B%3Afind+%3Fe+%3Fv+%3Ain+%24+%3Awhere+%5B%3Fe+%3Adb%2Fdoc+%3Fv%5D%5D&args=%5B%7B%3Adb%2Falias+%22free%2Fcodeq%22%7D%5D&offset=&limit="

A slight step back from the elegance of the Java/Clojure API, but the ability is there. So how can we use this in ClojureScript?

A Diversion on HTTP

The Google Closure Library provides a few ways to do HTTP requests, the most basic one being the static method goog.net.XhrIo/send which provides an interface very similar to most AJAX-capable JavaScript libraries. We can use this method for writing a very simple function that can GET some EDN from a given URL:

    (defn get-edn [url callback]
      ;; args to send: url, completion callback, HTTP verb, body, headers
      (goog.net.XhrIo/send url
                           (fn [e] (callback (edn-response e)))
                           "GET"
                           nil
                           (clj->js {"Accept" "application/edn"})))

This should seem fairly familiar; we pass in a URL, a callback, the HTTP verb, maybe a body ( nil in this case) and some headers.

Before calling the user-supplied callback, we apply edn-response to the result of the operation to save the caller from having to mess with the XhrIo result object. So, edn-response is a function that receives the result of an XhrIo/send, checks if it was successful and returns either a vector of :ok and the EDN data in the body, or :error and the status code.

    (defn edn-response [e]
      (let [res (.-target e)]
        (if (.isSuccess res)
          [:ok (cljs.reader/read-string (.getResponseText res))]
          [:error (.getStatus res)])))

As a consumer of get-edn , you would use it like this:

    (get-edn "http://localhost:9000/api/query?q=..."
             (fn [[status value]]
               (.log js/console (str "Got status " status " and value " value))))

Nothing revolutionary here. However, there is a fundamental difference from the in-process Java version. The fact that we have to supply a callback that will be invoked when the operation has completed means that we give up some of our control over the flow of the program. This is unfortunate, but the asynchronous, callback-based model is simply something we have to live with when doing I/O in JavaScript.

Or is it?

Turning the World Inside Out With core.async

The core.async library is another piece of very inspiring work, and in this particular case we can use it to invert the control of asynchronous HTTP calls.

We can modify our get-edn function to not take a callback, but instead return a channel on which the result of the operation will be put when the operation completes. For this example, you can think of a channel as a queue with room for exactly one element. This could look something like this:

    (defn get-edn [url]
      (let [c (chan)]
        (goog.net.XhrIo/send url
                             (fn [e] (put! c (edn-response e)))
                             "GET"
                             nil
                             (clj->js {"Accept" "application/edn"}))
        ;; return the channel immediately; the result arrives on it later
        c))

This time, we create a channel using chan , fire off the request and return the channel immediately. When the request has completed, the callback function will use edn-response as before and then simply put! the resulting value on the channel.

We can use our channel based get-edn like so:

    (go
      (let [[status value] (<! (get-edn "http://localhost:9000/api/query?q=..."))]
        (.log js/console (str "Got status " status " and value " value))))

This short snippet contains a lot of core.async concepts but one way of understanding it is as follows:

We call get-edn, which returns a channel on which the result of the request will eventually be put.

We use <! to read a value from the channel. If no value is available yet, execution will be parked at this point, and resumed when a value becomes available.

This parking is made possible by the go macro, which wraps our code.

This is really only scratching the surface of what’s possible with core.async .

The important bit is that we’ve regained control of the program flow: we’re no longer forced to hand over a callback that will be used to resume our execution in some other place.
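One way to see what regaining control means is to issue two requests in sequence, where the second only starts after the first has delivered its result. The sketch below runs on the JVM with core.async on the classpath, so it can be tried at a REPL without a browser. fake-get-edn is a hypothetical stand-in for the channel-returning get-edn: it makes no HTTP call and simply delivers a canned [:ok ...] vector after a short delay.

```clojure
(require '[clojure.core.async :refer [chan go put! timeout <! <!!]])

(defn fake-get-edn
  "Hypothetical stand-in for get-edn: returns a channel that receives
   [:ok {:requested url}] roughly 10 ms later. No HTTP is involved."
  [url]
  (let [c (chan)]
    (go
      (<! (timeout 10))                  ; simulate network latency
      (put! c [:ok {:requested url}]))
    c))

(def result
  (go
    ;; Each <! parks the go block until a value is available, so the
    ;; second request is only issued once the first has completed.
    (let [[status-1 first-value]  (<! (fake-get-edn "/first"))
          [status-2 second-value] (<! (fake-get-edn "/second"))]
      [status-1 status-2 (:requested first-value) (:requested second-value)])))

;; <!! blocks the calling thread and exists only on the JVM; it is used
;; here purely to pull the final value out at the REPL.
(def outcome (<!! result))
(println outcome)
```

The code reads top to bottom like ordinary synchronous code, with no nested callbacks. In ClojureScript the same go block should work with the real get-edn, except that there is no <!!, so the result has to be consumed inside a go block.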

Tying the Knot

We now have a nice way of doing HTTP requests for getting EDN, which we can use to talk to the Datomic REST API.

As we’ve seen, the REST API requires two parameters, q the query, and args , any args to the query. Both are expressed using Clojure data structures, but when transmitted they should be the URL encoded EDN representations of this data. Unsurprisingly, the Closure Library knows how to URL encode strings, and ClojureScript how to generate EDN from Clojure data. We combine this knowledge in the encode function:

    (defn encode [data]
      (goog.string/urlEncode (pr-str data)))

To perform queries, we need to know the URL of the REST API, which storage it uses, and the database we wish to query. We introduce a connect function that, given those parameters, returns a map with all the information we need to perform multiple queries.

    (defn connect [rest-api-url storage db-name]
      {:url rest-api-url
       :db/alias (str storage "/" db-name)})

Calling it connect is a bit of a misnomer given what it currently does. I’ll discuss this in the next section.

We can now implement a q function with an interface that is superficially quite similar to what we have in the Clojure/Java API:

    (defn q [query conn & args]
      ;; the database alias map always comes first in the args vector
      (let [args (into [(select-keys conn [:db/alias])] args)]
        (get-edn (str (:url conn) "/api/query"
                      "?q=" (encode query)
                      "&args=" (encode args)))))

Finally, we can perform the same query as above from ClojureScript:

    (def conn (connect "http://localhost:9000" "free" "codeq"))

    ;; perform a query; <! has to be used inside a go block
    (go
      (let [[status res] (<! (q '[:find ?e ?v :where [?e :db/doc ?v]] conn))]
        ...))
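The example above does not exercise the variadic & args of q. As a small sketch of what happens when extra arguments are passed, the snippet below reproduces only the args-building step as a pure function. build-args is a hypothetical name, and conn is written out as the literal map that connect would return. The assumption, which matches my understanding of the REST API, is that the elements of the args vector are bound in order to the inputs named in the query's :in clause.

```clojure
;; The map connect returns for the running example, written out literally.
(def conn {:url "http://localhost:9000" :db/alias "free/codeq"})

(defn build-args
  "Hypothetical helper mirroring the first step of q: the database alias
   map goes first, followed by any extra query arguments."
  [conn & args]
  (into [(select-keys conn [:db/alias])] args))

;; A parameterised query: ?attr is supplied as the second input.
(def query '[:find ?e ?v :in $ ?attr :where [?e ?attr ?v]])

(println (pr-str (build-args conn :db/doc)))
;; prints [{:db/alias "free/codeq"} :db/doc]
```

With q as defined above, the corresponding call would be (q query conn :db/doc), issued from inside a go block.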

The Database as a Value (?)

It should be noted that the interface of the q function is actually quite different from the native API. We perform the query directly on a connection, rather than a database value. This means we lose the stable basis, which is a big deal. Every query will execute on whatever the latest value of the database is when the query is received.

However, the Datomic REST API supports specifying the value of the database you wish the query to be performed against, and it is also possible to query for the current version. This means that it is possible to implement the notion of a database value even in the browser. In the interest of keeping this post to a reasonable length, I’ll stop here.

There are many things that are completely awesome about Datomic, and you should give it a try and find out what they are.

See the Language of the System talk for a very good motivation for this.