Buckling Down

When working in a startup environment, time to production is crucial. Whether putting together some code to prove that an idea is feasible or releasing a Minimum Viable Product for demonstration purposes, developers often have to choose the quickest path to working code. Node.js is a great tool for such fast-paced work and we’ve used it a lot. Conveniently, it is also one of the runtimes available on the Lambda platform.

However, as flexible as JavaScript is, it is also quite difficult to maintain in the long run. Its inherent lack of structure makes it very easy to quickly patch together a small demonstration project but, later on, creates headaches when updating or refactoring parts of the code, or when asking new developers to work on code they’ve never seen before.

In order to make JavaScript more suitable for large-scale industrial applications, the BuckleScript compiler has recently emerged, originally supported by Bloomberg. BuckleScript interfaces native JavaScript code with OCaml. You start by defining bindings from JavaScript to OCaml, then write OCaml code using those bindings. Finally, the BuckleScript compiler generates JavaScript code from your OCaml code, which you can deploy as native JavaScript. The benefit of this loop, known as transpiling, is that you get to use all the fabulous tools that the OCaml language provides in your JavaScript application.

OCaml’s Toolbox

Developed initially at France’s INRIA research center, OCaml is a functional language with a statically inferred type system, an efficient garbage collector and a pragmatic approach to the ML language paradigm. Like Haskell, it provides the usual tools of functional languages, but it also allows imperative constructs whenever appropriate, for instance for I/O programming. It has recently enjoyed a surge in industrial applications, from financial trading to Facebook and, more recently, Uber. Its ecosystem of development tools has greatly improved in the past years, in particular with the addition of a solid package manager, OPAM. All in all, even though it is still a niche language, this constitutes a pretty successful transfer from academia to the software industry.

One of the most important features of OCaml is its static type system. In OCaml, every variable has a type that is determined at compile time and, unless your code is particularly tricky, you do not have to write those types explicitly: the compiler infers them. This means that, if your code compiles, you are guaranteed that your functions will always be called with arguments of a specific, fixed type.

In the JavaScript world, this means that you no longer have to deal with variables unexpectedly being null or receiving a string when you expected an int. Furthermore, when refactoring your code, any change in a function’s expected type trickles down via the compiler into each and every call to that function. The code will not compile until you have fixed all the calls to that function, making refactoring a safe and truly enjoyable experience, which is very important when dealing with large-scale applications and developer teams.
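To make this a bit more concrete, here is a minimal sketch in plain OCaml. The names greeting, find_user and describe are made up for the illustration and are not part of any library.

```ocaml
(* The compiler infers string -> int -> string without any annotation. *)
let greeting name age =
  name ^ " is " ^ string_of_int age

(* A value that may be missing has to say so in its type: string option.
   find_user is a hypothetical lookup used only for illustration. *)
let find_user id =
  if id = 1 then Some "alice" else None

(* Both cases have to be handled; leaving out None makes the compiler
   warn about a non-exhaustive pattern match. *)
let describe id =
  match find_user id with
  | Some name -> greeting name 30
  | None -> "unknown user"

(* greeting "alice" "30" is rejected at compile time:
   the second argument is a string where an int is expected. *)
```

If the signature of greeting changes later, every call site such as the one in describe stops compiling until it is updated.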

In the following, I’ll show some OCaml features that can be very useful when writing code for BuckleScript/JavaScript. If you are not familiar with the language, you might want to start by taking a peek at Real World OCaml, the reference book for OCaml programming.

Lambda handler’s type

In Lambda, a function is essentially a callback handler. It receives three arguments: a JSON object describing the function’s parameters, a context object, and a callback to be executed when the function has finished its execution. In OCaml/BuckleScript, this can be described with the following types:

type error = exn Js.Nullable.t
type 'a callback = error -> 'a -> unit
type context
type ('a, 'b) lambda_function = 'a -> context -> 'b callback -> unit

A callback is a function that takes two arguments, an error, which can be either null or an exception, and a result of any type 'a, and returns nothing. This is the usual callback paradigm in asynchronous JavaScript code where, if the error argument is null, the function is assumed to have executed successfully and possibly returned a value as its second argument.

Finally, a Lambda function takes an input parameter of any type 'a, a context variable and a callback that receives a value of any type 'b. This function returns nothing or, if you prefer, its returned value is ignored.
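Here is a small sketch of a handler written against these types. So that it runs under stock OCaml, I assume two stand-ins: exn option replaces exn Js.Nullable.t, and context is simply unit. The hello handler is hypothetical.

```ocaml
(* Stand-ins so the sketch runs without BuckleScript:
   exn option replaces exn Js.Nullable.t, and context is just unit. *)
type error = exn option
type 'a callback = error -> 'a -> unit
type context = unit
type ('a, 'b) lambda_function = 'a -> context -> 'b callback -> unit

(* A hypothetical handler: receives a name, calls back with a greeting. *)
let hello : (string, string) lambda_function =
  fun name _context callback ->
    callback None ("Hello, " ^ name)

(* Call the handler by hand, roughly the way the runtime would. *)
let last_result = ref ""
let () = hello "world" () (fun _err result -> last_result := result)
```

The annotation on hello is what ties the handler to the Lambda calling convention: passing anything other than a string to the callback would be a type error.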

Now, anticipating what we’ll see later, it turns out that, when interfaced with AWS’s API Gateway through serverless, the value returned by an API handler should be a JavaScript object of this form:

{statusCode: <int>, body: <string>}

In this case, we can amend our type declaration above and add:

type api_response = <
  statusCode: int;
  body: string
> Js.t

type 'a api_handler = ('a, api_response) lambda_function

Now, each time we declare a Lambda function to be of type api_handler, we are guaranteed that this function passes the appropriate type of JavaScript object to its callback.
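As a rough sketch of that guarantee under stock OCaml, the block below uses a plain record as a stand-in for the Js.t object type, exn option in place of exn Js.Nullable.t, and unit for context. The ping handler is hypothetical.

```ocaml
(* Stand-ins so the sketch runs without BuckleScript. *)
type error = exn option
type 'a callback = error -> 'a -> unit
type context = unit
type ('a, 'b) lambda_function = 'a -> context -> 'b callback -> unit

(* A record stands in for the < statusCode: int; body: string > Js.t object. *)
type api_response = { statusCode : int; body : string }
type 'a api_handler = ('a, api_response) lambda_function

(* A hypothetical API handler that ignores its event. *)
let ping : string api_handler =
  fun _event _context callback ->
    callback None { statusCode = 200; body = "pong" }

(* Replacing the record with a bare string, as in callback None "pong",
   would not compile: the callback expects an api_response. *)
```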

The Asynchronous Monad

Another typical hassle in the JavaScript/Node.js world is that one constantly has to deal with callback-based computations. Usually, it looks like this:

function asynchronousProcessing(callback) {
  doSomeAsynchronousWork(function (err, result) {
    if (err) { return callback(err); }
    doMoreAsynchronousWork(result, function (err, newResult) {
      if (err) { return callback(err); }
      // etc etc..
    });
  });
}

Having to repeat that pattern constantly is error-prone. In order to make this less painful, APIs such as Promise have been introduced. However, none of those really solves the problem of having to repeat the same pattern over and over again. This is where the notion of a monad comes in!

“Monad” is quite a hip term with deep connections to category theory but, really, it is just a fancy way to describe a computing abstraction. Essentially, a monad describes a certain type of computation and how to combine such computations together. Let’s look at an example applied to our case:

(* An asynchronous computation takes a callback and executes it when it
   has finished, either with an error or with a result of type 'a. *)
type error = exn Js.Nullable.t
type 'a callback = error -> 'a -> unit
type 'a t = 'a callback -> unit

(* A computation that returns a result of type 'a. *)
val return : 'a -> 'a t

(* A computation that returns an error. *)
val fail : exn -> 'a t

(* Combine two computations. *)
val (>>) : 'a t -> ('a -> 'b t) -> 'b t

Now, let’s look at the implementation of this monad:

let return result = fun callback ->
  callback Js.Nullable.null result

let fail error = fun callback ->
  callback (Js.Nullable.return error) (Obj.magic Js.Nullable.null)

let (>>) current next = fun callback ->
  current (fun err result ->
    match Js.toOption err with
      | Some exn ->
          fail exn callback
      | None ->
          next result callback)

The return function takes a result and builds a computation that executes its callback with null for the error parameter and the given result. The fail function takes an error and builds a computation that executes its callback with the given error and a null result. We use a little trickery with Obj.magic here so that null can be passed as a result of any type.

Finally, (>>) implements the usual pattern to combine asynchronous computations. Given a current computation, the next one and a callback, it executes the current computation with a new callback. If that new callback is passed a non-null error, it returns immediately by executing the original callback with that error. Otherwise, it passes result to the next computation, along with the original callback.

Now let’s see how the JavaScript code above can be rewritten using this asynchronous monad:

(* Define the asynchronous processing pipeline *)
let asynchronous_computation =
  do_some_asynchronous_work >> fun result ->
  do_more_asynchronous_work result >> fun (..) ->
  (etc etc..)

(* Execute it! *)
let () = asynchronous_computation callback

No more repeated patterns, no more potential errors!
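For readers who want to try the monad without a BuckleScript setup, here is a self-contained sketch that runs under stock OCaml. It assumes exn option as a stand-in for exn Js.Nullable.t, and the two steps, parse and divide_100_by, as well as the run helper, are hypothetical examples rather than anything from a real code base.

```ocaml
(* Stock-OCaml stand-in: exn option replaces exn Js.Nullable.t. *)
type error = exn option
type 'a callback = error -> 'a -> unit
type 'a t = 'a callback -> unit

let return result : 'a t = fun callback -> callback None result

(* Obj.magic () fills the unused result slot, mirroring the trick above. *)
let fail exn : 'a t = fun callback -> callback (Some exn) (Obj.magic ())

let ( >> ) (current : 'a t) (next : 'a -> 'b t) : 'b t = fun callback ->
  current (fun err result ->
    match err with
    | Some exn -> fail exn callback
    | None -> next result callback)

(* Two hypothetical steps: parse an integer, then divide 100 by it. *)
let parse s =
  match int_of_string_opt s with
  | Some n -> return n
  | None -> fail (Invalid_argument ("not an integer: " ^ s))

let divide_100_by n =
  if n = 0 then fail Division_by_zero else return (100 / n)

let pipeline input = parse input >> fun n -> divide_100_by n

(* Run the pipeline and capture what the final callback received. *)
let run input =
  (* Placeholder value, overwritten as soon as the callback fires. *)
  let outcome = ref (Error Not_found) in
  pipeline input (fun err result ->
    outcome := (match err with
      | None -> Ok result
      | Some exn -> Error exn));
  !outcome
```

With these definitions, run "4" ends with a successful result, while run "0" and run "abc" each stop at the first failing step and hand its exception straight to the final callback.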

Polymorphic variants and phantom types

Finally, let’s look at a trickier and fancier application of the OCaml type system using phantom types and polymorphic variants.

Phantom types are parametric types that are used to constrain sub-classes of a generic type for various API restriction purposes. This is a very powerful tool. For instance, it makes it possible to force an API user to follow a pre-defined flow in their calls, such as calling init followed by config and finally run. Here, we are going to use them to annotate the different events that the user can listen to on Node.js’s readable and writable streams.
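The init, config and run flow mentioned above can be sketched as follows. The Job module, its fields and the created and configured tags are all hypothetical names chosen for the illustration.

```ocaml
module Job : sig
  type 'state t

  (* Phantom tags: they only ever appear as type parameters. *)
  type created
  type configured

  val init : string -> created t
  val config : retries:int -> created t -> configured t
  val run : configured t -> string
end = struct
  (* The 'state parameter is not used in the representation at all. *)
  type 'state t = { name : string; retries : int }
  type created
  type configured

  let init name = { name; retries = 0 }
  let config ~retries job = { name = job.name; retries }
  let run job = Printf.sprintf "%s (retries: %d)" job.name job.retries
end

(* The only order of calls that type-checks. *)
let report = Job.(init "nightly-build" |> config ~retries:3 |> run)

(* Job.run (Job.init "nightly-build") does not compile:
   a created t is not a configured t. *)
```

Since the tag exists only in the type, the record built by config is the same at runtime as the one built by init; the restriction costs nothing once the code is compiled.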

To describe those events, we use polymorphic variants, another powerful feature of the language which can be quite handy but sometimes requires a bit of work to understand and use properly. Polymorphic variants are collections of labels that can be sub- or super-classed. Let’s look at our code to make this concrete. You might also want to check out the official Node.js Stream API documentation.

type 'a t

type write_events = [
  | `Close of unit -> unit
  | `Drain of unit -> unit
  | `Finish of unit -> unit
  | `Error of exn -> unit
  | `Pipe of readable -> unit
  | `Unpipe of readable -> unit
] and read_events = [
  | `Close of unit -> unit
  | `End of unit -> unit
  | `Readable of unit -> unit
  | `Data of string -> unit
  | `Error of exn -> unit
] and writable = write_events t
and readable = read_events t

type events = [write_events | read_events]

val on : ([< events] as 'a) t -> 'a -> unit

In this API, we define the list of events that readable and writable streams can receive. Each of these events is associated with a handler to process the type of data that will be received when the event occurs.

Then we sub-class the generic stream type 'a t by annotating it with the events associated with each type of stream, readable and writable. Finally, the on function is typed so that it only accepts events that are annotated in the stream type, making sure, for instance, that only a readable stream can be used to listen to the `Data event. Otherwise, the compiler will complain with a message such as:

Stream.on writable (`Data (fun s -> ...))
> Error: This expression has type [> `Data of string -> unit ]
> but an expression was expected of type Stream.write_events
> The second variant type does not allow tag(s) `Data

Now, let’s peek under the hood to see how this is implemented:

external on : 'a t -> string -> ('b -> unit) -> unit = "" [@@bs.send]

let on stream = function
  | `Close fn -> on stream "close" fn
  | `Drain fn -> on stream "drain" fn
  | `Finish fn -> on stream "finish" fn
  | `Error fn -> on stream "error" fn
  | `Pipe fn -> on stream "pipe" fn
  | `Unpipe fn -> on stream "unpipe" fn
  | `End fn -> on stream "end" fn
  | `Readable fn -> on stream "readable" fn
  | `Data fn -> on stream "data" fn

Pretty simple after all! First, we declare an external on method on stream objects. It can receive any type of stream, a string-typed event name and a callback receiving a value of a yet unknown type. This is the low-level binding to Node.js’s EventEmitter API.

Then, we decorate this function by re-defining it as a function that receives a polymorphic event type and handler and explicitly unwraps it down to the expected parameters on the Node.js side. Voila!¹
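To experiment with this typing trick outside of BuckleScript, here is a reduced sketch that runs under stock OCaml. It is not the real binding: the external declaration is replaced by a hypothetical raw_on stub that merely records event names, only four events are kept, and make_readable, make_writable and registered are helpers invented for the example.

```ocaml
module Stream : sig
  type 'a t
  type read_events = [ `Data of string -> unit | `End of unit -> unit ]
  type write_events = [ `Drain of unit -> unit | `Finish of unit -> unit ]
  type events = [ read_events | write_events ]
  type readable = read_events t
  type writable = write_events t

  val make_readable : unit -> readable
  val make_writable : unit -> writable
  val on : ([< events ] as 'a) t -> 'a -> unit
  val registered : 'a t -> string list
end = struct
  (* Stand-in for a Node.js stream: just a log of subscribed event names. *)
  type 'a t = { mutable names : string list }
  type read_events = [ `Data of string -> unit | `End of unit -> unit ]
  type write_events = [ `Drain of unit -> unit | `Finish of unit -> unit ]
  type events = [ read_events | write_events ]
  type readable = read_events t
  type writable = write_events t

  let make_readable () = { names = [] }
  let make_writable () = { names = [] }

  (* Stub replacing the external binding: it ignores the handler. *)
  let raw_on stream name _handler = stream.names <- name :: stream.names

  let on stream = function
    | `Data fn -> raw_on stream "data" fn
    | `End fn -> raw_on stream "end" fn
    | `Drain fn -> raw_on stream "drain" fn
    | `Finish fn -> raw_on stream "finish" fn

  let registered stream = List.rev stream.names
end

let r = Stream.make_readable ()
let () = Stream.on r (`Data (fun _chunk -> ()))
let () = Stream.on r (`End (fun () -> ()))

(* Stream.on r (`Drain (fun () -> ())) is rejected by the compiler:
   read_events does not allow the tag `Drain. *)
```

The wrapper itself is untyped in spirit, just like the external; it is the signature of on that restricts which events each kind of stream accepts.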