Those of you who use Pusher will know that the APIs we expose are small and straightforward to use. But this hides the complexity of what is going on behind the scenes. The sheer volume of messages (around 5 billion messages a day and approaching 1.25 trillion since we started counting) has necessitated a large, distributed system, which in turn leads to many potential points of failure. Because of this we have a number of black-box, or integration, tests, which check that the external features of our API work as documented.

With the introduction of upcoming features to the APIs, we were planning on adding new integration tests. The problem was that the existing Ruby tests suffered from duplication as well as being heavily callback driven – making them hard to follow. Here is an example of what one of these tests looked like.

We re-wrote the entire integration test harness in Haskell to enable tests to be written in a terse, declarative, linear style. Haskell’s emphasis on composability and good support for concurrency made it a great fit for this task. We are also embracing Haskell in other areas of Pusher and saw this as a good opportunity to get our feet wet.

Drawbacks of the existing tests

Logical duplication in the tests had led to duplication in the code

The reason for the duplication comes from the nature of what we are testing. In order to test something, say checking a WebHook is sent when a user goes offline and is automatically unsubscribed from a presence channel, there are multiple common steps that must be performed:

1. Connect to WebSockets
2. Assert connection to Pusher established
3. Subscribe to a channel
4. Assert subscription succeeded
5. Disconnect from channel
6. Assert WebHook is received

However, there are other integration tests that also require some of the same steps to be run. For example, testing that a WebSocket message is received – such as the pusher:subscription_succeeded event – also requires all but the final step from the above list to be performed. We wanted to factor these common components out into functions that could be composed together to build a test.
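As a minimal sketch of that factoring, the first five steps from the list above can be written once and reused by both kinds of test. Everything here is a hypothetical stand-in rather than the real framework code: `step` only records a name in a log, and `commonSteps`, `wsMessageTest`, `webhookTest` and `runSteps` are illustrative names.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in: a "step" just appends its name to a shared log.
type Log = IORef [String]

step :: Log -> String -> IO ()
step logRef name = modifyIORef logRef (++ [name])

-- The steps shared by both tests, written once.
commonSteps :: Log -> IO ()
commonSteps l =
     step l "connect to WebSockets"
  >> step l "assert connection established"
  >> step l "subscribe to a channel"
  >> step l "assert subscription succeeded"
  >> step l "disconnect from channel"

-- The WebSocket message test needs only the shared steps.
wsMessageTest :: Log -> IO ()
wsMessageTest = commonSteps

-- The WebHook test is the shared steps plus one final assertion.
webhookTest :: Log -> IO ()
webhookTest l = commonSteps l >> step l "assert WebHook is received"

-- Run a test against a fresh log and return the steps it performed.
runSteps :: (Log -> IO ()) -> IO [String]
runSteps t = do
  l <- newIORef []
  t l
  readIORef l

main :: IO ()
main = runSteps webhookTest >>= mapM_ putStrLn
```

The point of the sketch is only that the shared prefix lives in one place, so a change to it is picked up by every test built on top of it.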

Asynchronous code is hard to follow

Another change we wanted to make was to move away from the heavily asynchronous, event-based structure of the existing tests. WebSocket messages and WebHooks were both received via callbacks, so it was difficult to understand a test simply by reading it from top to bottom. We wanted to move away from this structure, but we also had to deal with the fact that WebHooks could arrive concurrently with other tests. The example I referred to earlier demonstrates the complexity.

What our solution looks like

The composability of the test components turned out to be a major win, and will make additional tests much easier to write in the future. Before jumping into the details, let’s take a look at what they look like. Hopefully what they are testing should be intuitively clear.

Here is an example of two of the tests that check the HTTP API is working:

    message = test
      (genSingletonChanList "")
      (  subscribe
      >> assertSubSucceeded
      >> startRoundtripTimer
      >> sendAPIMsg
      >> assertRecievedWSMessage
      >> stopRoundtripTimer
      )

    channelExistence = test
      (genSingletonChanList "")
      (  subscribe
      >> assertSubSucceeded
      >> startRoundtripTimer
      >> assertChannelsExist ""
      >> stopRoundtripTimer
      )

And here are a couple of more complicated tests that check WebHooks are correctly sent when a channel is vacated:

    webhookChanVacated = test
      (genSingletonChanList "")
      (  subscribe
      >> assertSubSucceeded
      >> startRoundtripTimer
      >> assertRecievedWebHook "channel_occupied"
      >> stopRoundtripTimer
      >> unsubscribe
      >> assertRecievedWebHook "channel_vacated"
      )

    webhookMemberRemoved = test
      (genSingletonChanList "presence-")
      (  subscribeWithAuth
      >> startRoundtripTimer
      >> assertRecievedWebHook "member_added"
      >> stopRoundtripTimer
      >> unsubscribe
      >> assertRecievedWebHook "member_removed"
      )

test is a function that is part of our integration framework. It takes a function to set up the dependencies of the test, and a function that actually runs the test (handling WebSocket connections and disconnections).
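To make the shape of `test` a little more concrete, here is a simplified sketch of how such a function could be written. It is an assumption for illustration, not our actual implementation: the real `test` passes its environment implicitly, whereas this version takes an explicit event log and hands the channel list to the body. The `Events` type, `record`, `runExample` and the fixed channel name are all hypothetical.

```haskell
import Control.Exception (bracket_)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical event log so the example can show what happened.
type Events = IORef [String]

record :: Events -> String -> IO ()
record ref e = modifyIORef ref (++ [e])

-- Hypothetical setup function: one channel name with the given prefix.
genSingletonChanList :: String -> IO [String]
genSingletonChanList prefix = pure [prefix ++ "example-channel"]

-- Simplified `test`: run the setup, then run the body between a
-- connect and a disconnect. bracket_ runs the disconnect even if
-- the body throws an exception.
test :: Events -> IO [String] -> ([String] -> IO a) -> IO a
test events setup body = do
  channels <- setup
  bracket_ (record events "connect") (record events "disconnect") (body channels)

runExample :: IO [String]
runExample = do
  events <- newIORef []
  test events (genSingletonChanList "presence-") $ \channels ->
    mapM_ (\c -> record events ("subscribe " ++ c)) channels
  readIORef events

main :: IO ()
main = runExample >>= mapM_ putStrLn
```

Using `bracket_` here means a failing assertion in the body cannot leave a connection open for the next test.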

This should give an idea of how the same components can be plugged into a test wherever they are needed.

Declarative, composable and terse

We are very satisfied with the simplicity and intuitive appearance of these test definitions, despite all the complicated things going on behind the scenes! This is a significant improvement over the Ruby tests, both in terms of the small size and also the ease with which they can be extended.

How Haskell lets us do this (monad warning)

While cringing at the cliche: monads were the key ingredient. If you haven’t come across monads before, don’t worry, you can think of the monad we are using as “code that does IO”. Each of the components that makes up a test is just a function that does IO, and therefore results in a monadic value. Haskell provides operators (>> in the above examples) for chaining these monadic functions together into new functions. This allows us to build our test components up into the tests themselves (which are really just test components too!).

For example, like all test components, sendAPIMsg and receiveWSMessage are both actions in the IO monad. >> is a function which takes two actions in the same monad, in this case IO, and returns a new IO action that runs the first and then the second. Thus we have just constructed a new test component, which we could bind to a name like this:

    let sendAndReceiveAPIMsg = sendAPIMsg >> receiveWSMessage
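Written out as a small standalone program, the same idea might look like the sketch below. The two components are hypothetical stand-ins that only print a line, and `repeatComponent` is an illustrative helper, included to show that the composite can be passed anywhere a single component is accepted.

```haskell
import Control.Monad (replicateM_)

-- Hypothetical stand-ins for the real components.
sendAPIMsg :: IO ()
sendAPIMsg = putStrLn "send API message"

receiveWSMessage :: IO ()
receiveWSMessage = putStrLn "receive WebSocket message"

-- (>>) :: Monad m => m a -> m b -> m b
-- Both sides are IO actions, so the result is an IO action too.
sendAndReceiveAPIMsg :: IO ()
sendAndReceiveAPIMsg = sendAPIMsg >> receiveWSMessage

-- Accepts any component, whether it is a single step or a composite.
repeatComponent :: Int -> IO () -> IO ()
repeatComponent n component = replicateM_ n component

main :: IO ()
main = repeatComponent 2 sendAndReceiveAPIMsg
```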

This simple case only scratches the surface of what we were able to do with monads. Using more advanced language features we were able to extend the standard IO monad with an implicit test environment, logger and error handling. Definitely an interesting topic for future discussion.

But I can do the same in <my language>!

Yes, you can. You can chain together a series of functions in almost any language. The nice thing about our Haskell solution is that we are actually composing these test components into a new test component (with >> as shown in the previous section), so this composite component can be used anywhere a test component is expected; it is also a first-class value in Haskell, so it can be bound to a name and passed as a parameter.

Additionally, in an imperative language the test components would likely read configuration from the class or global environment they are operating in. In other words, the functions would have to make assumptions about global state. In Haskell, a pure function cannot read or modify mutable global state, so instead the environment is implicitly passed through the functions – ensuring they work in any context. This would be difficult to accomplish in a language that does not have built-in support for function composition and partial function application; the latter being necessary because the test components require some explicit arguments (e.g. the webhook type), but the environment should come from the return value of the previous test component.
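One common way to get an implicitly passed environment in Haskell is a reader monad layered over IO. The sketch below uses `ReaderT` from the `transformers` package to show the idea; it is an assumption for illustration and not necessarily how our framework threads its environment. `Env`, `TestM`, `logStep`, `expectWebHook` and `runWith` are hypothetical names.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical environment shared by every component of one test.
data Env = Env
  { envChannel :: String
  , envLog     :: IORef [String]
  }

-- Test components are IO actions that can also read the Env.
type TestM = ReaderT Env IO

-- Record a step together with the channel taken from the environment.
logStep :: String -> TestM ()
logStep msg = do
  env <- ask
  liftIO (modifyIORef (envLog env) (++ [msg ++ " on " ++ envChannel env]))

subscribe :: TestM ()
subscribe = logStep "subscribe"

-- The hook type is an explicit argument; the environment stays implicit.
-- Applying the function to its argument yields an ordinary component.
expectWebHook :: String -> TestM ()
expectWebHook hookType = logStep ("expect " ++ hookType)

channelVacated :: TestM ()
channelVacated =
     subscribe
  >> expectWebHook "channel_occupied"
  >> expectWebHook "channel_vacated"

-- Supply the environment once, at the point where the test is run.
runWith :: String -> TestM () -> IO [String]
runWith channel component = do
  ref <- newIORef []
  runReaderT component (Env channel ref)
  readIORef ref

main :: IO ()
main = runWith "example-channel" channelVacated >>= mapM_ putStrLn
```

None of the components mention the environment in their arguments, yet each one sees the same channel, and the same composite can be run against a different environment without changing its definition.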

The fact that we get the safety of static type checking and pure functions, while also retaining a level of flexibility comparable to dynamic languages was a major advantage of using Haskell.

Callback spaghetti: gone (thanks STM)

You may also have noticed that the steps are completely linearly described, even though WebHooks may arrive concurrently with other checks. Our solution was to use software transactional memory (STM) to write values to a shared data structure. STM allows these data structures to be modified by atomic transactions that are retried if they conflict with a concurrent change – in a similar way to database transactions.

For example, deleting an entry in an STM hashmap would look like this:

    atomically $ modifyTVar hashmap (HashMap.delete key)

We had a webserver listening for incoming WebHooks in a child thread. When a webhook arrives, the server writes it to an STM hashmap. The test component checking for a webhook can then read it from the hashmap, blocking until it has actually arrived.
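The blocking read can be sketched with STM's `retry`, which suspends a transaction until one of the variables it has read changes. This is a simplified illustration under a few assumptions: it uses `Data.Map.Strict` from `containers` in place of a hashmap, a forked thread with a delay stands in for the webserver, and `HookStore`, `storeWebHook` and `takeWebHook` are hypothetical names.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
  (STM, TVar, atomically, modifyTVar', newTVarIO, readTVar, retry, writeTVar)
import qualified Data.Map.Strict as Map

-- Shared store of received WebHooks, keyed by hook type.
type HookStore = TVar (Map.Map String String)

-- What the webserver thread would do when a WebHook arrives.
storeWebHook :: HookStore -> String -> String -> STM ()
storeWebHook store name payload = modifyTVar' store (Map.insert name payload)

-- Block until the named hook is present, then remove and return it.
-- `retry` suspends the transaction until the TVar is written again.
takeWebHook :: HookStore -> String -> STM String
takeWebHook store name = do
  hooks <- readTVar store
  case Map.lookup name hooks of
    Nothing -> retry
    Just payload -> do
      writeTVar store (Map.delete name hooks)
      pure payload

main :: IO ()
main = do
  store <- newTVarIO Map.empty
  -- Stand-in for the webserver: deliver a hook after a short delay.
  _ <- forkIO $ do
    threadDelay 100000
    atomically (storeWebHook store "channel_vacated" "{\"channel\":\"example\"}")
  -- Reads as a linear step, but waits until the hook has arrived.
  payload <- atomically (takeWebHook store "channel_vacated")
  putStrLn payload
```

Because the lookup and the delete happen inside one transaction, two checks waiting on the same hook cannot both consume it.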

Where Haskell got in the way

Haskell has a reputation of being very concise while still being expressive. In practice, we found the actual size of our code base to be comparable to the Ruby one. But it didn’t suffer from duplication, and over time we’ll gain better test coverage with less test code. However, whilst most of the logic was shorter, we found that there was a large amount of boilerplate required in decoding/encoding Pusher protocol messages into types. This is not required in a dynamically typed language like Ruby. Having to painstakingly define these types did however lead to a more robust test of the structure of the messages, and provides good documentation of what we expect to be receiving.
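To give a flavour of that boilerplate, here is a deliberately tiny sketch of modelling incoming events as a type. It is hypothetical throughout: JSON parsing is left out, payloads are kept as raw strings, and `PusherEvent` and `decodeEvent` are illustrative names rather than our real definitions.

```haskell
-- Hypothetical model of the events a test expects to receive.
data PusherEvent
  = ConnectionEstablished String         -- raw data payload
  | SubscriptionSucceeded String String  -- channel name, raw data payload
  deriving (Eq, Show)

-- Decode from an event name, an optional channel and a raw payload.
-- Anything that does not fit the expected structure is rejected.
decodeEvent :: String -> Maybe String -> String -> Either String PusherEvent
decodeEvent name mChannel payload =
  case (name, mChannel) of
    ("pusher:connection_established", _) ->
      Right (ConnectionEstablished payload)
    ("pusher:subscription_succeeded", Just channel) ->
      Right (SubscriptionSucceeded channel payload)
    ("pusher:subscription_succeeded", Nothing) ->
      Left "subscription_succeeded event without a channel"
    _ ->
      Left ("unexpected event: " ++ name)

main :: IO ()
main = do
  print (decodeEvent "pusher:subscription_succeeded" (Just "example-channel") "{}")
  print (decodeEvent "pusher:subscription_succeeded" Nothing "{}")
```

Even at this size the trade-off is visible: every case has to be spelled out, but a message with the wrong shape fails at the decoding step instead of deep inside a test.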

Also, the fact that Haskell is a less mainstream language led to a few obvious disadvantages: less documentation, not quite as many libraries, poorer tooling (although having on-the-fly type errors show up with ghc-mod is already a massive win over Ruby).

Conclusion: it was awesome!

Despite these issues, we believe that overall Haskell has been a great choice. The key benefit was the excellent support for composition, which allowed us to write flexible, reusable test components. The powerful type system gave us great confidence that our tests worked correctly, without prohibiting expressiveness. We also came to appreciate other more advanced language features such as monad transformers, which I will hopefully get a chance to discuss in a future blog post. As for the way we approached things, we are still new to the language, so I’m sure there are better ways of doing things; I would love to hear suggestions if that’s the case!