What’s the tech?

RethinkDB is an open-source distributed database that stores JSON documents.

Overview

It is both developer and operations oriented - combining an easy-to-use, powerful query language with simple controls for operating at scale with high availability. Drawing on lessons learned from NoSQL databases like MongoDB, CouchDB, Riak and Cassandra, RethinkDB is focussed on giving developers the best of both worlds as a foundation for their applications.

Founded in 2009, RethinkDB came out of Y Combinator and is a relatively new player on the data storage scene. The database itself has been available for a little over a year now and it is still undergoing heavy development. Their stability page outlines how stable RethinkDB currently is for the most common usage patterns.

RethinkDB have a great video that introduces the main features here.

Usage

We began using RethinkDB right from the beginning of Workshape.io. It was very much a case of perfect timing. When I began creating the prototype I knew I wanted to use a data storage layer that offered flexibility in terms of schema. NoSQL was definitely the way to go, but I had always been somewhat resistant to adopting MongoDB or CouchDB. Whilst I was looking around for options, a link about RethinkDB had just been shared on Hacker News - the website caught my eye and pretty much from there the love affair began.

Their website gives a brilliant introduction. Written in a tutorial style, it is hard to resist spinning up an instance and following the on-screen instructions to get your first taste. I did just that, which gave me hands-on experience of just how clean and easy it is to use.

From there, whilst in the early stages of developing Workshape.io, it just made sense to continue using RethinkDB. Their main selling points around a flexible schema and ease of use were hard to deny - I could just focus on building the application in a fast and agile manner. The browser-based GUI is also particularly impressive - giving you the health status of your cluster and a way to query your data.

One thing I had to overcome was my standard mindset of thinking of queries in SQL as opposed to ReQL. I may have leaned quite heavily on their SQL to ReQL cheat sheet in the early days. Now though, I feel very comfortable with the language - it feels incredibly clean and expressive. There are official native API implementations for JavaScript, Python and Ruby, so queries are written in your application's own language - an aspect I much prefer over SQL.

About 8 months on from initial development, and moving from pet project to start-up, the love affair continues with RethinkDB firmly in place as our data storage layer. We currently operate a single node cluster on a 2GB box on Digital Ocean. Our usage is still quite low at present with only 120,000 documents in DB - but we haven't experienced any issues thus far.

Although we have come quite a long way with RethinkDB, we see this moment as just the beginning of our journey and are very excited to reap the benefits of the tool in the future.

Features for the Future

There are 3 aspects of RethinkDB that really excite us all at Workshape.io:

changes API (realtime push)

horizontal scaling

distributed parallelised queries

The most incredible aspect of all three of these features is that they are so simple to use that adopting them in our application will be trivial once we find a suitable use case. This is a massive plus for RethinkDB - we can focus on building our app knowing that RethinkDB have reduced previously challenging software problems to a simple-to-use API that protects us from the complexity whilst still giving us all the benefits.

The changes API is introduced in this blog post and is part of the recent version 1.16 release.

We currently use the changes API in combination with Socket.io for our internal analytics dashboard so that it reflects real time information about key metrics of the platform.

The snippet below demonstrates our usage of the changes API.

    Live.prototype.monitor = function() {
      this.db.changes(
        r.table('shape'),
        function(err, change) {
          ...
          that.socketio.emit('shape-update', change);
        }
      );
      ...
    };

Whenever a change occurs, RethinkDB emits a message containing two properties, old_val and new_val, which we forward to the client using Socket.io. We love how simple it is, and how little we need to write, to push updates to our front-end without constantly re-polling the database (as we would have to with an RDBMS).
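To illustrate what a client can do with those two properties, here is a minimal sketch rather than our production code. It assumes the usual changefeed convention that old_val is null for a newly inserted document and new_val is null for a deleted one. The helper names classifyChange and forwardChange are hypothetical, and a plain Node EventEmitter stands in for the Socket.io connection so the snippet runs without a database.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: work out what kind of change a message describes.
// Assumes old_val === null means insert and new_val === null means delete;
// anything else is treated as an update.
function classifyChange(change) {
  if (change.old_val === null && change.new_val !== null) return 'insert';
  if (change.old_val !== null && change.new_val === null) return 'delete';
  return 'update';
}

// Hypothetical helper: forward a change to anything that has an
// emit(event, payload) method, such as a Socket.io socket.
function forwardChange(emitter, change) {
  const payload = {
    type: classifyChange(change),
    old_val: change.old_val,
    new_val: change.new_val
  };
  emitter.emit('shape-update', payload);
  return payload;
}

// Stand-in for the Socket.io connection (an assumption for this sketch).
const fakeSocket = new EventEmitter();
const received = [];
fakeSocket.on('shape-update', (msg) => received.push(msg));

// Three hand-written messages shaped like changefeed output.
forwardChange(fakeSocket, { old_val: null, new_val: { id: 1, name: 'circle' } });
forwardChange(fakeSocket, {
  old_val: { id: 1, name: 'circle' },
  new_val: { id: 1, name: 'square' }
});
forwardChange(fakeSocket, { old_val: { id: 1, name: 'square' }, new_val: null });

// Logs the three types in order: insert, update, delete.
console.log(received.map((m) => m.type));
```

Tagging each message with a type is only one possible design - the front-end could just as easily inspect old_val and new_val itself.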

We look forward to blogging about RethinkDB in the future when we move to a multi-node cluster and operate at a larger scale.

Recommendations

With all this being said, it is very important to understand when to consider using a NoSQL database over a SQL/RDBMS database. For some projects it is just not suitable, and I encourage anyone who wants to use RethinkDB to check out their FAQ page before proceeding. I'd also recommend reading this blog post about choosing between the two.

If you find that your project fits the cases where RethinkDB is a good idea, then my personal recommendation would be - go for it!

Thanks for reading!