A couple of years ago, we moved our code base to a monorepo, which helped us scale tremendously in terms of code reuse and overall speed of development. We are extremely proud of our ability to run a resilient service that has 99.99% availability with zero downtime upgrades.

From the beginning of this journey, I made a decision to go all-in on JavaScript as our default coding language. The most important reason for this was that I wanted to hire full-stack developers who could work on every aspect of the product, so we chose Angular.js for UI, Node.js for API, and a schema-less JSON database (NoSQL MongoDB). We made all technology decisions based on this one philosophy (another blog coming about what I learned and why I am no longer a fan of full-stack development) and it worked beautifully...for a while.

It Started With Small Problems...

Even though we had the ability to add features at a lightning fast pace, we started seeing occasional downtimes which always seemed to come down to MongoDB. For instance:

We were very happy to have 24x7 availability with primary and secondary instances of MongoDB. However, our performance suddenly deteriorated one day and retrieval started taking more than a second per document. We tried using many tools and profilers, but could not figure out what was happening. Finally, we rebuilt a new server, switched that over as primary, and rebuilt our secondary. Retrieval times dropped to 150ms again. This is still an unsolved mystery!

Our Mongo instance reached 4TB and we were proud of our growing adoption. Due to the lack of tooling around managing large DBs, we relied on indexes to keep the search times low. When NoSQL DBs first became popular, there was no way to create uniqueness, so these features were built as an afterthought. Some of the bloating of our MongoDB was actually due to indexes, but rebuilding them was primitive and the entire DB would lock down.

At one point, we needed to reboot our DB server — and it took MongoDB four hours to come back online. This led to an extended downtime for our service, and we had very little visibility into the MongoDB process and status.

And Then Came the Knockout Punch!

The biggest advantage — and disadvantage — of MongoDB is that it has a flexible schema. This means that documents in the same collection (AKA "table" in the old world) do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data. In a nutshell, there are no strict schema rules and this opens it up to a lot of cowboy tinkering.

While many developers love the flexibility, it also puts a very high degree of responsibility on their shoulders to get things right.

For example, let's consider a simple schema that stores information about a Git repository:

Field name added on provider 12/1/2012 repoOrg 12/1/2012 repoName 12/1/2012 isPrivate 7/17/2014 hasTeams 2/23/2016

As you could guess, the schema is updated over a period of time as fields are added to meet new requirements of an evolving product. This means that depending on when a repository was added to this document, it may or may not have the isPrivate and hasTeams fields. Our backend and frontend services needed to handle both cases gracefully, which led to code like this:

if exists(repo.hasTeams) and repo.hasTeams === true { # do something } else { # do something }

In every single place where repo.hasTeams is used, we needed to add this code. With many microservices, many schemas and 40+ developers adding these blocks all over the place, our codebase was starting to look ugly. Also, every time we saw failures across our system, it was always a spectacular crash with no easy way to recover. I would wager that 90% of our crashes were due to the fact that some piece of code expected a field that didn't exist for that document. Internal brainstorming brought up the idea of building a schema validator and all sorts of hacks, but isn't this what a database should provide out-of-the-box?

One big black mark against Mongo (or any equivalent NoSQL database)!

The straw that broke the camel's back was when we introduced a critical field that absolutely needed to be present for each document in our most important collection. To ensure that every document included the field, we had to retrieve every single document one by one, update it, and then put it back. With millions of documents in the collection, this process caused the database performance to degrade to an unacceptable degree and we had to accept 4+ hours of downtime.

And that's when I decided that NoSQL wasn't going to work for us. To each his own, but we were done struggling with it and causing our customers (and ourselves) unnecessary heartache.

Postgres to the Rescue!

After this last incident, which happened about a year ago, we migrated to PostgreSQL. I can explain the step-by-step process of migration in another blog if you're interested. We have no regrets, and the following factors have greatly improved our availability and resiliency:

Postgres has a strongly typed schema that leaves very little room for errors . You first create the schema for a table and then add rows to the table. You can also define relationships between different tables with rules so that you can store related data across several tables and avoid data duplication. All this means someone on the team can act as a database architect and control the schema which acts as a standard for everyone else to follow.

. You first create the schema for a table and then add rows to the table. You can also define relationships between different tables with rules so that you can store related data across several tables and avoid data duplication. All this means someone on the team can act as a database architect and control the schema which acts as a standard for everyone else to follow. You can change tables in PostgreSQL without requiring to lock it for every operation . For example, you can add a column and set the value to NULL quickly without locking the entire table.

. For example, you can add a column and set the value to quickly without locking the entire table. Postgres also supports JSONB, which lets you create unstructured data — but with data constraint and validation functions to help ensure that JSON documents are more meaningful. The folks at Sisense have written a great blog with a detailed comparison of Postgres vs. MongoDB for JSON documents.

— but with data constraint and validation functions to help ensure that JSON documents are more meaningful. The folks at Sisense have written a great blog with a detailed comparison of Postgres vs. MongoDB for JSON documents. Our database size reduced by 10x since Postgres stores information more efficiently and data isn't unnecessarily duplicated across tables.

since Postgres stores information more efficiently and data isn't unnecessarily duplicated across tables. As was shown in previous studies, we found that Postgres performed much better for indexes and joins, and our service became faster and snappier as a result.

We have been very happy with Postgres since we migrated and we are no longer struggling with managing our database. As a result, we have seen our NPS go up significantly as customers are happier with a platform that "always works."