Recently there have been two fairly high profile field reports on MongoDB that show it in a very unfavorable light. The majority of the criticism centers on a combination of performance problems and data loss. Before we continue it should be noted that these are not formal case studies. Rather they are field reports by development teams that used MongoDB in the recent past.

We’ll start with the report by Michael Schurter of Urban Airship. Urban Airship had already experienced problems with MongoDB and had migrated most of their data back to PostgreSQL prior to writing this report. The data they left behind seemed ideal for MongoDB. He describes it as:

Ephemeral – if we lose it we experience service degradation for a short while, but nothing catastrophic

Small – easily fits into memory (~15 GB)

Secondary index – In a key/value store we would have had to manage a secondary index manually

The first problem cited by Michael was MongoDB’s inability to utilize multiple cores on their primary server. Though it had 16 available, the global write lock effectively limited them to a single core. To work around this limitation they tried running multiple mongods on the server. A mongod is “the primary database process that runs on an individual server”. The use of multiple mongods involves sharding, which brings a lot of complexity of its own. The required components include:

Exactly three mongod config servers

One mongos router per application server

Arbiters, at least three per replication set

The mongods themselves

The mongos routers were quite effective at balancing data. However, they experienced the occasional crash, an issue that 10gen believes is fixed in the recently released v. 2.0.1. Another problem cited by Michael is that “mongos instances can use a lot of CPU and seem to have random spikes where they fully utilize every core very briefly”.

The problem with the mongods is that bringing up new replication members generally means suspending all write operations for a period of time. Michael writes, “Even 40 updates per second was enough of a trickle to prevent a new set member from leaving RECOVERING to becoming a SECONDARY. We had to shutdown mongoses to cut off all traffic to bring up a new member.” There also appears to be a bug specific to the Java client that won’t allow it to complete any operation while there are mongods still in recovery mode.

Michael concludes with,

Right now we’re getting by with 2 shards on 2 dedicated servers and then mongoses and config servers spread throughout other servers. There appears to be some data loss occurring, though due to the ephemeral fast changing nature of this dataset it’s very difficult to determine definitively or reproduce independently. So we’re trying to migrate off of MongoDB to a custom service better suited for this dataset ASAP.

Our second field report comes from an anonymous source and is titled Don’t use MongoDB. The author claims to have used it on a large userbase with “10s of millions of users” at a “high profile company”. If this is true then there are certainly valid reasons for the author to want to remain anonymous. Still, we wouldn’t be reporting on this if not for the corroborating report by Michael Schurter.

The first issue is well known: MongoDB’s default configuration uses asynchronous writes. Most people are aware of the immediate ramification, which is not knowing whether or not a write operation succeeded. A more subtle problem mentioned by the author is that there is no way of knowing when a write operation completes. With connection pooling, this may lead one to attempt to read information before it has been committed.
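The read-before-commit hazard can be illustrated with a toy model. The sketch below is not MongoDB code: `ToyServer`, its per-connection queues, and its `get_last_error` method are hypothetical stand-ins. It assumes that operations sent on one connection are handled in order, while a read arriving on a different pooled connection may run before an unacknowledged write has been applied.

```python
from collections import deque


class ToyServer:
    """In-memory stand-in for a database server (hypothetical, not MongoDB)."""

    def __init__(self):
        self.data = {}
        self.queues = {}  # connection id -> writes received but not yet applied

    def send_write(self, conn, key, value):
        # Fire-and-forget: the write is queued and the caller returns at once.
        self.queues.setdefault(conn, deque()).append((key, value))

    def _drain(self, conn):
        # Apply every write still queued on this one connection.
        queue = self.queues.get(conn, deque())
        while queue:
            key, value = queue.popleft()
            self.data[key] = value

    def read(self, conn, key):
        # Assumption: operations on a single connection are handled in order,
        # so earlier writes on the same connection are applied first.
        self._drain(conn)
        return self.data.get(key)

    def get_last_error(self, conn):
        # Toy acknowledgement: returns only after this connection's writes apply.
        self._drain(conn)
        return {"ok": 1, "err": None}


server = ToyServer()
server.send_write("conn-A", "user:1", "alice")

# The pool hands the reader a different connection: the write is not visible.
before_ack = server.read("conn-B", "user:1")

# Waiting for an acknowledgement on the writing connection closes the gap.
server.get_last_error("conn-A")
after_ack = server.read("conn-B", "user:1")

print(before_ack, after_ack)
```

In this model the first read returns nothing because the write is still queued on the other connection; once the writer waits for an acknowledgement, the same read sees the value.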

Data loss is a serious issue for the author. He cites the following ways in which data was lost:

Here is a list of ways we personally experienced records go missing:

They just disappeared sometimes. Cause unknown.

Recovery on corrupt database was not successful, pre transaction log.

Replication between master and slave had *gaps* in the oplogs, causing slaves to be missing records the master had. Yes, there is no checksum, and yes, the replication status had the slaves current.

Replication just stops sometimes, without error. Monitor replication status!
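The quoted advice to monitor replication status could be sketched roughly as follows. This is a minimal sketch under stated assumptions: it takes a dictionary shaped like the output of MongoDB's `replSetGetStatus` command (a `members` list with `name`, `stateStr`, and `optimeDate` fields), and the function name, the 60-second threshold, and the sample data are all hypothetical. It would catch stalled or lagging members only; it would not detect the oplog gaps the author describes, since in that case the status reported the slaves as current.

```python
from datetime import datetime, timedelta

# Member states treated as normal; anything else is reported.
HEALTHY_STATES = {"PRIMARY", "SECONDARY", "ARBITER"}


def replication_report(status, max_lag=timedelta(seconds=60)):
    """Return human-readable problems found in a replSetGetStatus-like dict."""
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        return ["no primary found"]

    problems = []
    for member in members:
        state = member["stateStr"]
        if state not in HEALTHY_STATES:
            problems.append(f"{member['name']} is {state}")
        elif state == "SECONDARY":
            # Lag is how far the secondary's last applied op trails the primary's.
            lag = primary["optimeDate"] - member["optimeDate"]
            if lag > max_lag:
                problems.append(
                    f"{member['name']} lags by {int(lag.total_seconds())}s"
                )
    return problems


# Made-up sample data for illustration only.
now = datetime(2011, 11, 10, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=300)},
        {"name": "db3:27017", "stateStr": "RECOVERING",
         "optimeDate": now - timedelta(hours=2)},
    ]
}

problems = replication_report(sample_status)
for line in problems:
    print(line)
```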

Like Michael, he cites problems caused by MongoDB’s global write lock. He also had problems adding shards to an operational system, claiming “Adding a shard under heavy load is a nightmare. Mongo either moves chunks between shards so quickly it DOSes the production traffic, or refuses to move chunks altogether.”

Mongos reliability was once again brought up,

The mongod/config server/mongos architecture is actually pretty reasonable and clever. Unfortunately, mongos is complete garbage. Under load, it crashed anywhere from every few hours to every few days. Restart supervision didn't always help b/c sometimes it would throw some assertion that would bail out a critical thread, but the process would stay running. Double fail. It got so bad the only usable way we found to run mongos was to run haproxy in front of dozens of mongos instances, and to have a job that slowly rotated through them and killed them to keep fresh/live ones in the pool. No joke.

After mentioning the old 1.6 bug where replica sets would sometimes choose the wrong node and cause massive data loss, the author then cites problems with replication on busy servers:

Replication would often, again, either DOS the master, or replicate so slowly that it would take far too long and the oplog would be exhausted (even with a 50G oplog). We had a busy, large dataset that we simply could not replicate b/c of this dynamic. It was a harrowing month or two of finger crossing before we got it onto a different database system.
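The oplog exhaustion described in the quote comes down to simple arithmetic: a new member can only catch up if copying the dataset takes less time than the oplog takes to wrap around. The sketch below works through that comparison. Only the 50 GB oplog size comes from the quote; the write rate, dataset size, and copy rate are hypothetical figures chosen for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def oplog_window_hours(oplog_bytes, oplog_bytes_per_second):
    """Hours of write history the oplog holds before old entries are overwritten."""
    return oplog_bytes / oplog_bytes_per_second / 3600


def initial_sync_hours(dataset_bytes, copy_bytes_per_second):
    """Hours needed to copy the full dataset to a new member."""
    return dataset_bytes / copy_bytes_per_second / 3600


# 50 GiB oplog (from the quote); every other number is an assumption.
window = oplog_window_hours(50 * GIB, 5 * MIB)        # 5 MiB/s of oplog writes
sync = initial_sync_hours(1024 * GIB, 20 * MIB)       # 1 TiB copied at 20 MiB/s

# The new member can only catch up if the copy finishes inside the window.
can_catch_up = sync < window

print(f"oplog window: {window:.2f} h, initial sync: {sync:.2f} h, "
      f"can catch up: {can_catch_up}")
```

Under these assumed rates the copy takes several times longer than the oplog window, which is the dynamic the author describes: throttling the copy protects the master but guarantees the oplog wraps before the sync completes.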

The author concludes by accusing 10gen of being more concerned with “raw req/s per resource” than preventing data loss and ensuring availability.

10gen’s Response

Eliot Horowitz, CTO of 10gen, took the time to respond to the complaints outlined in the anonymous report. He begins by stating,

First, I tried to find any client of ours with a track record like this and have been unsuccessful. I personally have looked at every single customer case that’s ever come in (there are about 1600 of them) and cannot match this story to any of them. I am confused as to the origin here, so answers cannot be complete in some cases.

With regard to the first four data loss issues, he wrote,

There has never been a case of a record disappearing that we either have not been able to trace to a bug that was fixed immediately, or other environmental issues. If you can link to a case number, we can at least try to understand or explain what happened. Clearly a case like this would be incredibly serious, and if this did happen to you I hope you told us and if you did, we were able to understand and fix immediately.

This is expected, repairing was generally meant for single servers, which itself is not recommended without journaling. If a secondary crashes without journaling, you should resync it from the primary. As an FYI, journaling is the default and almost always used in v2.0.

Do you have the case number? I do not see a case where this happened, but if true would obviously be a critical bug.

If you mean that an error condition can occur without issuing errors to a client, then yes, this is possible. If you want verification that replication is working at write time, you can do it with w=2 getLastError parameter.
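The `w=2` option mentioned in the response refers to the `getLastError` command, which a client issues after a write to wait until the write has reached the requested number of replica set members. A minimal sketch of building that command document is shown below; the helper name and its defaults are hypothetical, while `getlasterror`, `w`, `wtimeout`, and `j` are fields of the command as it existed in MongoDB of that era. Sending the document to a server (for example through a driver's generic command call) is left out so the example runs without a database.

```python
def get_last_error_command(w=1, wtimeout_ms=None, journal=False):
    """Build a getLastError command document (hypothetical helper).

    w           -- number of replica set members that must have the write
    wtimeout_ms -- give up waiting after this many milliseconds, if set
    journal     -- also wait for the write to reach the on-disk journal
    """
    # The command name must be the first key; Python dicts keep insertion order.
    command = {"getlasterror": 1, "w": w}
    if wtimeout_ms is not None:
        command["wtimeout"] = wtimeout_ms
    if journal:
        command["j"] = True
    return command


# Wait until the write is on two members, but no longer than five seconds.
cmd = get_last_error_command(w=2, wtimeout_ms=5000)
print(cmd)
```

The timeout is worth setting in practice: without it, a request for `w=2` can block indefinitely when too few members are available to acknowledge the write.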

Eliot agrees that shards cannot be added once a MongoDB system reaches capacity and says that work is being done to reduce the impact of the global write lock.

As for the rest of the issues, Eliot maintains that he has never heard of them and that he would like to know more details. He also stresses that their defect tracking system is publicly visible.

The user “harryh”, who claims to be an engineer at the popular site Foursquare, supports him:

About a year and a half ago my colleagues and I made the decision to migrate to MongoDB for our primary data store. Currently we have dozens of MongoDB instances across several different data clusters storing over a TB of data and handling 10s of thousands of requests per second (mostly reads but the write load is reasonably high as well).

Follow-up by the Anonymous Writer