Wednesday, November 12, 2014 at 8:56AM

This is a guest post by Nikhil Palekar, Systems Architect, FoundationDB.

For many organizations that care a lot about strong consistency and low latency, or that haven’t already built a fault-tolerant application tier on top of their database, adding a multiple data center (MDC) database implementation may create more complexity or unintended consequences than meaningful benefits. Why might that be?

Choosing a database that meets the needs of your application is a significant decision because it’s such a fundamental part of the system you’re building. Without a solid back-end that supports your application and provides meaningful guarantees, the rest of your application is limited in terms of how robust, scalable, and fault-tolerant it can be when stressed.

One way that many developers have started building fault tolerance into their systems is by distributing their database across multiple data centers. “Eventually consistent” systems can be deployed across data centers more easily than databases with “strong consistency” guarantees precisely because they make only limited guarantees about the state of the data they store. However, the decision to distribute a system across multiple physical locations should not be taken lightly, as it introduces additional complexity into the entire system. If your distributed database is the main database of record for your application, inconsistent state or lost data probably isn’t acceptable.

Many applications have not been designed with this multi-location, distributed setup in mind at the application level, so they actually don’t need MDC from the database either! There are definitely situations where MDC is helpful and desirable, but there are plenty of situations, some of which are discussed below, in which you may not need MDC capabilities at the database level.

1. Your application tier might not be scalable and fault tolerant

An application stack that fully utilizes the capabilities of an MDC database has fault tolerance built into each level of the system, from the DNS information down to the hardware running the application. At the DNS level, you would want to have multiple “A” records that point to multiple load balancers to ensure that the URL for your application can be resolved to an active load balancer IP even if one of those load balancers failed or can’t be reached. (Obviously fault tolerance at the DNS provider level is helpful too, but as the user, you can’t do too much to ensure that your DNS provider is resilient to failures.)
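The DNS-level failover described above can be sketched client-side: when a hostname resolves to multiple A records, a client can try each IP until one responds. The function, IPs, and reachability check below are hypothetical illustrations, not any particular resolver’s API.

```python
import random

def pick_healthy_endpoint(a_records, is_reachable):
    """Try each resolved A record until one responds, mimicking
    client-side failover across multiple load balancer IPs."""
    # Shuffle so traffic spreads across the healthy load balancers.
    candidates = list(a_records)
    random.shuffle(candidates)
    for ip in candidates:
        if is_reachable(ip):
            return ip
    raise ConnectionError("no load balancer reachable")

# Hypothetical IPs; in production these come from the DNS lookup,
# and is_reachable would attempt a real TCP connection.
records = ["203.0.113.10", "203.0.113.11"]
healthy = {"203.0.113.11"}
print(pick_healthy_endpoint(records, lambda ip: ip in healthy))
```

In practice many resolvers and browsers already rotate through multiple A records, so the main job at this tier is simply publishing more than one record.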

At the next level down, you would want to ensure that your load balancers are able to scale and handle failures as well. There are a number of ways to add fault tolerance at this level, such as running an “elastic” load balancer inside what Amazon calls an “auto scaling” group, which both scales the tier up under load and replaces failed server “instances,” eliminating the load balancer tier as a single point of failure.
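The scaling decision itself can be illustrated with a simplified target-tracking calculation. This is not AWS’s actual algorithm, just a sketch of the idea: size the group so average utilization approaches a target, while keeping at least two instances so the tier never collapses to a single point of failure.

```python
import math

def desired_capacity(current, avg_cpu, target_cpu=50.0, minimum=2, maximum=10):
    """Simplified target-tracking sketch: scale the group so average CPU
    approaches target_cpu, clamped to the group's bounds. Keeping
    minimum >= 2 avoids a single point of failure."""
    desired = math.ceil(current * avg_cpu / target_cpu)
    return max(minimum, min(maximum, desired))

# Four instances averaging 80% CPU against a 50% target -> grow to 7.
print(desired_capacity(current=4, avg_cpu=80.0))  # -> 7
```

Real auto scaling policies add cooldowns and smoothing over metric windows, but the proportional resize above captures the core behavior.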

Next come the application servers, which receive traffic from the machines acting as load balancers. Just like the load-balancer tier, you would want at least a couple of application servers running your application to ensure that you’re able to service application requests even if one or more application instances experience failures.

Finally, you arrive at the database tier, where there’s another challenge. Even though you need to ensure that you have fault tolerance when facing individual machine failures, you also need to ensure that your database can be an actual “source of truth” for your application, regardless of the node that failed. That’s a fairly hard problem to solve, but many modern distributed databases do have the ability to automatically handle machine failures and ensure that data is durably stored and replicated between machines in the cluster so that no data is lost if a machine fails.
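The durability guarantee described above — no data lost when a machine fails — typically comes from replicating each write and only acknowledging it once enough replicas hold it. Here is a toy majority-quorum sketch (the class and its behavior are illustrative, not any specific database’s protocol):

```python
class ReplicatedStore:
    """Toy majority-quorum store: a write is durable once a majority
    of replicas hold it, so losing one machine loses no data."""
    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.up = [True] * replica_count  # liveness of each machine

    def write(self, key, value):
        # Apply the write to every live replica and count acknowledgments.
        acks = 0
        for i, replica in enumerate(self.replicas):
            if self.up[i]:
                replica[key] = value
                acks += 1
        if acks <= len(self.replicas) // 2:
            raise RuntimeError("write not durable: no majority of replicas")
        return acks

    def read(self, key):
        # Read from any live replica that holds the key.
        for i, replica in enumerate(self.replicas):
            if self.up[i] and key in replica:
                return replica[key]
        raise KeyError(key)

store = ReplicatedStore()
store.write("account:42", 100)
store.up[0] = False              # one machine fails
print(store.read("account:42"))  # -> 100, the data survives
```

Production systems layer leader election, re-replication, and read quorums on top of this idea, but the invariant is the same: acknowledged data outlives any single machine.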

Once you have all of that set up, the next step would be to potentially replicate the entire setup across several data centers or availability zones of your hosting/cloud provider to insulate you from failures in a single location. So at a very high level, you need to take what you just set up and replicate it in multiple locations!

This configuration has many different moving parts and can be a challenge to manage, and it’s also far more expensive than running an application in a single data center with more basic fault tolerance. Unless you’re willing to devote resources to setting up and managing this infrastructure, and have customers demanding that your application never go down due to an unexpected outage on a fairly reliable platform like Amazon, the MDC capabilities of your database may not matter all that much and may not provide better service to your customers than a single-data-center solution (for your entire application stack) would.

2. Latency matters a lot

Different databases require different amounts of communication between nodes and between data centers in response to database operations. An MDC configuration of a transactional, strongly consistent system requires communication between the database machines before a transaction can be committed, so you have to build that latency cost into your evaluation of the system’s performance. That cost doesn’t have to be huge if the ping time between the data centers is fairly small, but it is an additional overhead that impacts the overall performance of the system. If latency is your biggest concern and you need an MDC system, you should consider whether the potential latency savings of an eventually consistent system outweighs the limited data guarantees those systems provide. It’s worth noting that with this approach you’re essentially giving up the ability to say that the data across the multiple nodes and multiple data centers is actually consistent.
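The latency cost is easy to put a rough number on: a strongly consistent MDC commit pays at least one cross-data-center round trip on top of local coordination. The figures below are illustrative assumptions, not measurements of any particular system.

```python
def commit_latency_ms(local_rtt_ms, cross_dc_rtt_ms, cross_dc_round_trips=1):
    """Back-of-envelope estimate: local coordination plus one or more
    cross-data-center round trips before a commit is acknowledged."""
    return local_rtt_ms + cross_dc_round_trips * cross_dc_rtt_ms

# Illustrative numbers only: ~1 ms inside a DC, ~70 ms coast-to-coast.
print(commit_latency_ms(1.0, 70.0))  # -> 71.0
```

With single-digit-millisecond pings between nearby data centers the overhead is modest; across continents it dominates every commit.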

That difference in consistency is a big deal for an operational data store that the application relies on as a “source of truth” about the data it processes. Of course, if you’re using a different source of truth for your application, that’s another system you have to worry about scaling and breaking. If that database runs on a single machine or in a single data center, you now have a single point of failure that could impact the overall availability and reliability of your entire application.

An eventually consistent database working in MDCs is a design decision that introduces risks and failure modes as a tradeoff. Whether that tradeoff is worth it depends on your preference for dealing with failures and the importance of latency.

3. "MDC" means worldwide, synchronously connected distributed data centers

Moving data between data centers across the world by itself isn’t a huge problem. However, when you want to provide a consistent data record across the entire system and provide read and write access to any of those systems concurrently, it gets much more challenging.

For example, imagine if two users who connect to geographically diverse data centers both want to update a particular piece of data in the database. Neither transaction can succeed by talking only with its local data center in total isolation from the other data centers, and any changes to the database also need to be replicated to all data centers in some way before a transaction can be considered “committed,” to ensure that it’s durable and non-conflicting. So at that point, you’re paying geographic latency penalties on every commit, and conflict resolution can be slower because transactions processed by the system need to be reconciled before each one can be committed.

Commutative Replicated Data Types (CRDTs) are one way to approach these types of problems with eventually consistent systems but are not a general purpose solution for arbitrary reads and writes to a database because not all operations can be applied commutatively (and thus, don’t work with CRDTs). As mentioned above with respect to latency considerations, if the eventually-consistent model fits your application’s requirements and needs, CRDTs can provide some level of conflict resolution and consistency in place of consensus when performing updates across multiple nodes. Alternatively, if you actually just need local, read access from different data centers, an approach that uses asynchronous replication from your transactional, operational data store to distributed, read-only data centers around the globe may just fit the bill.
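A grow-only counter is the classic minimal CRDT and shows why merges need no consensus: each node increments only its own slot, and merging takes the per-node maximum, which is commutative, associative, and idempotent. The class below is a standard textbook sketch, not FoundationDB’s or any other product’s implementation.

```python
class GCounter:
    """Grow-only counter CRDT: each node increments its own slot;
    merge takes the per-node max, so replicas converge regardless
    of the order in which they exchange state."""
    def __init__(self):
        self.counts = {}  # node_id -> count observed from that node

    def increment(self, node_id, amount=1):
        self.counts[node_id] = self.counts.get(node_id, 0) + amount

    def merge(self, other):
        merged = GCounter()
        for node in set(self.counts) | set(other.counts):
            merged.counts[node] = max(self.counts.get(node, 0),
                                      other.counts.get(node, 0))
        return merged

    def value(self):
        return sum(self.counts.values())

# Two data centers update independently, then exchange state.
east, west = GCounter(), GCounter()
east.increment("east"); east.increment("east")
west.increment("west")
print(east.merge(west).value())  # -> 3
```

Note what the structure cannot do: it only grows, and arbitrary read-modify-write operations don’t fit this mold — which is exactly the “not a general purpose solution” caveat above.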

In summary

An MDC database deployment can provide valuable benefits to organizations and developers dealing with data center outages, machine failures, and disaster recovery. However, configuring and operating an entire application stack consistently across multiple locations is a non-trivial (and potentially expensive) task that can introduce more points of failure than a single-data center deployment.

Deploying the entire stack correctly in a fault-tolerant and robust way is what actually provides the most benefit to developers and operators. As with many technology decisions, there are various tradeoffs that have to be weighed, and this decision about MDC deployment is no different.