Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity – Bailis et al. 2015

This paper is an absolute joy to read: seasoned database systems researchers conduct a study of real-world applications from the Ruby community and try not to show too much disdain at what they find, whilst pondering what it might all mean for the underlying data management systems that power these applications and having the grace to accept the database systems share some of the blame. And (if you’re not a Rails developer) before you start getting too smug that this wouldn’t happen in your favourite language / framework they also show that the same issues arise with other combinations in Java, PHP, Python, and Javascript.

As a lens for understanding this modern ORM behavior, we study Ruby on Rails (or, simply, “Rails”), a central player among modern frameworks powering sites… this wildly successful software framework bears an actively antagonistic relationship to database management systems, echoing a familiar refrain of the “NoSQL” movement: get the database out of the way and let the application do the work.

What are the consequences of this impedance mismatch between databases and modern ORM frameworks?

By shunning decades of work on native database concurrency control solutions, Rails has developed a set of primitives for handling application integrity in the application tier—building, from the underlying database system’s perspective, a feral concurrency control system. We examine the design and use of these feral mechanisms and evaluate their effectiveness in practice by analyzing them and experimentally quantifying data integrity violations in practice. Our goal is to understand how this growing class of applications currently interacts with database systems and how we, as a database systems community, can positively engage with these criticisms to better serve the needs of these developers.

67 open source applications built on Ruby on Rails with Active Record are studied, with an average of 27kloc per project. Rails has four main mechanisms for concurrency control:

Transactions : a sequence of operations can be wrapped in a transaction block and will execute in a database transaction – as of Rails 4.0.0 the isolation level can be controlled on a per-transaction basis. Optimistic and pessimistic record locking. Pessimistic locks are based on a SELECT for UPDATE, optimistic locks rely on a special field in the Active Record model. Application level validations – supporting both pre-defined and user-defined validation functions. Each declared validation is run sequentially, and if they all pass the record is updated in the database. Application level associations – which act like foreign key constraints but are maintained in the application layer.

Until the release of Rails 4.2 in December 2014, Rails did not provide native support for database-backed foreign key constraints. In Rails 4.2, foreign keys are supported via manual schema annotations declared separately from each model; declaring an association does not declare a corresponding foreign key constraint and vice-versa.

The application level constraints (validations and associations) are the promoted means of concurrency control:

Rails’s feral mechanisms—validations and associations—are a prominent feature of the Active Record model. In contrast, neither transactions nor locks are actually discussed in the official “Rails Guides,” and, generally, are not promoted as a means of ensuring data integrity. Instead, the Rails documentation [7] prefers validations as they are “are database agnostic, cannot be bypassed by end users, and are convenient to test and maintain.”

Accordingly, when the real-world applications are examined, validations and associations dominate:

Perhaps most notable among these general trends, we find that validations and associations are, respectively, 13.6 and 24.2 times more common than transactions and orders of magnitude more common than locking. These feral mechanisms are—in keeping with the Rails philosophy—favored by these application developers. That is, rather than adopting the use of traditional transactional programming primitives, Rails application writers chose to instead specify correctness criteria and have the ORM system enforce the criteria on their behalf… Given that these criteria are nevertheless being declared by application writers and represent a departure from traditional, transaction-oriented programming, we devote much of the remainder of this work to examining exactly what they are attempting to preserve (and whether they are actually sufficient to do so).

As well as looking at usage in aggregate, “it’s also interesting to study individual applications,” which leads to this “one might expect” gem:

Spree uses only six transactions, one for each of 1.) canceling an order, 2.) approving an order (atomically setting the user ID and timestamp), 3.) transferring shipments between fulfillment locations (e.g., warehouses), 4.) transferring items between shipments, 5.) transferring stock between fulfillment locations, and 6.) updating an order’s specific inventory status. While this is a reasonable set of locations for transactions, in an eCommerce application, one might expect a larger number of scenarios to require transactions, including order placement and stock adjustment.

and …

The remainder of the application corpus contains a number of such fascinating examples, illustrating the often ad-hoc process of deciding upon a concurrency control mechanism.

So, are application level validation and association specified constraints safe?

To begin, recall that each sequence of validations (and model update as well, if validations pass) is wrapped within a database-backed transaction, the validation’s intended integrity will be preserved provided the database is using serializable isolation. However, relational database engines often default to non-serializable isolation; notably for Rails, PostgreSQL and MySQL actually default to, respectively, the weaker Read Committed and Repeatable Read isolation levels. We did not encounter evidence that applications changed the isolation level. Rails does not configure the database isolation level for validations, and none of the application code or configurations we encountered change the default isolation level, either (or mention doing so in documentation).

Given that validations are not likely to be perfectly isolated, Bailis et al. use invariant confluence analysis to figure out which invariants can be preserved under coordination free concurrent execution. Of the ten most popular invariants by usage the most popular, presence, is safe under insertions but not under deletions. The second most popular invariant, uniqueness, is not I-confluent: “that is, if two users concurrently insert or modify records, they can introduce duplicates.”

Overall, a large number of built-in validations are safe under concurrent operation. Under insertions, 86.9% of built-in validation occurrences as I-confluent. Under deletions, only 36.6% of occurrences are I-confluent. However, associations and multi-record uniqueness are—depending on the workload—not I-confluent and are therefore likely to cause problems.

Uniqueness validations would be safe under serialization execution – but Oracle doesn’t support this (it’s strongest isolation level is snapshot isolation), and the PostgresQL serializable isolation level (as of March 2015) contained a confirmed bug that allowed duplicates.

The Rails documentation warns that uniqueness validations may fail and admit duplicate records. Yet, despite the availability of patches that remedy this behavior by the use of an in-database constraint and/or index, Rails provides this incorrect behavior by default.

In an experiment, the authors were easily able to create duplicates with a test application. The uniqueness constraint reduced the number of duplicates by an order of magnitude compared to the same run without the constraint in place, but still permitted many duplicates (700) to be created. With a less pathological workload, the probability of duplicates depends on the distribution of keys – but it is always there.

Feral association validation mechanisms can also lead to violations:

Given that entirely feral mechanisms can introduce broken associations, how many dangling records can be introduced? Once a record is deleted, any later validations will observe it via SELECT calls. However, in the worst case, the feral cascading deletion on the one side of a one-to-many relation can stall indefinitely, allowing an unlimited number of concurrent insertions to the many side of the relation. Thus, validations—at least theoretically—only reduce the worst-case number of dangling records that were inserted prior to deletion; any number of concurrent insertions may occur during validation, leading to unbounded numbers of dangling records.

This was also demonstrated in a sample application.

The preceding experiments demonstrate that, indeed, Active Record is unsafe as deployed by default. Validations are susceptible to data corruption due to sensitivity to weak isolation anomalies…. It is possible that, in fact, the degree of concurrency and data contention within Rails-backed applications simply does not lead to these concurrency races—that, in some sense, validations are “good enough” for many applications. Nevertheless, in both cases, Rails’s feral mechanisms are a poor substitute for their respective database counterparts—at least in terms of integrity.

But what is a poor application developer to do? The choice is between ACID transactions with weak and hard to understand isolation models, or custom feral enforcement in the application/framework that is “an expensive, error-prone, and difficult process neglecting decades of contributions from the database community. ”

We believe application users and framework authors need a new database interface that will enable them to:

Express correctness criteria in the language of their domain model, with minimal friction, while permitting their automatic enforcement. Only pay the price of coordination when necessary Easily deploy to multiple database backends