Ara Howard has discovered that the ActiveRecord validation mechanism does not ensure data integrity.1 Validations feel a bit like database constraints but it turns out they are really only useful for producing human friendly error messages.

This is because the assertions they define are tested by reading from the database before the changes are written to the database. As you will no doubt recall, phantom reads are not prevented by any isolation mode other than serializable. So unless you are running your database in serializable isolation mode (and you aren’t because nobody does) that means that the use of ActiveRecord validations setup a classic race condition.

On my work project we found this out the hard way. The appearance of multiple records should have been blocked by the validations was a bit surprising. In our case, the impacted models happened to be immutable so we only had to solve this from for the #find_or_create case. We ended up reimplementing #find_or_create so that it does the following:

do a find 2. if we found a matching record return the model object if it does not exist create a savepoint insert the record 4. if the insert succeeds return the new model object if the insert failed roll back to the savepoint re-run the find and return the result

This approach does requires the use of database constraints but, having your data integrity constraints separated from the data model definition has always felt a bit awkward. So I think this more of a feature than a bug.

It would be really nice if this behavior were included by default in ActiveRecord. A similar approach could be used to remove the race conditions in regular creates and saves by simply detecting the insert or update failures and re-executing the validations. This would not even require that the validations/constraints be duplicated. The validations could, in most cases, be generated mechanically from the database constraints. For example, DrySQL already does this.

Such an approach would provide the pretty error messages Rails users expect, neatly combined with the data integrity guarantees that users of modern databases expect.