Dear internet,

Today we screwed up: we applied a broken migration to the running production service and caused a massive outage that lasted several hours, because the rollback function was badly broken as well.

As a result, we had to restore a backup made several hours earlier, losing some new data.

The easiest way to explain why this happened is just to say: “Because it is X’s fault! He is the author of this migration, and he should learn how databases work.” But that is counterproductive.

Instead, as a part of our “Blameless environment” culture, we tend to put all the blame on the CI. It was the CI that let the broken code into the master branch. So, we need to improve it!

We always write post-mortems for all massive incidents that we experience. And we write regression tests for all bugs, so they won’t happen again. But this situation was different: the broken migration passed CI, and it was hard or impossible to test with our current set of tools.

So, let me explain the steps we took to solve this riddle.

We use a very strict Django project setup with several quality checks for our migrations:

- We write all data migrations as typed functions in our main source code. Then we check everything with mypy and test them as regular functions
- We lint migration files with wemake-python-styleguide, which drastically reduces the possibility of bad code inside the migration files
- We use tests that automatically set up the database by applying all migrations before each session
- We use django-migration-linter to find migrations that are not suited for zero-downtime deployments
- Then the code is reviewed by two senior people
- Then we test everything manually with the help of the review apps

And somehow it was still not enough: our server was dead.

When writing the post-mortem for this bug, I spotted that the data in our staging and production services was different. That is why our data migration crashed and left one of the core tables in a broken state.
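To make this failure mode concrete, here is a purely hypothetical sketch. The `normalize_title` function and both datasets are made up for illustration and are not the code or data from the incident. It only shows how the same migration logic can pass on one dataset and crash on another:

```python
from typing import Any, Iterable, List


def normalize_title(title: str) -> str:
    """Hypothetical migration step: it silently assumes a real string."""
    return title.strip().lower()


def run_data_migration(rows: Iterable[Any]) -> List[str]:
    """Apply the step to every row, the way a data migration would."""
    return [normalize_title(row) for row in rows]


def migration_survives(rows: Iterable[Any]) -> bool:
    """Report whether the migration finishes without crashing."""
    try:
        run_data_migration(rows)
    except AttributeError:
        # ``None`` has no ``.strip()``, so a NULL value ends up here.
        return False
    return True


# Made-up datasets: staging is tidy, production has a legacy NULL value.
staging_rows = ['  Hello ', 'World']
production_rows = ['  Hello ', None]
```

With data like this, every check that runs against staging-like rows stays green, and the problem only shows up on production-like rows.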

So, how can we test migrations on some existing data?

That’s where django-test-migrations comes in handy.

The idea of this project is simple:

1. Set some migration as a starting point
2. Create some model’s data that you want to test
3. Run the new migration that you are testing
4. Assert the results!

Let’s illustrate it with some code samples. Full source code is available here.

Here’s the latest version of our model:

    from django.db import models


    class SomeItem(models.Model):
        """We use this model for testing migrations."""

        string_field = models.CharField(max_length=50)
        is_clean = models.BooleanField()

This is a pretty simple model that serves only one purpose: to illustrate the problem. The is_clean field is related to the contents of string_field in some manner, while string_field itself contains only regular text data.

Imagine that you have a data migration that looks like this:

    from django.db import migrations


    def _is_clean_item(instance: 'SomeItem') -> bool:
        """
        Pure function to the actual migration.

        Ideally, it should be moved to ``main_app/logic/migrations``.
        But, as an example it is easier to read them together.
        """
        return ' ' not in instance.string_field


    def _set_clean_flag(apps, schema_editor):
        """
        Performs the data-migration.

        We can't import the ``SomeItem`` model directly as it may be a newer
        version than this migration expects.

        We are using ``.all()`` because
        we don't have a lot of ``SomeItem`` instances.
        In real-life you should not do that.
        """
        SomeItem = apps.get_model('main_app', 'SomeItem')
        for instance in SomeItem.objects.all():
            instance.is_clean = _is_clean_item(instance)
            instance.save(update_fields=['is_clean'])


    def _remove_clean_flags(apps, schema_editor):
        """
        This is just a noop example of a rollback function.

        It is not used in our simple case,
        but it should be implemented for more complex scenarios.
        """


    class Migration(migrations.Migration):
        dependencies = [
            ('main_app', '0002_someitem_is_clean'),
        ]

        operations = [
            migrations.RunPython(_set_clean_flag, _remove_clean_flags),
        ]

And here’s how we are going to test this migration. First, we have to set some migration as a starting point:
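Since `_is_clean_item` is a pure function, it can be type-checked and unit-tested without a database at all. Here is a minimal sketch of that idea. The `HasStringField` protocol, the `FakeItem` stand-in, and the public name `is_clean_item` are assumptions made for this example, not part of the project:

```python
from dataclasses import dataclass
from typing import Protocol


class HasStringField(Protocol):
    """Hypothetical structural type: anything with a ``string_field``."""

    string_field: str


@dataclass
class FakeItem:
    """Hypothetical stand-in for a ``SomeItem`` row, no database needed."""

    string_field: str


def is_clean_item(instance: HasStringField) -> bool:
    """Same rule as ``_is_clean_item``: an item is clean if it has no spaces."""
    return ' ' not in instance.string_field


def test_is_clean_item() -> None:
    """Plain unit test of the rule, runnable with pytest or by calling it."""
    assert is_clean_item(FakeItem(string_field='a'))
    assert not is_clean_item(FakeItem(string_field='a b'))
    # An empty string contains no spaces, so it counts as clean:
    assert is_clean_item(FakeItem(string_field=''))
```

A test like this checks the rule itself, but it says nothing about how the migration behaves on rows that already exist, which is the gap the rest of this post is about.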

    old_state = migrator.before(('main_app', '0002_someitem_is_clean'))

Then we have to get the model class. We cannot import it directly from models, because the model at this migration state might differ from the latest definition stored in our source code:

    SomeItem = old_state.apps.get_model('main_app', 'SomeItem')

Then we need to create some data that we want to test:

    # One instance will be `clean`, the other won't be:
    SomeItem.objects.create(string_field='a')  # clean
    SomeItem.objects.create(string_field='a b')  # contains whitespace, is not clean

Then we will run the migration that we are testing and get the new project state:

    new_state = migrator.after(('main_app', '0003_auto_20191119_2125'))
    SomeItem = new_state.apps.get_model('main_app', 'SomeItem')

And the last step: we need to make some assertions on the resulting data. We have created two model instances before: one clean and one with whitespace. So, let’s check that:

    assert SomeItem.objects.count() == 2
    # One instance is clean, the other is not:
    assert SomeItem.objects.filter(is_clean=True).count() == 1
    assert SomeItem.objects.filter(is_clean=False).count() == 1

And that’s how it works! Now we can test our schema and data transformations with ease. Here is the complete test example:

    import pytest


    @pytest.mark.django_db
    def test_main_migration0002(migrator):
        """Ensures that the second migration works."""
        old_state = migrator.before(('main_app', '0002_someitem_is_clean'))
        SomeItem = old_state.apps.get_model('main_app', 'SomeItem')
        # One instance will be `clean`, the other won't be:
        SomeItem.objects.create(string_field='a')
        SomeItem.objects.create(string_field='a b')

        assert SomeItem.objects.count() == 2
        # Before the data migration runs, both rows are still flagged as clean:
        assert SomeItem.objects.filter(is_clean=True).count() == 2

        new_state = migrator.after(('main_app', '0003_auto_20191119_2125'))
        SomeItem = new_state.apps.get_model('main_app', 'SomeItem')

        assert SomeItem.objects.count() == 2
        # One instance is clean, the other is not:
        assert SomeItem.objects.filter(is_clean=True).count() == 1

By the way, we also support raw unittest cases.

Don’t just assume that your migrations work. Test them!

You can test forward and rollback migrations and their ordering with the help of django-test-migrations. It is simple, friendly, and already works with the test framework of your choice.

I also want to say “thank you” to these awesome people. Without their work it would have taken me much longer to come up with a working solution.