What’s the problem?

Recently, I was in charge of some heavy migrations at my company, and I had to find a way to make them run smoothly.

We work with Capistrano, and the biggest problem was that I couldn’t lock the deployment process for too long. Some of those migrations were structural, and some were data related, updating hundreds of thousands to millions of rows in the database.

I ended up writing a migration flow that mixed raw SQL, database schema changes through ActiveRecord, and asynchronous workers launched either from the migration itself or from rake tasks. Everything ran in a specific order, starting with the heavy changes, then the structural ones, and ending with several consistency checks.

It worked, and it’s what I usually do, but writing this flow and making it all work together was a real pain in the ass. I know I cannot avoid this kind of process and that it’s part of the scaling phase of a company, but burning time going back to raw SQL queries and writing different types of workers and rake tasks was something I felt could be at least partially improved.

I decided to look at what solutions are out there, and it seems many companies just build their own mechanisms depending on their structure. Small projects are too small to worry about this, and bigger companies just write their own system.

But what about the guys in between? There aren’t many solutions beyond what I was already doing.

So I decided to write a gem.

What’s my solution?

I wrote RailsAsyncMigrations, which is a very simple extension of ActiveRecord::Migration.

For now, you can use it with Sidekiq or Delayed::Job, which helps split up your data or structure changes while keeping the server load under control.

It does not require heavy work on your end, since it goes through ActiveRecord, and can be customised pretty easily too.

It’s not for all migrations, but when you need to slowly move data, sanitize or change the database structure without needing the result right after your deployment, this helps quite a lot to ease the coding process.

It works in 5 minutes… I swear!

For this example, I’ll use Delayed::Job as the reference, but this can be changed to Sidekiq in literally one line if you configure it in your project.

Make a new project

Open your console, go to your projects directory, and start by typing:

rails new my_project
cd my_project

Install Delayed::Job

Then add Delayed::Job and its only dependency to your project’s Gemfile:

gem 'delayed_job_active_record'
gem 'daemons'

Go back to your console and run these commands to finish the installation:

bundle install
rails generate delayed_job:active_record
rake db:migrate

Install RailsAsyncMigrations

Everything went well so far? You can now install RailsAsyncMigrations by adding it to your Gemfile as well:

gem 'rails_async_migrations'

You also have to run its installer, which adds a simple table needed to keep the state of our future asynchronous migrations.

bundle install
rails generate rails_async_migrations:install
rake db:migrate

Use it now!

That’s pretty much it: you’re good to go and can add migrations. Let’s see an example.

rails generate migration "this_is_a_test"

Open the latest file in db/migrate/ and change it as follows:

class ThisIsATest < ActiveRecord::Migration[5.2]
  turn_async def change
    create_table 'tests' do |t|
      t.string :test
    end
  end
end

What happens then? The turn_async keyword tells ActiveRecord::Migration to use our parallel migration queue instead of just running everything in the same process.

Let’s run it and see.

$ rake db:migrate

But the migration was run?! Yes: with RailsAsyncMigrations the migration still goes through the classical migration run, but the content of methods such as change, up, or down is skipped during that run.

If you run rake db:rollback, it will be taken away as well. So you can go in any order or direction and use your migration commands without worrying about the ones you turned asynchronous.

At this point, a row has been added to our async_schema_migrations table and is ready to be run. Its starting state is created, and it will go through multiple states.
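To give a feel for what a keyword like this can do, here’s a minimal plain-Ruby sketch (no Rails needed). It is not the gem’s actual implementation: AsyncSketch, Interceptor, QUEUE and FakeMigration are names I made up for the illustration, and the array only stands in for the state table.

```ruby
# Plain-Ruby sketch; every name here is hypothetical, not the gem's API.
module AsyncSketch
  QUEUE = [] # stands in for the table that tracks asynchronous migrations

  # Prepended in front of the migration class, so these definitions
  # take precedence over the migration's own change/up/down.
  module Interceptor
    %i[change up down].each do |direction|
      define_method(direction) do
        # Record the work to do later instead of executing the body now.
        QUEUE << { migration: self.class.name, direction: direction, state: 'created' }
        nil
      end
    end
  end

  # Class-level macro. Arguments are accepted and ignored, so both
  # `turn_async` alone and `turn_async def change` are valid.
  def turn_async(*_method_names)
    prepend Interceptor
  end
end

class FakeMigration
  extend AsyncSketch

  attr_reader :ran

  turn_async def change
    @ran = true # imagine a slow schema or data change here
  end
end

migration = FakeMigration.new
migration.change
```

After migration.change, the original body has not run (migration.ran is still nil) and QUEUE holds a single entry in the created state, waiting for a worker to pick it up.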

The whole point of this library is to avoid building up too much on your side for what is a pretty simple concept. Under the hood, a lot is happening, but you don’t need to worry about that.

If Delayed::Job is already running on your machine, chances are the migration is already done, since it was pretty simple to process.

In any case, make sure the queue is being processed by running this in your console:

$ bin/delayed_job start

It was that simple. All the migrations you turn_async will now be launched via Delayed::Job in this parallel pipe.

What should I use it for?

Be aware that turning your migrations asynchronous adds difficulty; ensuring data consistency when building your migrations is crucial.

This quick example used ActiveRecord::Migration features but ignored the idempotency requirement of worker systems, which you will face when using the gem in bigger projects.

When using this parallel queue, you have to ensure your data alterations can be repeated multiple times, in case the worker runs again. Adding conditions to check whether something was already done, or being careful with your queries, does the trick.

I personally use it to alter data that can be updated slowly, without risk of breaking it when the migration runs multiple times.

MyModel.where(something: true).find_each do |my_model|
  my_model.update! something: false
end
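To show why a loop like this one is safe to repeat, here’s a small plain-Ruby sketch using an in-memory array instead of a real table. Row, ROWS and backfill are hypothetical names made up for the illustration; the only point is that filtering on the not-yet-updated rows makes a second run a no-op.

```ruby
# Hypothetical in-memory stand-in for a table; no database needed.
Row = Struct.new(:id, :something)

ROWS = [Row.new(1, true), Row.new(2, false), Row.new(3, true)]

# Updates only the rows that still need it and returns how many it touched.
def backfill(rows)
  pending = rows.select { |row| row.something } # rows not migrated yet
  pending.each { |row| row.something = false }
  pending.size
end

first_run  = backfill(ROWS) # does the actual work
second_run = backfill(ROWS) # simulates the worker being run again
```

The first run touches two rows and the second touches none, so a retry leaves the data exactly as it was.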

What if I’ve got multiple migrations?

It would be a total mess if all workers were launched at the same time, so there’s a queue with a specific order of execution, just as with synchronous migrations.

Once a migration is done, the queue moves on to the next one. If it has failed, it locks the queue and relies on the natural retry system of your workers, and eventually passes to the next one.

If there’s a failure, check the logs of your worker. You can fix the problem and let it try again, or change its state in the database and manually recheck the queue via rake rails_async_migrations:check_queue
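The ordering and locking behaviour described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not the gem’s code: Entry and check_queue are hypothetical names, and I only use the created, done and failed states mentioned in this article.

```ruby
# Hypothetical queue entry: a version, a state and the work to perform.
Entry = Struct.new(:version, :state, :job)

# Walks the queue in version order. A failure marks the entry as failed
# and stops the pass, so later entries wait until it succeeds.
def check_queue(entries)
  entries.sort_by(&:version).each do |entry|
    next if entry.state == 'done'

    begin
      entry.job.call
      entry.state = 'done'
    rescue StandardError
      entry.state = 'failed'
      break # lock the queue at the failing entry
    end
  end
  entries
end

attempts = 0
flaky = lambda do
  attempts += 1
  raise 'temporary failure' if attempts == 1 # fails once, then succeeds
end

queue = [
  Entry.new(1, 'created', -> {}),
  Entry.new(2, 'created', flaky),
  Entry.new(3, 'created', -> {})
]

check_queue(queue)
after_first_pass = queue.map(&:state)

check_queue(queue) # simulates the worker retry
after_retry = queue.map(&:state)
```

After the first pass the states are done, failed, created: the third entry never started because the second one blocked the queue. After the retry, all three are done.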

What if I mix synchronous and asynchronous migrations together?

Any migration you don’t turn asynchronous behaves exactly as it normally would. You can put turn_async on any migration you want and keep synchronous ones in between; the gem just moves the former to the new queue and keeps running the others through the classical process.

What if I want to use Sidekiq?

Once again, just add a bit of configuration:

RailsAsyncMigrations.config do |config|
  config.workers = :sidekiq
end

Extending the principle

The async_schema_migrations table is very straightforward. You can build a view on top of it to check the state of the different migrations, and force some of them to be removed or updated without letting them run.

If you want to go further with this, or want to ask me for an extra feature, don’t hesitate to contact me. This gem is fresh, and I hope people will like it!