It’s day one at your new gig. You’re hopeful that setting up the development environment won’t take too long. How awesome would it be to get some code out to production on your first day?

You’re lucky if this is your first day at IFTTT. Using our Docker-based development environment, you can get set up in a snap. Having a complex codebase configured and ready to use in a few minutes feels amazing. What if I told you that setting up data in your development environment could be that easy too?

The Data Dilemma

Managing data across environments is a hard task, especially when your product relies on user-generated content. As your user base grows and your product becomes more complex, you start seeing a variety of new patterns in user data. Replicating these patterns in development becomes harder and harder.

A specific scenario on IFTTT is the addition of new Channels to our Developer Platform. Not every engineer owns all the new cool IoT devices or has signed up to all 235 services on IFTTT.

Here’s where the trouble starts. Engineers will have to find ways to simulate user data in their development environment, which will probably lead to some well-known bugs:

Some UI elements will start to look weird in development

Pages will look empty

Engineers will start overlooking some specific validation scenarios (in forms, URLs, usernames…)

The worst consequence of all of this: engineers will begin to want to test things in production. 😱

Fixing Data in a 😞 Rails Development Environment

Your development experience may already be looking pretty sad at this point. Good engineers will try to come up with ideas to fix these problems. Here are some of the most common fixes:

Test things in production (not good)

Build a staging environment

Rails database migrations

rake db:seed

Custom CSVs or Google Spreadsheets

The problem with these solutions is that none of them actually solves the problem of re-creating real-world data in any environment. For example, creating a staging environment is no guarantee that you’ll be generating good user data; it just moves the problem to a different layer.

To solve this problem we created Polo.

Sample Database Snapshots

Polo is the tool we use to generate snapshots of our production database and export them to .sql files that our developers can import into any environment. It is named after Marco Polo, the famous explorer: we use Polo to explore our data model and return a representation of what it finds along the way.

Here are a few examples from the Polo GitHub repository.

Given an ActiveRecord::Base class and a record_id:

Polo.explore(MyActiveRecordModel, record_id)

Polo will transform that record into a SQL INSERT statement:

INSERT INTO `my_table` (`id`, `name`) VALUES (1, 'Netto')
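To make the idea concrete, here is a minimal plain-Ruby sketch of turning one row's attributes into an INSERT statement. It is not Polo's implementation: the `quote_value` and `to_insert` helpers are hypothetical names invented for illustration, and the quoting rules are deliberately simplistic (numbers, NULL, and single-quoted strings only).

```ruby
# Hypothetical helpers, not part of Polo: render one row as a SQL INSERT.
def quote_value(value)
  case value
  when nil     then 'NULL'
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'" # escape single quotes by doubling them
  end
end

def to_insert(table, attributes)
  columns = attributes.keys.map { |name| "`#{name}`" }.join(', ')
  values  = attributes.values.map { |value| quote_value(value) }.join(', ')
  "INSERT INTO `#{table}` (#{columns}) VALUES (#{values})"
end

puts to_insert('my_table', { id: 1, name: 'Netto' })
# prints: INSERT INTO `my_table` (`id`, `name`) VALUES (1, 'Netto')
```

A real tool would presumably lean on the database adapter for quoting rather than hand-rolling it as this sketch does.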

The real benefits start to show when you teach your data model to Polo, allowing the library to navigate your database’s dependency tree. Fear not, this is simpler than it sounds.

Given the following data model represented by ActiveRecord::Associations:

class Chef < ActiveRecord::Base
  has_many :recipes
  has_many :ingredients, through: :recipes
end

class Recipe < ActiveRecord::Base
  has_many :recipes_ingredients
  has_many :ingredients, through: :recipes_ingredients
end

class Ingredient < ActiveRecord::Base
end

class RecipesIngredient < ActiveRecord::Base
  belongs_to :recipe
  belongs_to :ingredient
end

You can tell Polo to export an ActiveRecord record’s dependencies:

inserts = Polo.explore(Chef, 1, :recipes)

This will return:

INSERT INTO `chefs` (`id`, `name`) VALUES (1, 'Netto')
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (2, 'Cheese Burger', NULL, 1)

Or even tell it to load complex ActiveRecord associations:

Polo.explore(Chef, 1, :recipes => :ingredients)

And get back INSERTs for every object Polo visited on its journey:

...
INSERT INTO `recipes` (`id`, `title`, `num_steps`, `chef_id`) VALUES (1, 'Turkey Sandwich', NULL, 1)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (1, 1, 1)
INSERT INTO `recipes_ingredients` (`id`, `recipe_id`, `ingredient_id`) VALUES (4, 2, 4)
...
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (1, 'Turkey', 'a lot')
...
INSERT INTO `ingredients` (`id`, `name`, `quantity`) VALUES (4, 'Cheese', '2 slices')

As long as you stick to this beautiful API we borrowed from ActiveRecord::Associations::Preloader, you should be able to define complex object associations and get back INSERTs for every record involved.
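If the nested association syntax feels abstract, the following plain-Ruby sketch shows one way a spec such as `:recipes => :ingredients` can be walked. It is an illustration under stated assumptions, not how Polo or ActiveRecord works internally: `Row` and `collect` are hypothetical names, the rows live in memory instead of a database, and the `recipes_ingredients` join rows are left out to keep the example short.

```ruby
# In-memory stand-ins for database rows (hypothetical, for illustration only).
Row = Struct.new(:table, :attributes, :associations)

# Hypothetical walker: returns `row` plus every row reachable through `spec`.
# `spec` may be a Symbol, an Array of specs, or a Hash of { association => nested spec }.
def collect(row, spec, seen = [])
  seen << row unless seen.any? { |visited| visited.equal?(row) }
  case spec
  when Symbol
    row.associations.fetch(spec, []).each { |child| collect(child, nil, seen) }
  when Array
    spec.each { |item| collect(row, item, seen) }
  when Hash
    spec.each do |name, nested|
      row.associations.fetch(name, []).each { |child| collect(child, nested, seen) }
    end
  end
  seen
end

turkey   = Row.new('ingredients', { id: 1, name: 'Turkey' }, {})
cheese   = Row.new('ingredients', { id: 4, name: 'Cheese' }, {})
sandwich = Row.new('recipes', { id: 1, title: 'Turkey Sandwich', chef_id: 1 }, { ingredients: [turkey] })
burger   = Row.new('recipes', { id: 2, title: 'Cheese Burger', chef_id: 1 }, { ingredients: [cheese] })
chef     = Row.new('chefs', { id: 1, name: 'Netto' }, { recipes: [sandwich, burger] })

visited = collect(chef, { recipes: :ingredients })
puts visited.map { |row| "#{row.table}:#{row.attributes[:id]}" }
# prints chefs:1, recipes:1, ingredients:1, recipes:2, ingredients:4 (one per line)
```

Each visited row would then be rendered as one INSERT, which is why a single seed record can pull in a whole slice of related data.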

Polo is Awesome!

We were pleasantly surprised when we discovered that we could cover a wide range of scenarios and generate a solid database snapshot with just 5 of our most active users as seeds. YMMV of course, but finding the right seeds and dependency graph for Polo shouldn’t be hard.

Our developers are much happier now, and everyone starts with nice looking data in development on their first day at IFTTT. Frontend Engineers can build great user experiences with much more confidence these days, and Backend Engineers find it easier to deal with complex data tasks.

IFTTT is always looking for talented engineers who are passionate about creating fantastic experiences for users and developers alike. Check out our jobs page at IFTTT if you’re interested in working on super cool tools like Polo.

You can read about some more advanced features of Polo, like obfuscation and blacklisting of attributes, on GitHub.
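For a rough feel of what obfuscation means here, this plain-Ruby sketch scrambles the characters of sensitive string fields before a row would be exported. It is an assumption-laden illustration, not Polo's implementation or configuration API: the `obfuscate` helper and the sample `user` hash are hypothetical, and the Polo repository remains the reference for the real options.

```ruby
# Hypothetical helper, not Polo's implementation: shuffles the characters of
# sensitive string fields so exported rows keep their shape but not their values.
def obfuscate(attributes, sensitive_fields)
  attributes.to_h do |name, value|
    scramble = sensitive_fields.include?(name) && value.is_a?(String)
    [name, scramble ? value.chars.shuffle.join : value]
  end
end

user   = { id: 7, name: 'Netto', email: 'netto@example.com' }
masked = obfuscate(user, [:email])

puts masked[:name]         # unchanged, because :name is not listed as sensitive
puts masked[:email].length # same length as the original email
```

Shuffling is only one possible strategy; replacing values with generated placeholders would work just as well for development data.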