The n+1 query problem is one of the most common scalability bottlenecks. If you’re comfortable with Rails and Active Model Serializers, and already have a good idea of what our problem is going to be, then maybe you can jump straight into the code here.

Say you’re fetching an array of Post objects at a GET endpoint and you also want to load the respective authors of the posts, embedding an author object within each of the post objects. Here’s a naive way of doing it:

```ruby
class PostsController < ApplicationController
  def index
    posts = Post.all
    render json: posts
  end
end

class Post < ApplicationRecord
  belongs_to :author, class_name: 'User'
end

class PostSerializer < ActiveModel::Serializer
  attributes :id, :title, :details
  belongs_to :author
end
```

For each of the n Post objects being rendered, a separate query runs to fetch the corresponding User object, so we run a total of n+1 queries. This is disastrous at scale. Here’s how you fix it, by eager loading the User objects:

```ruby
class PostsController < ApplicationController
  def index
    # Eager loads the authors from the users table up front
    # (a second query with an IN clause) instead of one query per post.
    posts = Post.includes(:author).all
    render json: posts
  end
end
```

Until now there’s absolutely nothing new for veterans.

But let’s complicate this. Let’s assume that the site’s users are not being stored in the same RDBMS as the posts are, rather, the users are documents stored in MongoDB (for whatever reason). How do we modify our Post serializer to fetch the user now, optimally? This would be going back to square one:

```ruby
class PostSerializer < ActiveModel::Serializer
  attributes :id, :title, :details, :author

  # Will run n Mongo queries for n posts being rendered.
  def author
    User.find(object.author_id)
  end
end

# This is now a Mongoid document, not an ActiveRecord model.
class User
  include Mongoid::Document
  include Mongoid::Timestamps
  # ...
end
```

The fact that our users now reside in a Mongo database could just as well be replaced with, say, fetching the users from a third-party HTTP service, or storing them in a completely different RDBMS. The essential problem remains the same: there’s no way to ‘join’ the users datastore with the posts table and get the response we want in a single query.

Of course, we can do better. We can fetch the entire response in two queries:

1. Fetch all the posts without the author attribute (1 SQL query).

2. Fetch all the corresponding authors by running a where-in query with the user IDs plucked from the array of posts (1 Mongo query with an IN clause).

```ruby
posts = Post.all
author_ids = posts.pluck(:author_id)
authors = User.where(:_id.in => author_ids)

# Somehow pass the author objects to the post serializer and
# map them to the correct post objects. Can't imagine what
# exactly that would look like, but probably not pretty.
render json: posts, pass_some_parameter_maybe: authors
```
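To see what that mapping step actually involves, here’s a plain-Ruby sketch of the in-memory ‘join’, with Structs standing in for the ActiveRecord Post and the Mongoid User (all names here are illustrative, not part of the app above):

```ruby
# Stand-ins for the real models; the two arrays below play the role
# of the two datastore queries (one SQL, one Mongo).
PostRow = Struct.new(:id, :title, :author_id)
UserDoc = Struct.new(:_id, :name)

posts   = [PostRow.new(1, 'First', 'u1'), PostRow.new(2, 'Second', 'u2')]
authors = [UserDoc.new('u1', 'Ada'), UserDoc.new('u2', 'Grace')]

# The in-memory 'join': index authors by id, then embed one per post.
authors_by_id = authors.each_with_object({}) { |u, h| h[u._id] = u }

payload = posts.map do |post|
  { id: post.id, title: post.title, author: authors_by_id[post.author_id]&.name }
end
# payload.first => { id: 1, title: 'First', author: 'Ada' }
```

The lookup hash keeps the mapping O(n) instead of scanning the authors array once per post, but threading this glue code through a serializer by hand is exactly the mess we want to avoid.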

So our original optimization problem has been reduced to “how do we make this code readable and maintainable?”. The folks at Universe have come up with an absolute gem (too obvious?): Batch Loader has been incredibly helpful to me recently.

```ruby
# Gemfile
gem 'batch-loader'
```

```shell
bundle install
```

```ruby
class PostSerializer < ActiveModel::Serializer
  attributes :id, :title, :details, :author

  def author
    object.get_author_lazily
  end
end

class Post
  def get_author_lazily
    # The current post object is added to the batch here,
    # which is eventually processed when the block executes.
    BatchLoader.for(self).batch do |posts, batch_loader|
      author_ids = posts.pluck(:author_id)
      User.where(:_id.in => author_ids).each do |user|
        post = posts.detect { |p| p.author_id == user._id.to_s }
        # 'Assign' the user object to the right post.
        batch_loader.call(post, user)
      end
    end
  end
end
```

If you’re familiar with JavaScript Promises, think of the get_author_lazily method as returning a Promise that is evaluated later. That’s a decent analogy, I think, since BatchLoader relies on lazy Ruby objects.
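The core trick, stripped of the gem, is small enough to sketch by hand. This is not the real batch-loader API, just a hand-rolled illustration of the same idea: registering items eagerly and running the batch block once, on first demand:

```ruby
# A minimal lazy batcher: `for` registers an item and returns a lazy
# handle (the Promise of the analogy); the batch block runs only once,
# when the first handle is forced.
class TinyBatch
  def initialize(&batch_block)
    @batch_block = batch_block
    @items = []
    @results = nil
  end

  def for(item)
    @items << item
    -> { resolve; @results[item] }  # lazy handle
  end

  private

  def resolve
    return if @results
    @results = {}
    @batch_block.call(@items, ->(item, value) { @results[item] = value })
  end
end

authors = { 1 => 'Ada', 2 => 'Grace' }
queries = 0

batch = TinyBatch.new do |ids, loader|
  queries += 1  # every registered id is resolved in this one "query"
  ids.each { |id| loader.call(id, authors[id]) }
end

handles = [1, 2].map { |id| batch.for(id) }
values  = handles.map(&:call)
# values == ['Ada', 'Grace'], and queries == 1
```

The real gem does considerably more (per-request caching, replacing the lazy object in place), but the shape is the same: collect first, resolve once.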

One caveat: BatchLoader caches the loaded values, so to keep the responses up-to-date you should add this to your config/application.rb :

```ruby
# config/application.rb
config.middleware.use BatchLoader::Middleware
```

That’s basically it! We’ve solved an advanced version of the n+1 queries problem while keeping our code clean and using Active Model Serializers the right way.

One problem, though. If you have a User serializer (Active Model Serializers works with Mongoid as well), it won’t be invoked for the lazily loaded author objects, unlike before. To fix this, we can use a Ruby block and serialize the author objects before they’re ‘assigned’ to the posts.

```ruby
class PostSerializer < ActiveModel::Serializer
  attributes :id, :title, :details, :author

  def author
    object.get_author_lazily do |author|
      # Serialize the author after it has been loaded.
      ActiveModelSerializers::SerializableResource.new(author)
        .as_json[:user]
    end
  end
end

class Post
  def get_author_lazily
    # The current post object is added to the batch here,
    # which is eventually processed when the block executes.
    BatchLoader.for(self).batch do |posts, batch_loader|
      author_ids = posts.pluck(:author_id)
      User.where(:_id.in => author_ids).each do |user|
        modified_user = block_given? ? yield(user) : user
        post = posts.detect { |p| p.author_id == user._id.to_s }
        # 'Assign' the user object to the right post.
        batch_loader.call(post, modified_user)
      end
    end
  end
end
```
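The block_given? / yield mechanics used above work even though the yield sits inside another block, because yield is always bound to the enclosing method’s block. In isolation, with a fake fetch standing in for the Mongo query (names illustrative):

```ruby
# A loader that optionally pipes each loaded value through the
# caller's block before returning it -- the same pattern as
# get_author_lazily above, minus BatchLoader.
def load_users(ids)
  users = ids.map { |id| { _id: id, name: "user-#{id}" } }  # fake fetch
  users.map { |u| block_given? ? yield(u) : u }
end

raw        = load_users([1, 2])
serialized = load_users([1, 2]) { |u| { user: u[:name] } }
# raw.first        => { _id: 1, name: "user-1" }
# serialized.first => { user: "user-1" }
```

With no block, the raw values pass through untouched; with a block, every value is transformed at load time, which is exactly when we want serialization to happen.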

Here’s the entire code. Enjoy!