GraphQL puts the user in control of their own destiny. Yes, they are confined to your schema, but beyond that they can access the data in any which way. Will they ask only for the "events", or also for the "category" of each event? We don't really know! In REST based APIs we know ahead of time what will be rendered, and can plan ahead by generating the required data efficiently, often by eager-loading the data we know we'll need.

In this article, we will discuss what N+1 queries are, how they are easily produced in GraphQL, and how to solve them using the graphql-batch gem along with a few custom batch loaders.

The source code for this article is available on GitHub.

What are N+1 queries?

N+1 queries can occur when you have one-to-many relationships in your models. Each Event belongs to a Category . Let's say that you find the last five events and you want to get the category name for each of them.

Event . last ( 5 ) . each { | event | puts event . category . name }

Seems simple enough! We unfortunately just produced six queries. The first query to find the events, and another query to find each category's name. This is an easy problem to solve in Rails by using eager-loading:

Event . includes ( :category ) . last ( 5 ) . each { | event | puts event . category . name }

By using the includes method we've been able to knock our queries down from six to two: The first to find the events, and the second to find the categories for those events.

N+1 queries in GraphQL

As we mentioned earlier, in GraphQL, the user is in charge of their own destiny. They may or may not ask for the category name for each event. The query below will produce N+1 SQL queries as it finds the category for each event:

{ events { id name category { id name } } }

Optimizing GraphQL queries

Yes, we could solve the N+1 query in the previous example by eager-loading the category relationship, but if the user didn't actually want the category, why load it? We don't know what the user will ask for. There just so happens to be a better way, by lazy-loading data only as its needed using the graphql-batch gem.

Marc-André Giroux has written a very thorough article about the GraphQL Dataloader Pattern which I highly recommend reading before continuing.

Batch loading single records

The simplest case for batch loading data is the example of each event belonging to a category. Inside of our EventType class, there is a field called category which allows the user to access the category of an event.

class Types : : EventType < Types : : BaseObject field :category , Types : : CategoryType , null : false def category RecordLoader . for ( Category ) . load ( object . category_id ) end end

By using the RecordLoader class to load the category, we actually avoid loading the category right away, and instead load all of the required categories with a single query. The query it ends up producing may end up looking like:

SELECT "categories" . * FROM "categories" WHERE "categories" . "id" IN ( $ 1 , $ 2 , $ 3 , $ 4 , $ 5 )

Looking at the RecordLoader class we can see how it works. The perform method will receive all of the ids for a single model (Category in this case), load the records in a single SQL query, and then call the fulfill method for each of them. The fulfill method resolves the promise, which is basically like putting a face to a name... you gave me an ID, and I've fulfilled my promise to provide you with the corresponding record.

class RecordLoader < GraphQL : : Batch : : Loader def initialize ( model ) @model = model end def perform ( ids ) @model . where ( id : ids ) . each { | record | fulfill ( record . id , record ) } ids . each { | id | fulfill ( id , nil ) unless fulfilled ? ( id ) } end end

We can write a test for this class to ensure that it finds the records correctly, keeping in mind that in order for the lazy/promise based code to function correctly it needs to be wrapped inside something called an executor .

describe RecordLoader do it 'loads' do event = create ( :event ) result = GraphQL : : Batch . batch do RecordLoader . for ( Event ) . load ( event . id ) end expect ( result ) . to eq ( event ) end end

Batch loading many records

We've covered the case where we are batch loading a single record at a time, but how do we handle the reverse scenario? We are displaying categories along with the first five events for each category, which would also produce an N+1 query, so let's see how we can solve it using a batch loader. The query we're discussing would look something like this:

{ categories { id name events ( first : 5 ) { id name } } }

I have created a custom loader called ForeignKeyLoader for this purpose. It will load the events using the foreign key category_id . I also added the ability to pass a lambda to merge in additional scopes into the query that will be run.

class Types : : CategoryType < Types : : BaseObject field :events , [ Types : : EventType ] , null : false do argument :first , Int , required : false , default_value : 5 end def events ( first : ) ForeignKeyLoader . for ( Event , :category_id , merge : - > { order ( id : :asc ) } ) . load ( object . id ) . then do | records | records . first ( first ) end end end

The query that gets produced looks something like:

SELECT "events" . * FROM "events" WHERE "events" . "category_id" IN ( $ 1 , $ 2 , $ 3 , $ 4 , $ 5 ) ORDER BY "events" . "id" ASC

Notice in this case that we call the then method to execute some code after the promise has been resolved. Here we see the first issue with this method... we only wanted five events for each category, but our query will load ALL events for each category, and then, using the first method on the resulting Array, narrow it down to only the first five events. If there are thousands of events, we could run into some serious issues.

class ForeignKeyLoader < GraphQL : : Batch : : Loader attr_reader :model , :foreign_key , :merge def self . loader_key_for ( * group_args ) [ self ] . concat ( group_args . slice ( 0 , 2 ) ) end def initialize ( model , foreign_key , merge : nil ) @model = model @foreign_key = foreign_key @merge = merge end def perform ( foreign_ids ) scope = model . where ( foreign_key = > foreign_ids ) scope = scope . merge ( merge ) if merge . present ? records = scope . to_a foreign_ids . each do | foreign_id | matching_records = records . select do | r | foreign_id == r . send ( foreign_key ) end fulfill ( foreign_id , matching_records ) end end end

Batch loading many records more efficiently

It turns out that there is a way to perform a query that says "find me the first N records for each X" (find me the first 5 records for each category), and that involves using Postgres Window Functions. While researching this concept, this article about window functions was useful along with this article about bringing window functions into Rails.

The following query produces the data that we want... we just need to figure out how to write a batch loader that generates the same result.

SELECT "events" . * FROM ( SELECT * , row_number ( ) OVER ( PARTITION BY category_id ORDER BY start_time desc ) as rank FROM "events" WHERE "events" . "category_id" IN ( 1 , 2 , 3 , 4 , 5 ) ) as events WHERE rank <= 5

For this we'll create a batch loader called WindowKeyLoader which is used like:

class Types : : CategoryType < Types : : BaseObject field :events , [ Types : : EventType ] , null : false do argument :first , Int , required : false , default_value : 5 end def events ( first : ) WindowKeyLoader . for ( Event , :category_id , limit : first , order_col : :start_time , order_dir : :desc ) . load ( object . id ) end end

You can see the difference already. I am no longer required to slice the first N array elements in the then block of the resolved promise. The actual batch loader class looks like:

class WindowKeyLoader < GraphQL : : Batch : : Loader attr_reader :model , :foreign_key , :limit , :order_col , :order_dir def initialize ( model , foreign_key , limit : , order_col : :id , order_dir : :asc ) @model = model @foreign_key = foreign_key @limit = limit @order_col = order_col @order_dir = order_dir end def perform ( foreign_ids ) ranked_from = model . select ( " * , row_number ( ) OVER ( PARTITION BY ) as rank" ) . where ( foreign_key = > foreign_ids ) . to_sql records = model . from ( "( #{ ranked_from } ) as #{ model . table_name } " ) . where ( "rank <= #{ limit } " ) . to_a foreign_ids . each do | foreign_id | matching_records = records . select do | r | foreign_id == r . send ( foreign_key ) end fulfill ( foreign_id , matching_records ) end end end

We're able to test the WindowKeyLoader by creating three events for a category but only asking for the first two of them:

describe WindowKeyLoader do it 'loads' do category = create ( :category ) events = ( 1. .3 ) . to_a . map do | n | create ( :event , name : "Event #{ n } " , category : category ) end result = GraphQL : : Batch . batch do WindowKeyLoader . for ( Event , :category_id , limit : 2 , order_col : :id , order_dir : :asc ) . load ( category . id ) end expect ( result ) . to eq ( events . first ( 2 ) ) end end

Batch loading active storage attachments

You may run into situations where you're loading polymorphic data, or other types of relationships which don't exactly fit into the mold of your standard has-many or belongs-to relationships. One case is with ActiveStorage. In the code below we'll load an image URL for an event:

class Types : : EventType < Types : : BaseObject field :image , String , null : true def image AttachmentLoader . for ( :Event , :image ) . load ( object . id ) . then do | image | url_for ( image . variant ( { quality : 75 } ) ) end end end

This data is stored using a polymorphic relationship that loads an ActiveStorage::Attachment record, which then needs to load an ActiveStorage::Blob record in order to produce the image url. It ends up producing a 2N + 1 query... yikes! Our AttachmentLoader is able to completely optimize this field by cutting it down to just two queries to load as many images as you'd like.

class AttachmentLoader < GraphQL : : Batch : : Loader attr_reader :record_type , :attachment_name def initialize ( record_type , attachment_name ) @record_type = record_type @attachment_name = attachment_name end def perform ( record_ids ) ActiveStorage : : Attachment . includes ( :blob ) . where ( record_type : record_type , record_id : record_ids , name : attachment_name ) . each { | record | fulfill ( record . record_id , record ) } record_ids . each { | id | fulfill ( id , nil ) unless fulfilled ? ( id ) } end end

In this case we are taking advantage of eager-loading, because for each attachment we will need its corresponding blob record.

Conclusion

GraphQL can be as efficient as REST, but requires approaching optimizations from a different angle. Instead of upfront optimizations, we lazy-load data only when required, loading it in batches to avoid excess trips to the database. In this article, we covered techniques to load single records, multiple records, and records with different types of relationships, as is the case with Active Storage which has a polymorphic relationship.

Share this article: If you liked this article we'd appreciate it if you'd tweet it to your peers.

About the Author

Leigh Halliday is a guest author for the pganalyze blog. He is a developer based out of Canada who works at FlipGive as a full-stack developer. He writes about Ruby and React on his blog and publishes React tutorials on YouTube.