Inefficient database querying is one of the most common performance pitfalls in Django. N+1 queries in particular can degrade your application's performance early on. They occur when you select records from an associated table using an individual query for each record rather than grabbing all of the records in a single query. Such inefficiencies are unfortunately quite easy to introduce with the Django ORM. That said, they are also something you can quickly uncover and prevent via automated testing.

This post looks at how to:

Test the number of queries executed by a request, along with the duration of those queries

Prevent N+1 queries using the nplusone package

N+1 Queries

The sample app that we'll be working with throughout this post can be found on GitHub.

Say, for example, you're working with a Django application that has the following models:

```python
# courses/models.py
from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Course(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    def __str__(self):
        return self.title
```

Now, if you're tasked with creating a new view for returning a JSON response of all courses with the title and author name, you could write the following code:

```python
# courses/views.py
from django.http import JsonResponse

from courses.models import Course


def all_courses(request):
    queryset = Course.objects.all()
    courses = []
    for course in queryset:
        courses.append(
            {"title": course.title, "author": course.author.name}
        )
    return JsonResponse(courses, safe=False)
```

This code will work, but it's very inefficient since it will make far too many database queries:

1 query for obtaining all the courses

N queries for obtaining the author in each iteration

Before addressing this, let's look at just how many queries are made and measure the execution time.

Metrics Middleware

You'll notice that the project includes custom middleware that calculates and logs the execution time of each request:

```python
# core/middleware.py
import logging
import time

from django.db import connection, reset_queries


def metric_middleware(get_response):
    def middleware(request):
        reset_queries()

        # Get beginning stats
        start_queries = len(connection.queries)
        start_time = time.perf_counter()

        # Process the request
        response = get_response(request)

        # Get ending stats
        end_time = time.perf_counter()
        end_queries = len(connection.queries)

        # Calculate stats
        total_time = end_time - start_time
        total_queries = end_queries - start_queries

        # Log the results
        logger = logging.getLogger("debug")
        logger.debug(f"Request: {request.method} {request.path}")
        logger.debug(f"Number of Queries: {total_queries}")
        logger.debug(f"Total time: {(total_time):.2f}s")

        return response

    return middleware
```
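Django calls `metric_middleware` once at startup and the inner `middleware` closure once per request. The same closure pattern can be sketched in isolation, with no Django at all (the lambda below is a hypothetical stand-in for the view and the rest of the middleware chain):

```python
import time


def metric_middleware(get_response):
    # Runs once at startup: receives the next handler in the chain
    def middleware(request):
        # Runs once per request: wraps the downstream handler with timing
        start = time.perf_counter()
        response = get_response(request)
        elapsed = time.perf_counter() - start
        print(f"{request} handled in {elapsed:.4f}s")
        return response

    return middleware


# A lambda stands in for the view (or the rest of the middleware chain)
handler = metric_middleware(lambda request: f"200 OK for {request}")
result = handler("GET /courses/")
```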

Run the database seed command to add 10 authors and 100 courses to the database:

```sh
$ python manage.py seed_db
```

With the Django development server up and running, navigate to http://localhost:8000/courses/ in your browser. You should see the JSON response. Back in your terminal, take note of the metrics:

```
Request: GET /courses/
Number of Queries: 101
Total time: 0.10s
```

That's a lot of queries! This is very inefficient: each additional course will require an additional database query, so performance will continue to degrade as the database grows. Fortunately, the fix is quite simple: Use the select_related method to create a SQL JOIN, which includes the authors in the initial database query.

```python
queryset = Course.objects.select_related("author").all()
```

Before making any code changes, let's first start with some tests.

Performance Tests

Start with the following test, which uses the django_assert_num_queries pytest fixture (from pytest-django) to ensure that the database is hit only once when one or more author and course records are present in the database:

```python
import json

import pytest
from django.test import override_settings
from faker import Faker

from courses.models import Author, Course


@pytest.mark.django_db
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
    fake = Faker()

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()

    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    with django_assert_num_queries(1):
        res = client.get("/courses/")
        data = json.loads(res.content)

        assert res.status_code == 200
        assert len(data) == 1

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()

    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    res = client.get("/courses/")
    data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 2
```

Not using pytest? Use the assertNumQueries test method in place of django_assert_num_queries.

What's more, we can also use nplusone to prevent the introduction of future N+1 queries. After installing the package and adding it to the settings file, you can add it to your tests with the @override_settings decorator:

```python
...

@pytest.mark.django_db
@override_settings(NPLUSONE_RAISE=True)
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
    ...
```

Or if you'd like to automatically enable nplusone across the entire test suite, add the following to your test root conftest.py file:

```python
from django.conf import settings


def pytest_configure(config):
    settings.NPLUSONE_RAISE = True
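Both the decorator and the conftest.py approach assume nplusone has been wired into the project's settings. That wiring typically looks like the sketch below (app, middleware, and setting names are taken from the nplusone README; confirm them against the version you install):

```python
# settings.py (sketch)
import logging

INSTALLED_APPS = [
    # ...
    "nplusone.ext.django",
]

MIDDLEWARE = [
    "nplusone.ext.django.NPlusOneMiddleware",
    # ...
]

# Log potential N+1 queries during development...
NPLUSONE_LOGGER = logging.getLogger("nplusone")
NPLUSONE_LOG_LEVEL = logging.WARN

# ...or raise an error instead, which is what NPLUSONE_RAISE = True
# enables in the test configurations shown above
```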

Conclusion

This post looked at how to prevent N+1 queries automatically in your codebase using the nplusone package and test the number of queries executed using the django_assert_num_queries Pytest fixture.

This should help prevent performance bottlenecks as your application grows and takes on more users. If you're adding this to an existing codebase, you'll probably need to spend some time fixing broken queries so that they make a constant number of database hits. If you're still running into performance issues after the fixes and optimizations, you may need to add additional caching layers, denormalize parts of the database, or configure database indexes.