When discussing the scalability of Web services there seems to be a tendency for some developers to overly focus on the framework being used, based on the mistaken assumption that smaller means faster, rather than considering the architecture of the application as a whole. We've seen several cases of developers making these assumptions before they start building their API, and either discounting Django as not being fast enough for their needs, or deciding to not use a Web API framework such as Django REST framework because they 'need something lightweight'.

I'm going to be making the case that Web API developers who need to build high performance APIs should be focusing on a few core fundamentals of the architecture, rather than concentrating on the raw performance of the Web framework or API framework they're using.

Using a mature, fully featured and well supported API framework gives you a huge number of benefits, and will allow you to concentrate your development on the things that really do matter, rather than expending energy reinventing the wheel.

Much of this article is Django/Python specific, but the general advice should be applicable to any language or framework.

Dispelling some myths

Before we get started, I'd like to counter a few pieces of advice that I believe are misguided, and that we've seen given to developers who are writing Web APIs that need to service thousands of requests per second.

Roll your own framework.

Please: don't do this. If you do nothing else other than use Django REST framework's plain APIView and ignore the generic views, serialization, routers and other advanced functionality, you'll still be giving yourself a big advantage over using plain Django for writing your API. Your service will be a better citizen of the Web.

Use a 'lightweight' framework.

Don't conflate a framework being fully-featured with being tightly coupled, monolithic or slow. REST framework includes a huge amount of functionality out of the box, but the core view class is really very simple.

REST framework is tied to Django models.

Nope. There's a default set of generic views that you can easily use with Django models, and default serializer subclasses that work nicely with Django models, but those pieces of REST framework are entirely optional, and there's absolutely no tight coupling to Django's ORM.

Django/Python/REST framework is too slow.

As this article will argue, the biggest performance gains for Web APIs can be made not by code tweaking, but by proper caching of database lookups, well designed HTTP caching, and running behind a shared server-side cache if possible.

Profiling our views

We're going to take a broad look at profiling a simple Django REST framework API in order to see how it compares against using plain Django views, and get some idea of where the biggest performance gains can be had.

The profiling approach used here is absolutely not intended as a comprehensive performance metric, but rather to give a high level overview of the relative performance of various components in the API. The benchmarks were performed on my laptop, using Django's development server and a local PostgreSQL database.

For our test case we're going to be profiling a simple API list view.

```python
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('id', 'username', 'email')


class UserListView(APIView):
    def get(self, request):
        users = User.objects.all()
        serializer = UserSerializer(users, many=True)
        return Response(serializer.data)
```

There are various components that we're interested in measuring.

* Database lookup. Here we're timing everything from the Django ORM down to the raw database access. In order to time this independently of the serialization we'll wrap the queryset call in a list() in order to force it to evaluate.
* Django request/response cycle. Anything that takes place before or after the view method runs. This includes the default middleware, the request routing and the other core mechanics that take place on each request. We can time this by hooking into the request_started and request_finished signals.
* Serialization. The time it takes to serialize model instances into simple native Python data structures. We can time this as everything that takes place in the serializer instantiation and .data access.
* View code. Anything that runs once an APIView has been called. This includes the mechanics of REST framework's authentication, permissions, throttling, content negotiation and request/response handling.
* Response rendering. REST framework's Response object is a type of TemplateResponse, which means that the rendering process takes place after the response has been returned by the view. In order to time this we can wrap APIView.dispatch in a superclass that forces the response to render before returning it.

Rather than use Python's profiling module we're going to keep things simple and wrap the relevant parts of the code inside timing blocks. It's rough and ready (and it goes without saying that you wouldn't use this approach to benchmark a "real" application), but it'll do the job.
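A reusable timing block can be sketched with a small context manager. This `timed` helper is hypothetical (it isn't part of the code in this article), but it captures the same wrap-and-record idea:

```python
import time
from contextlib import contextmanager

# Collected timings, keyed by label.
timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the block under `label`."""
    start = time.time()
    try:
        yield
    finally:
        timings[label] = time.time() - start

# Usage: wrap the section of interest.
with timed('db'):
    users = list(range(1000))  # stand-in for list(User.objects.all())
```

The `try`/`finally` ensures the timing is recorded even if the wrapped code raises.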

Timing the database lookup and serialization involves modifying the .get() method on the view slightly:

```python
def get(self, request):
    global serializer_time
    global db_time

    db_start = time.time()
    users = list(User.objects.all())
    db_time = time.time() - db_start

    serializer_start = time.time()
    serializer = UserSerializer(users, many=True)
    data = serializer.data
    serializer_time = time.time() - serializer_start

    return Response(data)
```

In order to time everything else that happens inside REST framework we need to override the .dispatch() method that is called as soon as the view is entered. This allows us to time the mechanics of the APIView class, as well as the response rendering.

```python
def dispatch(self, request, *args, **kwargs):
    global dispatch_time
    global render_time

    dispatch_start = time.time()
    ret = super(WebAPIView, self).dispatch(request, *args, **kwargs)

    render_start = time.time()
    ret.render()
    render_time = time.time() - render_start

    dispatch_time = time.time() - dispatch_start

    return ret
```

Finally we measure the rest of the request/response cycle by hooking into Django's request_started and request_finished signals.

```python
def started(sender, **kwargs):
    global started
    started = time.time()


def finished(sender, **kwargs):
    total = time.time() - started
    api_view_time = dispatch_time - (render_time + serializer_time + db_time)
    request_response_time = total - dispatch_time

    print("Database lookup          | %.4fs" % db_time)
    print("Serialization            | %.4fs" % serializer_time)
    print("Django request/response  | %.4fs" % request_response_time)
    print("API view                 | %.4fs" % api_view_time)
    print("Response rendering       | %.4fs" % render_time)


request_started.connect(started)
request_finished.connect(finished)
```

We're now ready to run the timing tests, so we create ten users in the database. The API calls are made using curl, like so:

```bash
curl http://127.0.0.1:8000
```

We make the request a few times and take an average. We also discount the very first request, which is skewed, presumably by some Django routing code that only needs to run at the point of the initial request.
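The averaging step can be sketched as follows; the helper and the sample timings here are illustrative, not the article's actual measurements:

```python
def average_discarding_first(samples):
    """Average a list of request timings, dropping the skewed first request."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    warm = samples[1:]
    return sum(warm) / len(warm)

# e.g. five timings in seconds; the slow warm-up request is discarded
samples = [0.0450, 0.0140, 0.0135, 0.0138, 0.0135]
average = average_discarding_first(samples)
```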

Here's our first set of results.

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Database lookup         | 0.0090   | 65.7%      |
| Serialization           | 0.0025   | 18.2%      |
| Django request/response | 0.0015   | 10.9%      |
| API view                | 0.0005   | 3.6%       |
| Response rendering      | 0.0002   | 1.5%       |
| **Total**               | **0.0137** |          |

Removing serialization

Let's simplify things slightly. At the moment we're returning fully fledged model instances from the ORM, and then serializing them down into simple dictionary representations. In this case that's a little unnecessary - we can fetch simple representations directly from the ORM and return them as-is.

```python
class UserListView(APIView):
    def get(self, request):
        data = User.objects.values('id', 'username', 'email')
        return Response(data)
```

Serializers are really great for providing a standard interface for your output representations, dealing nicely with input validation, and easily handling cases such as hyperlinked representations for you, but in simple cases like this they're not always necessary.
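The kind of output that `values()` gives us can be sketched in plain Python. Here a namedtuple stands in for a model instance, and the helper function is purely illustrative:

```python
from collections import namedtuple

# Stand-in for a Django model instance.
User = namedtuple('User', ['id', 'username', 'email'])

def user_representation(user):
    """Build the same flat dict that .values('id', 'username', 'email') yields."""
    return {'id': user.id, 'username': user.username, 'email': user.email}

users = [
    User(1, 'tom', 'tom@example.com'),
    User(2, 'anna', 'anna@example.com'),
]
data = [user_representation(u) for u in users]
```

For simple read-only output like this, hand-shaping dicts avoids the serializer machinery entirely; the trade-off is losing validation and the standard interface a serializer provides.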

Once we've removed serialization, our timings look like this.

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Database lookup         | 0.0090   | 80.4%      |
| Django request/response | 0.0015   | 13.4%      |
| API view                | 0.0005   | 4.5%       |
| Response rendering      | 0.0002   | 1.8%       |
| **Total**               | **0.0112** |          |

It's an improvement (we've shaved almost 20% off the average total request time), but we haven't dealt with the biggest issue, which is the database lookup.

Cache lookups

The database lookup is still by far the most time consuming part of the system. In order to make any significant performance gains we need to ensure that the majority of lookups come from a cache, rather than performing a database read on each request.

Ignoring the cache population and expiry, our view now looks like this:

```python
class UserListView(APIView):
    def get(self, request):
        data = cache.get('users')
        return Response(data)
```
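Cache population itself is ignored above. A minimal cache-aside sketch looks like the following, with a plain dict standing in for Redis and a hypothetical `fetch_users` function in place of the database query:

```python
cache = {}  # stand-in for a Redis/Memcached client

def fetch_users():
    """Stand-in for the expensive database query."""
    return [{'id': 1, 'username': 'tom', 'email': 'tom@example.com'}]

def get_users():
    """Return users from the cache, populating it on a miss."""
    data = cache.get('users')
    if data is None:
        data = fetch_users()
        cache['users'] = data
    return data
```

The first call pays the database cost and fills the cache; subsequent calls are pure cache hits. A real implementation also needs an expiry or invalidation strategy so stale data doesn't live forever.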

We set up Redis as our cache backend, populate the cache, and re-run the timings.

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Django request/response | 0.0015   | 60.0%      |
| API view                | 0.0005   | 20.0%      |
| Redis lookup            | 0.0003   | 12.0%      |
| Response rendering      | 0.0002   | 8.0%       |
| **Total**               | **0.0025** |          |

As would be expected, the performance difference is huge. The average request time is over 80% less than our original version. Lookups from caches such as Redis and Memcached are incredibly fast, and we're now at the point where the majority of the time taken by the request isn't in view code at all, but instead is in Django's request/response cycle.

Slimming the view

The default settings for REST framework views include both session and basic authentication, plus renderers for both the browsable API and regular JSON.

If we really need to squeeze out every last bit of possible performance, we can drop some of the unneeded settings. We'll modify the view to bypass full content negotiation by using the IgnoreClientContentNegotiation class demonstrated in the documentation, remove the browsable API renderer, and turn off authentication and permissions on the view.

```python
class UserListView(APIView):
    permission_classes = []
    authentication_classes = []
    renderer_classes = [JSONRenderer]
    content_negotiation_class = IgnoreClientContentNegotiation

    def get(self, request):
        data = cache.get('users')
        return Response(data)
```

Note that we're losing some functionality at this point - we don't have the browsable API anymore, and we're assuming this view can be accessed publicly without any authentication or permissions.

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Django request/response | 0.0015   | 71.4%      |
| Redis lookup            | 0.0003   | 14.3%      |
| API view                | 0.0002   | 9.5%       |
| Response rendering      | 0.0001   | 4.8%       |
| **Total**               | **0.0021** |          |

That makes some tiny savings, but they're not really all that significant.

Dropping middleware

At this point the largest potential target for performance improvements isn't anything to do with REST framework, but rather Django's standard request/response cycle. It's likely that a significant amount of this time is spent processing the default middleware. Let's take the extreme approach, drop all middleware from the settings, and see how the view performs.
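For illustration, the extreme case amounts to emptying the middleware setting (shown here in the modern `MIDDLEWARE` style; you would almost never want this in a real project, since it disables sessions, CSRF protection and more):

```python
# settings.py -- drop all default middleware (illustrative only)
MIDDLEWARE = []
```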

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Django request/response | 0.0003   | 33.3%      |
| Redis lookup            | 0.0003   | 33.3%      |
| API view                | 0.0002   | 22.2%      |
| Response rendering      | 0.0001   | 11.1%      |
| **Total**               | **0.0009** |          |

In almost all cases it's unlikely that we'd get to the point of dropping out Django's default middleware in order to make performance improvements, but the point is that once you're using a really stripped down API view, that becomes the biggest potential point of savings.

Returning regular HttpResponses

If we still need a few more percentage points of performance, we can simply return a regular HttpResponse from our views, rather than a REST framework Response. That'll give us some very minor time savings, as the full response rendering process won't need to run. The standard JSON renderer also uses a custom encoder that properly handles various cases such as datetime formatting, which in this case we don't need.

```python
class UserListView(APIView):
    permission_classes = []
    authentication_classes = []
    renderer_classes = [JSONRenderer]
    content_negotiation_class = IgnoreClientContentNegotiation

    def get(self, request):
        data = cache.get('users')
        return HttpResponse(json.dumps(data),
                            content_type='application/json; charset=utf-8')
```
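The datetime handling mentioned above can be sketched with a small custom encoder. This is a minimal stand-in for what REST framework's JSON encoder does, assuming ISO 8601 output:

```python
import json
import datetime

class ISOEncoder(json.JSONEncoder):
    """Encode date/datetime objects as ISO 8601 strings."""
    def default(self, obj):
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        return super().default(obj)

payload = {'username': 'tom', 'joined': datetime.date(2014, 1, 1)}
body = json.dumps(payload, cls=ISOEncoder)
```

With plain `json.dumps` and no custom encoder, the same payload would raise a TypeError, which is why dropping the renderer only works when the data is already JSON-native.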

This gives us a tiny saving over the previous case.

| Action                  | Time (s) | Percentage |
|-------------------------|----------|------------|
| Django request/response | 0.0003   | 37.5%      |
| Redis lookup            | 0.0003   | 37.5%      |
| API view                | 0.0002   | 25%        |
| **Total**               | **0.0008** |          |

The final step would be to stop using REST framework altogether and drop down to a plain Django view. This is left as an exercise for the reader, but hopefully by now it should be clear that we're talking about tiny performance gains that are insignificant in all but the most extreme use-cases.

Comparing our results

Here's the full set of results from each of our different cases:

| Case                          | Total time (s) |
|-------------------------------|----------------|
| Initial view                  | 0.0137         |
| Without serialization         | 0.0112         |
| With Redis cache lookup       | 0.0025         |
| Slimmed-down view             | 0.0021         |
| Without middleware            | 0.0009         |
| Plain HttpResponse            | 0.0008         |

The areas in pink/red tones indicate bits of REST framework, the blue is Django, and the green is the database or Redis lookup.