Photo by Samuel Zeller on Unsplash

The need for multi-tenant applications is not new, but with the rising popularity of Software-as-a-Service (SaaS) offerings, you are building something wrong if you don’t consider it a core requirement. Multi-tenancy is a software architecture in which a single instance of an application serves multiple customers, each of which is called a tenant. A customer could be a whole corporation, a division of a business, or Todd the developer.

Photo by rawpixel on Unsplash

There are many approaches to solving this problem. The most common focus area is the data stored for a tenant. Approaches range from a database per customer to shared tables with a column specifying the customer. Using a database or table naming convention (e.g. a customer prefix) lets developers keep thinking in terms of a single customer and also segregates each customer's data from the rest to some extent. The column-based approach basically says: add a WHERE clause to everything. Each approach has its drawbacks: database creation can take too long, naming conventions can add code complexity, and the column-based approach can hit performance issues quickly because all the customer data is forced together.
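To make the column-based approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The invoice table, its columns, and the customer names are all hypothetical and only serve to illustrate the pattern; every query has to carry the customer filter.

```python
import sqlite3

# One shared table for every customer; the "customer" column is the only
# thing separating one tenant's rows from another's.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (customer TEXT NOT NULL, amount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO invoice (customer, amount) VALUES (?, ?)",
    [("acme", 100), ("acme", 250), ("todd", 40)],
)


def invoices_for(customer):
    """Return the invoice amounts for one customer, smallest first."""
    rows = conn.execute(
        # Forgetting this WHERE clause would leak other tenants' rows.
        "SELECT amount FROM invoice WHERE customer = ? ORDER BY amount",
        (customer,),
    )
    return [amount for (amount,) in rows]


print(invoices_for("acme"))  # [100, 250]
print(invoices_for("todd"))  # [40]
```

Every row for every customer lives in the same table, which is why a small tenant's queries can still be slowed down by a large tenant's data.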

Multi-tenancy and PostgreSQL

Photo by Matthew Spiteri on Unsplash

PostgreSQL is the most commonly used database for Django applications. Postgres has a feature, schemas, which captures the upsides of the database- and table-driven approaches while tackling some of their complexity and performance issues. A schema behaves much like a separate database but is faster to create. With schemas, the tables for each tenant can be identical, which reduces complexity, and they can reside in the same database as shared, tenant-independent data. This segregation also avoids the main drawback of the column-based approach, where query speed for a tenant with only a small amount of data is affected by every other tenant’s data. While schemas are a nice feature, they still don’t make multi-tenancy simple on their own.
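As a rough illustration of what working with schemas involves at the SQL level, the sketch below builds the two statements a tenant typically needs: one to create its schema and one to point the connection's search_path at it, followed by the shared schema. The helper name and the name validation rule are our own assumptions for this example; it only produces SQL strings and does not talk to a database.

```python
import re

# Conservative rule for this sketch: lowercase letters, digits, underscores.
_VALID_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def schema_statements(schema_name, shared_schema="public"):
    """Return the SQL to create a tenant schema and make it the active one."""
    for name in (schema_name, shared_schema):
        if not _VALID_NAME.match(name):
            raise ValueError(f"unsafe schema name: {name!r}")
    return [
        f'CREATE SCHEMA IF NOT EXISTS "{schema_name}"',
        # Tenant tables are resolved first, shared tables second.
        f'SET search_path TO "{schema_name}", "{shared_schema}"',
    ]


for statement in schema_statements("acme"):
    print(statement)
```

Because the same table names exist in every tenant schema, application queries stay identical across tenants; only the search_path changes.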

Luckily, the open source community has contributed libraries that make developing with PostgreSQL schemas simple. A variety of libraries are now available; our team used django-tenant-schemas. It folds nicely into the Django life cycle: schema creation triggers migrations to run, so you can keep the normal Django development process. The module also provides a context manager that makes it simple to select the schema you want to interact with.

Context Manager for Accessing Tenant Data
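The library ships helpers such as tenant_context and schema_context in tenant_schemas.utils for this. To show the idea without needing a database, here is a plain-Python sketch of the same pattern; FakeConnection and use_schema are hypothetical stand-ins written for this example, not the library's implementation.

```python
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a connection that tracks the active schema."""

    def __init__(self):
        self.schema_name = "public"


connection = FakeConnection()


@contextmanager
def use_schema(schema_name):
    """Switch to a schema for the duration of the block, then switch back."""
    previous = connection.schema_name
    connection.schema_name = schema_name
    try:
        yield connection
    finally:
        # Restore the previous schema even if the block raised.
        connection.schema_name = previous


with use_schema("acme"):
    print(connection.schema_name)  # acme
print(connection.schema_name)  # public
```

With the real library the block would look similar, for example wrapping ORM queries in a with tenant_context(tenant) block so that they run against that tenant's schema.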

Additionally, you can differentiate between shared data and tenant data simply by specifying configuration.

Django Tenant Schemas in Action

While the documentation for django-tenant-schemas is quite good, we did not use its standard implementation out of the box. The default implementation creates tenant schemas based on the domain URL; for example, think of each tenant having its own subdomain for accessing data (e.g. {tenant}.myapp.io), with the schema name derived from that pattern. While subdomains are a good pattern, they may not fit your needs; they didn’t match our requirements.

In our case, tenancy could be determined from a request header that carried account information. Luckily, this was a common enough pattern that we also found example documentation for tackling this flow. While examples are great, it may be helpful to see a working application with a bit more complexity, one with multiple shared apps and tenant apps.

Shared & Tenant-specific Schema Configuration
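A settings fragment along these lines separates shared apps from tenant apps. The setting names follow the django-tenant-schemas documentation; the customers and reports apps, the Customer model, and the database name are hypothetical placeholders for this sketch.

```python
# settings.py (fragment)

# Apps whose tables live in the shared "public" schema.
SHARED_APPS = [
    "tenant_schemas",  # should be listed before any Django app
    "customers",  # hypothetical app that holds the tenant model
    "django.contrib.contenttypes",
    "django.contrib.auth",
]

# Apps whose tables are created once per tenant schema.
TENANT_APPS = [
    "django.contrib.contenttypes",
    "reports",  # hypothetical tenant-specific app
]

# Django still needs one combined list, without duplicates.
INSTALLED_APPS = SHARED_APPS + [
    app for app in TENANT_APPS if app not in SHARED_APPS
]

TENANT_MODEL = "customers.Customer"  # "app_label.ModelName", hypothetical

DATABASE_ROUTERS = ["tenant_schemas.routers.TenantSyncRouter"]

DATABASES = {
    "default": {
        "ENGINE": "tenant_schemas.postgresql_backend",
        "NAME": "myapp",  # hypothetical database name
    }
}
```

An app listed in both groups, like contenttypes here, gets its tables in the shared schema and in each tenant schema.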

You can also dive into our tenant middleware to see our flow for not only creating tenants based on header information but also creating customer and user objects associated with the request.
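As a simplified sketch of the header-based idea (not our actual middleware), the helper below derives a schema name from account information in the request headers. The X-Account-Id header, the acct_ prefix, and the fallback to the public schema are all assumptions made for this example.

```python
import re

PUBLIC_SCHEMA = "public"
# Hypothetical header; Django exposes "X-Account-Id" under this META key.
ACCOUNT_HEADER = "HTTP_X_ACCOUNT_ID"


def schema_name_from_meta(meta):
    """Map the account header in a request.META-style dict to a schema name."""
    raw = meta.get(ACCOUNT_HEADER, "").strip()
    # Keep only characters that are safe in an unquoted schema identifier.
    cleaned = re.sub(r"[^a-z0-9]+", "_", raw.lower()).strip("_")
    if not cleaned:
        return PUBLIC_SCHEMA
    return f"acct_{cleaned}"


print(schema_name_from_meta({"HTTP_X_ACCOUNT_ID": "Acme-Corp 42"}))
print(schema_name_from_meta({}))
```

A custom tenant middleware could call a helper like this on each request, look up or create the matching tenant, and then activate its schema before the view runs.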

Deploying on OpenShift with Source-to-Image (S2I)

Now that we’ve explored the building blocks of a multi-tenant Django app, we must consider the implications for continuous deployment and data migrations as new features are implemented. If you are a seasoned Django developer, you are familiar with the migration flow. As mentioned earlier, django-tenant-schemas fits into the Django life cycle; for example, it supplies its own command, migrate_schemas, to apply migrations across schemas. With these pieces available, how do you best fit them into your continuous deployment strategy?

OpenShift provides a build and deployment mechanism called Source-to-Image (S2I) for marrying base images with code from source control (like GitHub). S2I supports a standard flow of assemble and run scripts, supplied by the base image, that take over once the source code has been downloaded. If you dive into these scripts, you can see how they prepare the container based on the provided source.

Schema Migration with s2i
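The actual change lives in the shell run script. As a rough Python sketch of the same step, the code below runs migrate_schemas before the application server starts. The function names are hypothetical, and the DISABLE_MIGRATE switch is an assumption meant to mirror the variable the S2I Python builder uses to skip migrations.

```python
import os
import subprocess
import sys


def startup_commands(env, manage_py="manage.py"):
    """Return the commands to run, in order, before the web server starts."""
    if env.get("DISABLE_MIGRATE"):
        # Assumed opt-out switch: skip migrations entirely.
        return []
    # migrate_schemas replaces the plain "migrate" command and applies
    # migrations to the shared schema and to every tenant schema.
    return [[sys.executable, manage_py, "migrate_schemas"]]


def run_startup(env=None, runner=subprocess.run):
    """Run each startup command, stopping on the first failure."""
    env = os.environ if env is None else env
    for argv in startup_commands(env):
        runner(argv, check=True)


# In a container entrypoint you would call run_startup() and then
# start the application server.
```

Failing fast with check=True means a broken migration stops the rollout instead of serving traffic against a half-migrated database.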

The stage of the run script that applies migrations was altered to call the migrate_schemas command. With this in place, you can migrate your database structure and add new tables in both your shared and tenant schemas within a continuous deployment setting.

Example of Tenant Schemas in Action

Synopsis

In this story, we highlighted the need to consider multi-tenancy in your software design, along with some common implementation patterns. In the Django development world, we found that PostgreSQL provides schemas, which allow a balanced approach in the context of the common patterns and their drawbacks. Even more importantly, the open source community has made using schemas simple with Django. Lastly, we considered the impact on migrations and continuous deployment with S2I on OpenShift and explored the necessary updates.