10 Reasons to love SQLAlchemy

Paul Johnston 4 Sept 2015

I've seen some bashing of ORMs of late, which I think is totally undeserved. I'm a big fan of SQLAlchemy; I use it in most of my projects, and have contributed code to the library. I'm going to explain 10 good reasons you should love SQLAlchemy too! In fairness, there are a number of good ORMs around, and most of these reasons apply to them as well. But SQLAlchemy is my favorite.

1) Define the database schema in application code

SQLAlchemy lets you define the database schema in Python code. Here is an example from an e-commerce site, representing an item within an order:

class OrderItem(Base): id = Column(Integer, primary_key=True) order = many_to_one('Order') product = many_to_one('Product') quantity = Column(Integer)

The Python code that defines your schema is called the model. Because these are Python classes, you can add your own methods. This keeps related functionality together, making maintenance easier.

class Order(Base): ... def update_stock(self): for item in self.items: item.product.stock -= item.quantity

It also means the schema can be kept in version control, with all the associated benefits: version tracking, tagging, blame, etc.

2) Automatically synchronise the model and schema

Alembic is an SQLAlchemy add-on for database management. When you make changes to the Python model, Alembic can automatically update the database schema. This makes minor changes, such as adding a table or column, quick and easy.

$ alembic upgrade head INFO [alembic.context] Context class PostgresqlContext. INFO [alembic.context] Will assume transactional DDL. INFO [alembic.context] Running upgrade None -> 1975ea83b712

While automatic synchronisation is convenient in a development environment, most people don't want so much magic being applied to a production database. Alembic can automatically generate change scripts, which a DBA can review before applying to production.

3) The Pythonic style makes your code a pleasure to read

SQLAlchemy presents database relations in a Pythonic manner, which is convenient for application code. Consider this example, which prints out all the products in an order:

for item in order.items: print(item.product.name, item.quantity)

The code is simple and expressive, while traversing two database joins. order.items is a one-to-many relation, and SQLAlchemy will automatically load the OrderItem objects related to the order. item.product is a many-to-one relation, and SQLAlchemy will automatically load the correct product.

SQLAlchemy also instruments classes. If application code modifies a mapped property, the object is automatically queued for writing. This allows much application logic to be unaware of persistence concerns.

4) Construct queries in Python

Simple queries, such as loading an object from a primary key require little code:

order = session.query(Order).get(order_id)

More complex queries can be performed using a Python query syntax. For example, to find active orders that are more than two days old, we can:

overdue_orders = session.query(Order).filter(Order.status == 'active' && Order.time < datetime.now() - timedelta(days=2))

The syntax allows you to mix SQL clauses and Python variables, and is secure against SQL injection attacks. Internally, SQLAlchemy has overloaded the various comparison operators, and translates these to SQL.

When performing highly complex queries, it is possible to define these with SQLAlchemy syntax. However, I find there's a certain level of complexity where it becomes easier to write SQL directly. In that case, you can define a database view that encompasses the complex query, and SQLAlchemy can map the view to Python objects.

5) Seamless integration with web frameworks

Some frameworks, such as Pyramid, have built-in SQLALchemy support. For other frameworks you need to install an integration library, such as Flask-SQLAlchemy or aldjemy (for Django).

SQLAlchemy maintains a connection pool, with an open database connection being provided to each web request. The library handles common errors effectively, making applications robust against scenarios such as the database being restarted while the application is running.

Each request is wrapped in a transaction. If the request is successful then the transaction is committed; otherwise it is rolled back. This arrangement allows controller methods to interact with the database correctly, with no explicit database handling code.

6) Eager loading for performance

Most ORMs use lazy loading. The first time a relation is used, an SQL query is issued to load the data. In the example above, the use of order.items causes an SQL query, then each use of item.product causes another query. Because item.product is used within a loop, this can generate a large number of SQL queries, resulting in poor performance. This is called the n+1 selects issue.

SQLAlchemy has a solution: eager loading. When we first load the Order object, we can tell SQLAlchemy what relations we will be using, so it can load all the data in a single query. The syntax looks like this:

session.query(Order).options(joinedload_all('items.product')).get(order_id)

7) Transparent polymorphism support

Object-orientated languages like Python encourage the use of polymorphism. If we have a base class like Person, it's useful to have sub classes like Employee or Customer that have additional fields. However, traditional SQL databases do not support polymorphism, which makes life difficult for an ORM.

SQLAlchemy does a decent job of emulating polymorphism in SQL. It is natural to use from Python code, while the data stored in the database is convenient to query from SQL. This works well enough that application code can simply use polymorphic classes without worrying how they are stored.

8) Works with legacy databases

Some ORMs make assumptions about the structure of your database. They may enforce that every table has a single primary key column, or even that the primary key is called "id". If you are building a database from scratch then these restrictions are no problem. But if you are using a legacy database, they could prevent you accessing certain tables.

SQLAlchemy makes no assumptions about the database structure, so it works great with legacy databases. There is also a tool called sqlacodegen that generates an SQLAlchemy model from an existing database. This makes it easy to write simple Python scripts that interact with legacy databases.

9) Loads of hooks for customising the library

SQLAlchemy is structured in a clear, layered fashion. Almost any aspect of the library can be overriden to customise for your needs.

I found a particularly useful application for this while working on a multi-tennant cloud application. Most of the queries in the application included a filter, to only return results from the current tennant. For example:

products = session.query(Product).filter(Product.merchant == current_user.merchant).all()

I was concerned that forgetting this filter anywhere would result in a user being able to see another merchant's data. However, I was able to create a custom SQLAlchemy session factory, which automatically applies the filter to all queries. This simplified controller code while keeping the application secure.

10) Great documentation

Some open source projects have patchy documentation, but not SQLAlchemy. The documentation is very thorough. There are tutorials which walk you from quick start examples through to advanced features. These are coupled with a comprehensive API reference.