Migrations

As our customer base grew past 100 customers and our application past 100 tables, we started to notice things slowing down. I don’t have precise numbers to back this up, but we found a direct correlation between the number of Postgres schemas, the size of the tables being altered in those schemas, and the time it took to complete a migration. The more schemas and the larger the tables, the longer migrations took.

It’s well known that index changes on large tables can take time and even lock the table. Ideally, for things like column additions, you’d get constant O(1) performance with respect to tenant count. Under a separate-schemas approach to tenancy, you instead get O(N) performance, where N is the number of tenants. When you get into less predictable changes like index additions, it’s a hell of a lot worse. I don’t know exactly how to represent that in Big O, but I think it would look something like O(WTF × N).
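To make the O(N) concrete, here’s a minimal sketch of the pattern (not our actual migration code, and the schema-listing query is just one way to find tenant schemas): under schema-per-tenant, the same DDL has to be replayed once per schema.

    require "active_record"

    # Assumes ActiveRecord::Base.establish_connection has already been called.
    conn = ActiveRecord::Base.connection

    # Every non-system schema; in a real app this list would come from a tenants table.
    schemas = conn.select_values(<<~SQL)
      SELECT nspname FROM pg_namespace
      WHERE nspname NOT LIKE 'pg_%' AND nspname <> 'information_schema'
    SQL

    schemas.each do |schema|
      conn.execute(%(SET search_path TO "#{schema}"))
      # The same "cheap" column addition, repeated once per tenant: O(N), not O(1).
      conn.execute("ALTER TABLE users ADD COLUMN time_zone varchar")
    end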

Migrations became the bane of our existence, which in turn made deploying a hassle. No one wants friction in their deployment process, especially when we’re trying to deploy daily or more frequently.

Database Resources

I’m not an expert on the inner workings of PostgreSQL, but the amount we’ve had to scale out our main production database under this tenancy strategy dramatically outpaces any of our other services that use column scoping for tenancy. My guess is there’s an upper limit — if not a hard one, then at least a soft, recommended one — on the number of tables, indexes, etc. you can store within one Postgres database. We’re running an RDS r3.4xl, at roughly $3k per month, to house a database that should otherwise be able to live on something much smaller. We haven’t dug into this precisely, but I’m quite sure the sheer number of tables we have is an issue.
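We haven’t profiled this carefully, but a rough way to see just how many relations a single database is carrying under this model is to count them per kind across all schemas in the Postgres catalogs, something like:

    require "active_record"

    # Counts of tables ('r'), indexes ('i') and sequences ('S') across every
    # non-system schema; pg_class and pg_namespace are Postgres system catalogs.
    counts = ActiveRecord::Base.connection.select_all(<<~SQL)
      SELECT c.relkind, count(*) AS total
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE n.nspname NOT LIKE 'pg_%' AND n.nspname <> 'information_schema'
      GROUP BY c.relkind
    SQL
    counts.each { |row| puts "#{row['relkind']}: #{row['total']}" }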

Client Memory Bloat

This one is specific to Ruby, and more specifically to ActiveRecord (though possibly to any other library with a similar implementation). Although some fixes have been made, the root cause remains largely unfixed. ActiveRecord, on connecting to the database, iterates through all tables and stores metadata about their columns in order to properly map Postgres data types to Ruby ones. Unfortunately, it does this by iterating through all tables in all schemas and caching what it finds. This needlessly bloats the running client, since the column types are exactly the same across all tenants, but we’re unable to configure ActiveRecord to map through just a single schema.
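As a simplified model of the duplication (this isn’t ActiveRecord’s actual caching code, and the tenant_% naming convention is made up), you can picture one cached column list per schema-qualified table:

    require "active_record"

    conn  = ActiveRecord::Base.connection
    cache = {} # stand-in for the per-process column metadata cache

    tenant_schemas = conn.select_values(
      "SELECT nspname FROM pg_namespace WHERE nspname LIKE 'tenant_%'"
    )

    tenant_schemas.each do |schema|
      conn.execute(%(SET search_path TO "#{schema}"))
      conn.tables.each do |table|
        # N tenants means N copies of identical column metadata held in memory.
        cache["#{schema}.#{table}"] = conn.columns(table)
      end
    end

    puts "#{cache.size} cached column sets in one Ruby process"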

Right now, the moment any of our Ruby processes connects to the database, it grows to about 500 MB in memory right off the bat. Other services with similar amounts of data, but without schema-based tenanting, do NOT have this problem. This of course will continue to get worse with each customer (tenant/schema) we add to the system.

Record Identification

One major drawback of schemas as tenants is that your sequence generators live independently in each tenant. This means that if you have a tenanted user table, you now have X users identified with id=1 (and likewise for every other generated sequence id, assuming a uniform distribution). If you ever try to join across these or report globally on all of them, you’re going to get conflicts. Furthermore, you could actually run into permissions issues if you ever copy this data to other systems without scoping things to a tenant id.
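To make that concrete, here’s a rough sketch (the tenant_1/tenant_2 schema names are made up): because each schema has its own users id sequence, a naive cross-tenant query hands back colliding ids, and any global report has to drag the schema along as a tenant key.

    require "active_record"

    conn = ActiveRecord::Base.connection

    # Naive cross-tenant report: two different people can both come back as id = 1.
    conn.select_all(<<~SQL)
      SELECT id, email FROM tenant_1.users
      UNION ALL
      SELECT id, email FROM tenant_2.users
    SQL

    # Usable version: carry the schema name along as an explicit tenant discriminator.
    conn.select_all(<<~SQL)
      SELECT 'tenant_1' AS tenant, id, email FROM tenant_1.users
      UNION ALL
      SELECT 'tenant_2' AS tenant, id, email FROM tenant_2.users
    SQL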