Being stuck there, we dig in the whole internet and in the PG documentation, and found those materialized view

What’s a materialized view ?

Creation of Materialized View is an extension, available since Postgresql 9.3.

MatViews are widely available in other RDBMS such as Oracle, or SQL Server since longtime.

A MatView is in between a view and a table. Basically it’s built with a query refering to one or more tables, and the results are stored physically, making it acting like a cache.

This is the main difference with a simple view, which queries it’s source each time you call it. This means that you may wait for a while before getting your result.

A MatView can be used like a regular table, for example, you can add indexes or primary key on it, it supports VACUUM and ANALYZE commands (useful when you refresh it often), you can even create it without data, just with the definition query.

Bonus point: you can mess up your source table, your end-user won’t notice it before the refresh.

Our MatView-oriented solution:

same but with matview, but still same…

Creating the materialized view

A materialized view creation looks like the creation of a view or a CREATE TABLE AS instruction:

Piece of cake.

Refreshing the materialized view

Now, if the source table views is updated and you want your materialized view to take those updates in account, you’ll must refresh it manually:

REFRESH MATERIALIZE VIEW job_offer_views_mv;

Well, that’s really sad ! Other RDBMS can do this automatically, and with simple view, the results would have been up to date (but if your query is heavy, you don’t want a simple view).

So it’s either you accept some decrepencies in your data or, you can use a trigger to refresh your MatView when the source is updated (not detailed here, maybe in a future post).

We were happy, this works well in staging, but …

funnier in production

The trick: refresh materialized view CONCURRENTLY

There’s not a lot of users in our staging environment, because of that, we missed two things:

users generates read locks, so refreshing the MatView can take a while,

refreshing the MatView locks any new read from users, making our API pretty slow

Actually, the ‘basic’ refresh is useful, and fast if the table is not used often (like several times a day). Otherwise, use refresh concurrently.

But beware! To use the refresh concurrently, you must define at least one unique index on your materialized view.

This is obvious regarding the way the refresh concurrently works. It refreshes the rows without locking concurrent select, so it needs to identify ‘free’ rows. It may take longer than the simple refresh, but it won’t bother your users.

That’s it ! Production is now safe, refresh every 2hours and these endpoints are pretty fast (< 100 ms).