A Few Postgres Essentials

Listen to this article

Postgres is our favorite database—it’s reliable, powerful and secure. Here are a few essential tips learned from building, and helping our customers build, apps around Postgres. These tips will help ensure you get the most out of Postgres, whether you’re running it on your own box or using the Heroku Postgres add-on.

Postgres connections are not free, as each established connection has a cost. By using a connection pooler, you’ll reduce the number of connections you use and reduce your overhead.

Most Postgres client libraries include a built-in connection pooler; make sure you’re using it.

You might also consider using our pgbouncer buildpack if your application requires a large number of connections. PgBouncer is a server-side connection pooler and connection manager that goes between your application and Postgres. Check out some of our documentation for using PgBouncer for Ruby and Java apps.

Postgres allows you to see what clients are connected and what each of them is doing using the built-in pg_stat_activity table.

By explicitly marking each connection you open with the name of your dyno, using the DYNO environment variable, you’ll be able to track what your application is doing at a glance:

SET application_name TO 'web.1';

Now, if you will be to quickly see what each dyno is doing, using heroku pg:ps :

$ heroku pg:ps procpid | source | running_for | waiting | query ---------+---------------------------+-----------------+---------+----------------------- 31776 | web.1 | 00:19:08.017088 | f | <IDLE> in transaction 31912 | worker.1 | 00:18:56.12178 | t | select * from customers; (2 rows)

You will also be able to see how many connections each dyno is using, and much more, by querying the pg_stat_activity table:

$ heroku pg:psql SELECT application_name, COUNT(*) FROM pg_stat_activity GROUP BY application_name ORDER BY 2 DESC; application_name | count ----------------------------+------- web.1 | 15 web.2 | 15 worker.1 | 5 (3 rows)

Long running queries can have an impact on your database performance because they may hold locks or over-consume resources. To avoid them, Postgres allows you to set a timeout per connection that will abort any queries exceeding the specified value. This is especially useful for your web dynos, where you don’t want any requests to run longer than your request timeout.

SET statement_timeout TO '30s';

Being able to determine which part of your code is executing a query makes optimization easier, and makes it easier to track down expensive queries or n+1 queries.

There are many ways to track which part of your code is executing a query, from a monitoring tool like New Relic to simply adding a comment to your SQL specifying what code is calling it:

SELECT `favorites`.`product_id` FROM `favorites` -- app/models/favorite.rb:28:in `block in <class:Favorite>'

You will now be able to see the origin of your expensive queries, and be able to track down the caller of the query when using the pg_stat_statements and pg_stat_activity tables:

$ heroku pg:psql SELECT (total_time/sum(total_time) OVER()) * 100 AS exec_time, calls, query FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10; ---------------------------------------------------------------------------------------------------------------------------- exec_time | 12.2119460729825 calls | 7257 query | SELECT `favorites`.`product_id` FROM `favorites` -- app/models/product.rb:28:in `block in <class:Product>'

Many ORMs provide this feature built-in or via extensions, make sure you use it and your debugging and optimization will be easier.

There is much more you can learn about Postgres, either via the excellent documentation of the project itself, or the Heroku Postgres Dev Center reference. Share your own tips with the community on the #postgrestips hashtag.