Your database may be silently headed for trouble. PostgreSQL, like all modern MVCC-based relational databases, is subject to what’s called “bloat.” The consequences range from slowdowns and wasted space to transaction ID wraparound, aka the “night of the living dead,” when deleted rows come back to life.

Bloat is the disk space claimed by unneeded data rows and indices. Quinn Weaver from Postgres Experts explains the causes and consequences of bloat as well as its prevention and steps for emergency intervention.


Summary

Why do Postgres databases get bloat? Couldn’t you just not do that vacuum thing? It causes so many problems!

Vacuuming is necessary for MVCC systems like Postgres

- MVCC stands for multi-version concurrency control
- When you delete a row, it doesn’t really go away; instead it becomes invisible to future transactions
- Transactions started before the deletion continue to see the row
- Likewise, an update is just a delete followed by an insert
- The transaction visibility of every row is stored in the hidden (but accessible) `xmin` and `xmax` columns on every table
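The xmin/xmax mechanics are easy to poke at in psql; here is a minimal sketch (table and column names invented for illustration):

```sql
-- Every row version records which transaction created it (xmin) and,
-- once deleted or superseded by an update, which one removed it (xmax).
CREATE TABLE mvcc_demo (id int, val text);
INSERT INTO mvcc_demo VALUES (1, 'hello');

SELECT xmin, xmax, id, val FROM mvcc_demo;
-- xmax = 0 here: no transaction has deleted this row version yet

BEGIN;
UPDATE mvcc_demo SET val = 'world' WHERE id = 1;
-- A second session still sees val = 'hello' until we commit; the update
-- created a new row version and set xmax on the old one.
COMMIT;
-- The old ('hello') row version is now dead but still occupies space
-- until vacuum reclaims it.
```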

What’s good about MVCC?

- It’s efficient; in particular, it avoids the heavy locking found in earlier relational databases
- It works well with a large number of concurrent transactions
- It’s now in use by virtually all modern relational databases

What’s bad about MVCC?

- Rows persist even when they are no longer visible to any current or future transaction; that is what we call bloat
- Indices also get bloat: leaf nodes point at dead rows, inner nodes point at those superfluous leaf nodes, and so on
- You can’t actually remove an index pointer until its entire page stops pointing at rows, so indices tend to stay bloated longer than tables do

What are the consequences of bloat?

- Dead rows fill up pages, so you have to read more pages into memory from disk, which slows queries down
- You have to keep more things in memory, some of which is junk
- Bloat can in fact push the working set beyond RAM; then you spill to disk and life sucks
- Bloated indices take more steps to traverse, and they can get really big

What can you do about bloat?

- Something needs to garbage-collect dead rows; that something is the autovacuum daemon
- It’s been around for about eleven years, and its default settings usually do a pretty good job
- But if you have an especially high-traffic site, it might not keep up

How do you know when bloat is happening?

- You don’t want to prematurely optimize
- Use the scripts at pgexperts/pgx_scripts
- The talk shows example output from the table and index bloat scripts
- Even large absolute amounts of bloat can be OK as long as the percentage is stable; the speed of bloat growth is what matters
- Run the bloat queries on a cron job and review the output weekly to track the direction of change
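One simple way to put this on a cron job is to save the pgx_scripts bloat queries locally and run them through psql. The paths, script filenames, and database name below are assumptions; adjust them to wherever you checked out pgexperts/pgx_scripts:

```
# crontab: run the bloat checks early every Sunday and append to a log
0 3 * * 0  psql -d mydb -f /opt/pgx_scripts/bloat/table_bloat_check.sql >> /var/log/bloat_report.log 2>&1
5 3 * * 0  psql -d mydb -f /opt/pgx_scripts/bloat/index_bloat_check.sql >> /var/log/bloat_report.log 2>&1
```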

Fixing bloat

- If you’re measuring increasing bloat, then autovacuum is not keeping up
- You can tune autovacuum parameters to stay on top of bloat, but tuning won’t save you if you’re already too far gone

If things are bad, try pg_repack

- It copies live rows to a new table, then does a quick switch in the system catalogs so the new table takes the place of the old one
- It’s way better than the built-in alternative, VACUUM FULL, because pg_repack does not hold an access-exclusive lock for the whole operation
- pg_repack does require an access-exclusive lock during the quick switch, though; if it cannot get the lock, it starts canceling queries or killing backends
- pg_repack also needs a lot of free disk space while it runs (a duplicate table’s worth)
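Invocation is a single command; a sketch, with hypothetical database and table names:

```
# Rebuild one bloated table (and its indices) online in database mydb;
# requires the pg_repack extension to be installed in that database
pg_repack -d mydb -t big_bloated_table
```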

Another approach is flexible freeze, from PG Experts

- It’s a vacuum, but an extra-aggressive one
- It’s designed to be run as a cron job during lulls in traffic, to start catching up on bloat debt
- You can force flexible freeze to cancel its work, but it will have to start from scratch the next time you run it

Dealing with bloated indices

- Flexible freeze does not fix indices the way pg_repack does
- One fix is to build a new index identical to the bloated one, then drop the bloated one (you can do it all in a single transaction)
- A regular CREATE INDEX takes a lock on your table, but CREATE INDEX CONCURRENTLY locks out only DDL
- Either way, you’ll want to check that the index built correctly: in psql, run \d+ my_table and look for any indices marked “invalid”; you can also check for invalid indices programmatically
- The other limitation of the index-swap method is that you can’t use it for primary key indices
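The index-swap approach looks roughly like this (index, table, and column names are hypothetical):

```sql
-- Build a replacement index without blocking reads or writes.
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY my_table_col_idx_new ON my_table (col);

-- Programmatic check: a failed concurrent build leaves the index
-- flagged as invalid in the system catalogs
SELECT indexrelid::regclass AS invalid_index
FROM pg_index
WHERE NOT indisvalid;

-- If the new index is valid, swap it in
DROP INDEX my_table_col_idx;  -- the old, bloated index
ALTER INDEX my_table_col_idx_new RENAME TO my_table_col_idx;
```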

Preventing future bloat: autovacuum and its settings

- autovacuum_max_workers is the number of concurrent vacuum workers you can have running; it’s safe to raise it to 5 or 6 on most databases, higher for really big ones
- Each worker gets maintenance_work_mem worth of memory, and generally a sixteenth of available RAM is appropriate
- The default is 16MB, which can be seriously too low! But if you double the number of workers to six, then use an eighth of your RAM instead
- How often does autovacuum run? It kicks in after every n rows are written to a table, where n = autovacuum_vacuum_scale_factor * (the table’s pg_class.reltuples) + autovacuum_vacuum_threshold
- For huge tables the default autovacuum_vacuum_scale_factor = 0.2 will lead to serious bloat; set the scale factor to 0.02 (you can set it per table) and crank up the threshold a little
- autovacuum_vacuum_cost_delay can also be adjusted; the talk gives tips
- There are analogous settings for the auto-analyze daemon, and they’re quite cheap to raise, but there’s no need unless you’re noticing bad query planning
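To see why the stock scale factor hurts big tables, here is the trigger formula as a quick back-of-the-envelope calculation (defaults per current PostgreSQL documentation: scale factor 0.2, threshold 50; the table size is made up):

```python
def autovacuum_trigger(reltuples, scale_factor=0.2, threshold=50):
    """Rows that must change before autovacuum vacuums a table:
    autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples."""
    return threshold + scale_factor * reltuples

big_table = 500_000_000  # a hypothetical half-billion-row table

# Defaults: roughly 100 million rows must die before a vacuum even starts
print(autovacuum_trigger(big_table))  # 100000050.0

# Per-table tuning from the talk: far earlier, far smaller vacuums
print(autovacuum_trigger(big_table, scale_factor=0.02, threshold=10_000))  # 10010000.0
```

Per-table overrides go through `ALTER TABLE my_table SET (autovacuum_vacuum_scale_factor = 0.02, autovacuum_vacuum_threshold = 10000);`.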

Transaction ID (XID) wraparound

- xmin and xmax values are monotonically increasing, and the underlying number is a 32-bit int
- If it wraps around, Postgres cannot tell the past from the future, and deleted rows would come back from the dead: total corruption
- It’s so bad that Postgres will shut down to prevent it from happening, and it will not start up again until you manually address the problem
- The moral of the story: don’t turn autovacuum off
- Vacuum freeze searches for rows so old that no transaction can see them and replaces their xmin with a magic value; this buys you time
- You can measure how close you are to XID wraparound using some scripts
- The talk also gives an overview of the vacuum freeze parameters; see Josh Berkus’ article for more info
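For a rough check without any external scripts, `age(datfrozenxid)` reports how many transactions each database is from its oldest unfrozen XID:

```sql
-- Distance (in transactions) to each database's oldest unfrozen XID.
-- Values approaching autovacuum_freeze_max_age (default 200 million)
-- trigger aggressive anti-wraparound vacuums; near ~2 billion,
-- Postgres shuts down to protect itself.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```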


