If you are running a PostgreSQL database for your app, chances are you’ll want to run backups.

Because good boys run backups. And you’re a good boy 🐶.

Now, backing up a PostgreSQL database from a bash shell is pretty easy. Just invoke pg_dump like this:

pg_dump -h localhost -U postgres my_database

You’ll also want to compress your backup. You can easily save a lot of disk space by gzipping your dump file:

pg_dump -h localhost -U postgres my_database | gzip > backup.gz

But what about doing your database backup using Python? Let me show you a couple of ways you can achieve this.

1) Using subprocess

import gzip
import subprocess

cmd = ['pg_dump', '-h', 'localhost', '-U', 'postgres', 'my_database']

with gzip.open('backup.gz', 'wb') as f:
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    for stdout_line in iter(popen.stdout.readline, ''):
        f.write(stdout_line.encode('utf-8'))
    popen.stdout.close()
    popen.wait()

Phew, that’s a lot of code compared to the bash alternative. One nice thing about this snippet is that the output of pg_dump is streamed line by line into backup.gz. This means that even for large databases, memory usage stays very small, since you never load the whole dump into memory at once.
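If you’d rather stream binary chunks than text lines, the same idea can be sketched with shutil.copyfileobj. This is my own variant of the snippet above, not the original, and it assumes pg_dump is available on your PATH:

```python
import gzip
import shutil
import subprocess

def dump_to_gzip(cmd, path, chunk_size=1024 * 1024):
    # Stream the command's stdout into a gzipped file in fixed-size
    # binary chunks, so memory usage stays flat regardless of dump size.
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    with gzip.open(path, 'wb') as f:
        shutil.copyfileobj(popen.stdout, f, chunk_size)
    popen.stdout.close()
    return popen.wait()

# Hypothetical usage, mirroring the command above:
# dump_to_gzip(['pg_dump', '-h', 'localhost', '-U', 'postgres', 'my_database'], 'backup.gz')
```

Because copyfileobj reads a bounded chunk at a time, this keeps the same low-memory property without the line-by-line decode/encode round trip.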

2) Using sh

import gzip
from sh import pg_dump

with gzip.open('backup.gz', 'wb') as f:
    pg_dump('-h', 'localhost', '-U', 'postgres', 'my_database', _out=f)

Looks cleaner, eh? sh is a nifty little library that aims to make calling subprocesses easier. If you don’t mind depending on yet another third-party library, then sh comes highly recommended. Notice how you just import pg_dump as if it were an actual Python module. Just magic. Like the subprocess example, you won’t blow up your RAM, as the output of pg_dump is streamed to the backup.gz file.

3) Using delegator.py

import gzip
import delegator

with gzip.open('backup.gz', 'wb') as f:
    c = delegator.run('pg_dump -h localhost -U postgres my_database')
    f.write(c.out.encode('utf-8'))

delegator.py is a single module that was created by none other than Kenneth Reitz, the author of the requests lib. The snippet above involves less magic than the sh example, but you’ll have to encode the string to bytes before writing to the compressed file. Unlike the subprocess or sh examples, this method will gradually use more memory as your database grows, so I would only use it for small-ish databases. I haven’t found any way of circumventing this problem, so if you have a tip, let me know!

4) Using pexpect

import gzip
import pexpect

with gzip.open('backup.gz', 'wb') as f:
    c = pexpect.spawn('pg_dump -h localhost -U postgres my_database')
    f.write(c.read())

While researching subprocess alternatives, I stumbled upon pexpect. Since it seemed like quite a popular lib, I could not skip it. I like how straightforward the code is. Bonus: no need to encode a string before writing to the gzipped file! Like the delegator.py example, this snippet will probably blow up your memory usage if you’re not careful. Still, the code is pretty clean, so that’s a plus!

5) Using plumbum

import gzip

from plumbum.cmd import pg_dump with gzip.open(‘backup.gz’, ‘wb’) as f:

(pg_dump[“-h”, “localhost”, “-U”, “postgres”, “my_database”] > f)()

MAGIC OVERLOAD! I like how this snippet uses the > operator to mimic the behaviour of IO redirection in bash. This library has many nice examples of IO redirection using pipe operators. Like the two previous examples, this method will not stream the dump to the gzipped file gradually, so again, it’s only adequate for smaller databases.
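Whichever method you pick, it’s cheap to sanity-check the resulting file before you trust it. A plain-format pg_dump starts with SQL comment lines (“-- PostgreSQL database dump”), so a rough check, assuming that default format, could look like this sketch of mine:

```python
import gzip

def backup_looks_sane(path):
    # A plain-format pg_dump begins with SQL comments such as
    # "-- PostgreSQL database dump", so peek at the first line.
    with gzip.open(path, 'rt', encoding='utf-8') as f:
        first_line = f.readline()
    return first_line.startswith('--')
```

This also doubles as a check that the file really is valid gzip, since gzip.open will raise on a corrupt archive.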

Wrap up