Patroni is a tool for deploying PostgreSQL servers in high-availability configurations. Read on to get your feet wet with Patroni.

This post assumes you are familiar with PostgreSQL streaming replication, as well as replication topologies.

Patroni

Patroni is a fork of the now-unmaintained Governor from Compose. It is open source (GitHub repo) and is documented here. Among other things, Patroni supports bootstrapping, synchronous standbys, automatic failover, and a maximum replication lag limit on failover.

Patroni needs a distributed configuration manager for coordinating its actions, and supports ZooKeeper, etcd and Consul. Patroni itself is written in Python and available on PyPI.

Let’s see how to set up an etcd-based Patroni to manage a primary-standby PostgreSQL cluster. In practice, you’d have a 3-node or 5-node etcd cluster, with the primary and standby on separate servers. However, since we’re only interested in getting a working Patroni setup that we can explore further, we’ll set everything up on a single server.

The commands below are for Debian 9; you should be able to use the same or similar commands on other distros too.

Setting Up etcd

etcd is available from its GitHub releases page. Let’s download and run it:

wget https://github.com/coreos/etcd/releases/download/v3.3.2/etcd-v3.3.2-linux-amd64.tar.gz
tar xvf etcd-v3.3.2-linux-amd64.tar.gz
cd etcd-v3.3.2-linux-amd64
./etcd

This will start and run etcd in the foreground. (Use ^C to exit.) etcd is being run as a single-node cluster, which is useless from a redundancy perspective but is good enough for experimentation and development use.
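To sanity-check that etcd is up before moving on, you can use the etcdctl binary bundled in the same tarball. Here is a quick check from another terminal, assuming the default (v2) etcdctl API of this etcd release:

./etcdctl cluster-health

It should report the single member as healthy.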

Install Patroni

Patroni and its dependencies are installed via pip:

sudo pip install patroni[etcd]

Type the command in verbatim, including the square brackets.

If you don’t have pip, you can install it with apt-get:

sudo apt-get install python-pip

On RHEL/CentOS-based distros, you’ll need to run “sudo yum install epel-release python-pip gcc” before you can pip install.
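Whichever route you took, you can confirm the installation by asking Patroni for its version (the exact output depends on the version pip picked up; a --version flag is assumed to be supported by your Patroni release):

patroni --version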

Installing Postgres

Obviously, the Postgres binaries must be present on the system. You can install the Postgres that your distro bundles, or follow the instructions on the Postgres download page. Patroni does not appear to be too dependent on the Postgres version.

For this example, we’ll use the 9.6 version included in Debian 9:

sudo apt-get install postgresql-9.6

Whichever method and version you choose, make a note of the path to the Postgres binaries; we’ll need it later. In this case, the path is:

/usr/lib/postgresql/9.6/bin
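If you installed Postgres some other way and aren’t sure where the binaries live, searching for initdb is a quick way to locate them (the path above is simply where Debian 9 puts them):

find /usr -name initdb 2>/dev/null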

Patroni Configuration Files

Patroni is invoked with the path to a configuration file. We’ll have two configuration files, one for the primary and one for the standby. We’ll run both on the same server, on different ports (5432 and 5433).

We’ll call the nodes “pg-1” and “pg-2”. They each have their own configuration files. Here is the first one, “pg-1.yml”:

scope: my-ha-cluster
name: pg-1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 127.0.0.1:8008

etcd:
  host: 127.0.0.1:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_segments: 8
        max_wal_senders: 5
        max_replication_slots: 5
        checkpoint_timeout: 30
  initdb:
    - encoding: UTF8
  pg_hba:
    - host all dba all md5
    - host replication repl all md5
  users:
    dba:
      password: secret
      options:
        - createrole
        - createdb
    repl:
      password: secret
      options:
        - replication

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 127.0.0.1:5432
  data_dir: /data/pg-1
  config_dir: /data/pg-1
  bin_dir: /usr/lib/postgresql/9.6/bin
  authentication:
    replication:
      username: repl
      password: secret
    superuser:
      username: dba
      password: secret
  parameters:
    unix_socket_directories: '/tmp'

Here’s an explanation of the fields:

scope: The name of the Patroni-managed cluster, which can be used in patronictl to refer to your cluster. All nodes should have the same value for this key.

name: The name of the node, unique within the cluster.

restapi: Patroni has a REST API, which is started at this address (“listen”). The “connect_address” is the address with which other nodes can connect to this API, so the IP here should be the IP by which this node can be reached from other nodes (typically over a private VLAN).

etcd: The configuration to connect to the etcd cluster. For a 3-node etcd cluster, use “hosts: ip1:port1,ip2:port2,ip3:port3” (see the snippet after this list).

bootstrap: These values are used when a Patroni cluster is created. The values under “postgresql.parameters” are actual postgresql.conf configuration parameters. Some of them, like wal_level and max_wal_senders, are required for streaming replication to work.

initdb: When bootstrapping the first node of a cluster and the PostgreSQL data directory does not exist, these parameters are used to invoke initdb.

pg_hba: The entries that Patroni will add to the pg_hba.conf file of the database it creates. See the “users” section below.

users: Patroni creates the users listed here. These users (who should have been given access in pg_hba above) are then used in the postgresql.authentication section below to let Patroni log in to the Postgres server. Here, the users “dba” (for admin access by Patroni) and “repl” (for replication access from standbys) are created.

postgresql: These parameters describe the PostgreSQL server (“pg-1”) managed by this Patroni node. The IP in “connect_address” should be the IP from which other servers can reach this server (typically over a private VLAN). We’re using 127.0.0.1 here since everything is on the same node. The bin_dir is the path we noted earlier, and the data and config dirs are set to “/data/pg-1”. For now, this directory does not exist (“/data” should, though). The “authentication” parameters refer to the replication and admin users that we created above in the “users” section. Finally, the “parameters” section again contains postgresql.conf configuration parameters that Patroni will pass to pg_ctl to start the database.
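As mentioned for the etcd entry above, a production setup would point Patroni at a multi-node etcd cluster. Here is a sketch of what that section would look like, with placeholder IPs standing in for your own etcd servers:

etcd:
  hosts: 10.0.1.11:2379,10.0.1.12:2379,10.0.1.13:2379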

And here is the second file, “pg-2.yml”:

scope: my-ha-cluster
name: pg-2

restapi:
  listen: 0.0.0.0:8009
  connect_address: 127.0.0.1:8009

etcd:
  host: 127.0.0.1:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_segments: 8
        max_wal_senders: 5
        max_replication_slots: 5
        checkpoint_timeout: 30
  initdb:
    - encoding: UTF8
  pg_hba:
    - host all dba all md5
    - host replication repl all md5
  users:
    dba:
      password: secret
      options:
        - createrole
        - createdb
    repl:
      password: secret
      options:
        - replication

postgresql:
  listen: 0.0.0.0:5433
  connect_address: 127.0.0.1:5433
  data_dir: /data/pg-2
  config_dir: /data/pg-2
  bin_dir: /usr/lib/postgresql/9.6/bin
  authentication:
    replication:
      username: repl
      password: secret
    superuser:
      username: dba
      password: secret
  parameters:
    unix_socket_directories: '/tmp'
    port: 5433

It looks, and is, mostly similar to the first file. There are only a few changes:

name: This has to be different; we’re naming this one “pg-2”.

restapi: The ports are different, since we’re running two Patronis on the same node. This is typically not the case.

postgresql: The ports have been changed from 5432 to 5433, and the data directory is now “/data/pg-2”. The port is also specified as 5433 in the “parameters” section.

These configuration files have more options; you can read more about them here. We’re only using a minimal set to bring up a working cluster.

Starting the Cluster

Note that we don’t have any actual databases yet. We’re going to let Patroni invoke initdb for us, using the parameters from the configuration files.

We already have etcd running. Let’s also make sure any default PostgreSQL service is stopped first:

sudo systemctl stop postgresql
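We should also make sure the /data directory exists, since the configuration files place the data directories at /data/pg-1 and /data/pg-2 (adjust the ownership to whichever user will run Patroni):

sudo mkdir -p /data
sudo chown $USER /data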

Now we can start the first node by calling the “patroni” script with the path to the first configuration file:

$ patroni pg-1.yml
2018-03-13 07:47:05,853 INFO: Failed to import patroni.dcs.consul
2018-03-13 07:47:05,867 INFO: Selected new etcd server http://127.0.0.1:2379
2018-03-13 07:47:05,878 INFO: Lock owner: None; I am pg-1
2018-03-13 07:47:05,882 INFO: trying to bootstrap a new cluster
The files belonging to this database system will be owned by user "vagrant".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /data/pg-1 ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/9.6/bin/pg_ctl -D /data/pg-1 -l logfile start

2018-03-13 07:47:08,850 INFO: postmaster pid=1658
localhost:5432 - no response
LOG: database system was shut down at 2018-03-13 07:47:08 GMT
LOG: MultiXact member wraparound protections are now enabled
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
localhost:5432 - accepting connections
localhost:5432 - accepting connections
2018-03-13 07:47:09,916 INFO: establishing a new patroni connection to the postgres cluster
2018-03-13 07:47:09,926 INFO: running post_bootstrap
2018-03-13 07:47:09,940 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2018-03-13 07:47:09,948 INFO: initialized a new cluster
2018-03-13 07:47:19,943 INFO: Lock owner: pg-1; I am pg-1
2018-03-13 07:47:19,951 INFO: Lock owner: pg-1; I am pg-1
2018-03-13 07:47:19,960 INFO: no action. i am the leader with the lock
2018-03-13 07:47:29,992 INFO: Lock owner: pg-1; I am pg-1
2018-03-13 07:47:30,000 INFO: no action. i am the leader with the lock

As you can see, it invoked initdb to create a new database at /data/pg-1, and started the server. You can connect to the server using psql -h /tmp -U dba postgres at this point.
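While we’re connected to the primary, let’s also create a small test table that we can use shortly to verify replication (the table name and contents are just an example):

psql -h /tmp -U dba postgres -c "CREATE TABLE smoke_test AS SELECT 42 AS answer;"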

Now let’s bring up the standby by running “patroni” with the second configuration file, in another terminal:

$ patroni pg-2.yml
2018-03-13 07:58:55,411 INFO: Failed to import patroni.dcs.consul
2018-03-13 07:58:55,425 INFO: Selected new etcd server http://127.0.0.1:2379
2018-03-13 07:58:55,438 INFO: Lock owner: pg-1; I am pg-2
2018-03-13 07:58:55,441 INFO: trying to bootstrap from leader 'pg-1'
2018-03-13 07:58:55,962 INFO: replica has been created using basebackup
2018-03-13 07:58:55,963 INFO: bootstrapped from leader 'pg-1'
2018-03-13 07:58:55,999 INFO: postmaster pid=1752
localhost:5433 - no response
LOG: database system was interrupted; last known up at 2018-03-13 07:58:55 GMT
LOG: entering standby mode
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/20000F8
LOG: database system is ready to accept read only connections
FATAL: could not start WAL streaming: ERROR: replication slot "pg_2" does not exist
FATAL: could not start WAL streaming: ERROR: replication slot "pg_2" does not exist
localhost:5433 - accepting connections
localhost:5433 - accepting connections
2018-03-13 07:58:57,075 INFO: Lock owner: pg-1; I am pg-2
2018-03-13 07:58:57,075 INFO: does not have lock
2018-03-13 07:58:57,075 INFO: establishing a new patroni connection to the postgres cluster
2018-03-13 07:58:57,090 INFO: no action. i am a secondary and i am following a leader
2018-03-13 07:58:59,958 INFO: Lock owner: pg-1; I am pg-2
2018-03-13 07:58:59,958 INFO: does not have lock
2018-03-13 07:58:59,962 INFO: no action. i am a secondary and i am following a leader
LOG: started streaming WAL from primary at 0/3000000 on timeline 1

Patroni has taken a backup using pg_basebackup, set up streaming replication, and brought up the PostgreSQL server. To verify, you can try connecting to the standby with psql -h /tmp -p 5433 -U dba postgres. We’d specified “hot_standby” as “on”, so read-only connections to the standby are possible.
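To see the replication in action, we can check that the example table we created on the primary earlier is visible on the standby, and query the primary’s pg_stat_replication view (standby on port 5433 and primary on 5432, respectively):

psql -h /tmp -p 5433 -U dba postgres -c "SELECT * FROM smoke_test;"
psql -h /tmp -U dba postgres -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"

The first command should return the test row; the second should show pg-2 streaming from the primary.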

Using Patronictl

Patroni includes a command called “patronictl” which can be used to control the cluster. Let’s check the status of the cluster:

$ patronictl -d etcd://127.0.0.1:2379 list my-ha-cluster
+---------------+--------+-----------+--------+---------+-----------+
|    Cluster    | Member |    Host   |  Role  |  State  | Lag in MB |
+---------------+--------+-----------+--------+---------+-----------+
| my-ha-cluster |  pg-1  | 127.0.0.1 | Leader | running |       0.0 |
| my-ha-cluster |  pg-2  | 127.0.0.1 |        | running |       0.0 |
+---------------+--------+-----------+--------+---------+-----------+
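Patroni’s REST API, configured in the “restapi” section earlier, offers another view of node state; it is also the endpoint the Patroni nodes use to check on each other, as we’ll see in the failover logs below. For example (the exact JSON fields can vary across Patroni versions):

curl -s http://127.0.0.1:8008/patroni

This returns a JSON document describing the node’s role, state and replication position.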

Manual Failover

The patronictl command can be used to do a manual failover. Since pg-1 is the current master, we want to promote pg-2. Let’s do that with:

$ patronictl -d etcd://127.0.0.1:2379 failover my-ha-cluster
Candidate ['pg-2'] []: pg-2
Current cluster topology
+---------------+--------+-----------+--------+---------+-----------+
|    Cluster    | Member |    Host   |  Role  |  State  | Lag in MB |
+---------------+--------+-----------+--------+---------+-----------+
| my-ha-cluster |  pg-1  | 127.0.0.1 | Leader | running |       0.0 |
| my-ha-cluster |  pg-2  | 127.0.0.1 |        | running |       0.0 |
+---------------+--------+-----------+--------+---------+-----------+
Are you sure you want to failover cluster my-ha-cluster, demoting current master pg-1? [y/N]: y
2018-03-13 08:06:27.49165 Successfully failed over to "pg-2"
+---------------+--------+-----------+--------+---------+-----------+
|    Cluster    | Member |    Host   |  Role  |  State  | Lag in MB |
+---------------+--------+-----------+--------+---------+-----------+
| my-ha-cluster |  pg-1  | 127.0.0.1 |        | stopped |   unknown |
| my-ha-cluster |  pg-2  | 127.0.0.1 | Leader | running |           |
+---------------+--------+-----------+--------+---------+-----------+

Looks like the failover happened. Let’s list the nodes again to confirm:

$ patronictl -d etcd://127.0.0.1:2379 list my-ha-cluster
+---------------+--------+-----------+--------+---------+-----------+
|    Cluster    | Member |    Host   |  Role  |  State  | Lag in MB |
+---------------+--------+-----------+--------+---------+-----------+
| my-ha-cluster |  pg-1  | 127.0.0.1 |        | running |       0.0 |
| my-ha-cluster |  pg-2  | 127.0.0.1 | Leader | running |       0.0 |
+---------------+--------+-----------+--------+---------+-----------+

The node pg-2 has now indeed become the master. The Patroni output for pg-2 should contain log lines like:

2018-03-13 08:09:07,939 INFO: Lock owner: pg-2; I am pg-2
2018-03-13 08:09:07,952 INFO: no action. i am the leader with the lock

Automatic Failover

When the current leader, pg-2, goes off the network, becomes unreachable, or crashes, the Patroni at pg-1 should take over and promote pg-1. Let’s see if that happens by killing the pg-2 Patroni process with a ^C:

2018-03-13 08:18:12,469 INFO: Lock owner: pg-2; I am pg-2
2018-03-13 08:18:12,477 INFO: no action. i am the leader with the lock
^CLOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: database system is shut down
2018-03-13 08:18:17,008 INFO: Lock owner: pg-2; I am pg-2
$

Over in pg-1’s terminal, we can see it promoting itself:

2018-03-13 08:18:12,476 INFO: Lock owner: pg-2; I am pg-1
2018-03-13 08:18:12,476 INFO: does not have lock
2018-03-13 08:18:12,479 INFO: no action. i am a secondary and i am following a leader
LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 4 at 0/5000838.
FATAL: could not send end-of-streaming message to primary: no COPY in progress
LOG: invalid record length at 0/5000838: wanted 24, got 0
FATAL: could not connect to the primary server: could not connect to server: Connection refused
        Is the server running on host "127.0.0.1" and accepting
        TCP/IP connections on port 5433?
2018-03-13 08:18:17,030 WARNING: request failed: GET http://127.0.0.1:8009/patroni (('Connection aborted.', error(104, 'Connection reset by peer')))
2018-03-13 08:18:17,162 INFO: promoted self to leader by acquiring session lock
server promoting
LOG: received promote request
LOG: redo done at 0/50007C8
2018-03-13 08:18:17,173 INFO: cleared rewind state after becoming the leader
LOG: selected new timeline ID: 5
LOG: archive recovery complete
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
2018-03-13 08:18:18,449 INFO: Lock owner: pg-1; I am pg-1
2018-03-13 08:18:18,469 INFO: no action. i am the leader with the lock

and the node status confirms it:

$ patronictl -d etcd://localhost:2379 list my-ha-cluster
+---------------+--------+-----------+--------+---------+-----------+
|    Cluster    | Member |    Host   |  Role  |  State  | Lag in MB |
+---------------+--------+-----------+--------+---------+-----------+
| my-ha-cluster |  pg-1  | 127.0.0.1 | Leader | running |       0.0 |
+---------------+--------+-----------+--------+---------+-----------+
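Note that pg-2 no longer appears, since we killed its Patroni. If you start it again, it should see that pg-1 now holds the leader lock and rejoin the cluster as a standby; with use_pg_rewind enabled in our configuration, Patroni can also bring the old leader’s data directory in line with the new timeline if needed:

patroni pg-2.yml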

Onward!

The aim of this post was to get you up and running with a Patroni cluster on which you can start experimenting. Patroni can do much more than what we just did with it.

You can read more about Patroni in the docs, or go through the source here. The original Compose article here also has some information about how the solution works.

Monitoring PostgreSQL With OpsDash

With our own product, OpsDash, you can quickly start monitoring your PostgreSQL servers, and get instant insight into key performance and health metrics including replication stats.

Here’s a default, pre-configured dashboard for PostgreSQL.

OpsDash strives to save you the tedious work of setting up a useful dashboard. The metrics you see here were carefully chosen to represent the most relevant health and performance indicators for a typical PostgreSQL instance.

OpsDash understands the streaming replication feature of PostgreSQL and displays per-slave replication status on the master:

The replication lag (as a length of time) is also visible on a slave’s dashboard. In the graph below (part of the dashboard of a slave), we can see that the slave could not catch up to the master’s changes for a while.

Sign up for a free 14-day trial of OpsDash SaaS today!