In Part Two, we started designing the PostgreSQL controller. Today, we pick up where we left off and dig into the details of how the control layer (controller and sidecars included) keeps track of the state of the PostgreSQL application.

The real-world state of the PostgreSQL application has many components:

The current state (status) of each PostgreSQL instance: Is it healthy? Is it the primary (master) or a standby (slave)? Is it still being configured?

The desired state (spec) of each PostgreSQL instance: Is it supposed to be the primary? Where should it stream WAL (write-ahead log) from?

The current state of the application as a whole: Which instances belong to this PostgreSQL application? Where are backups written?

The desired state of the application as a whole: How many instances should there be? Where do backups go?
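To make this more concrete, here is a minimal sketch of how these four kinds of state might be encoded, written as Go types in the style of a custom resource's spec and status. The type and field names are placeholders of mine, not part of any particular operator's API.

```go
// Hypothetical types sketching how the four kinds of state might be
// encoded in the central store. Names are illustrative only.
package state

// InstanceSpec is the desired state of one PostgreSQL instance.
type InstanceSpec struct {
	Role          string // "primary" or "standby"
	ReplicateFrom string // host to stream WAL from, if a standby
}

// InstanceStatus is the current state of one PostgreSQL instance.
type InstanceStatus struct {
	Healthy bool
	Role    string // role the instance is actually playing right now
	Phase   string // e.g. "NotSetUp", "DoneSettingUp"
}

// ClusterSpec is the desired state of the application as a whole.
type ClusterSpec struct {
	Instances      int    // how many instances there should be
	BackupLocation string // where backups should go
}

// ClusterStatus is the current state of the application as a whole.
type ClusterStatus struct {
	Instances      []string // instances that belong to this application
	BackupLocation string   // where backups are actually being written
}
```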

This real-world state is encoded in the central store, which the control layer uses to make decisions about what it needs to do. Each element of the control layer is responsible for maintaining part of the encoded state:

The application administrator, be it a person or another application, is responsible for setting the desired state of the application as a whole.

The controller maintains the current state of the application as a whole and the desired states of the individual PostgreSQL instances.

Each PostgreSQL instance is controlled by a sidecar, which is responsible for maintaining the instance's current state.

PostgreSQL Instance State

In Part One, we walked through some of the behaviors PostgreSQL instances must implement. For example, when the primary starts, it does the following:

1. Initialize the database, possibly from a preexisting backup.
2. Configure continuous backup.
3. Configure streaming replication, including the authentication mechanism.

How we encode the current state of an instance depends on whether we bundle the three actions above as a single step. If it’s a single step, the state transition might look like this:

Not set up → Done setting up

With this encoding, the controller can’t determine which setup step the instance is in. Depending on the overall design, this might be fine. If the controller needs more visibility into an instance’s setup process, the state transitions might look like this instead:

Not set up → Database initialized → Backup configured → Done setting up

In both these designs, the desired state set by the controller is this:

Done setting up

The instance sidecar does its best to move its current state to the desired state.
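As a rough sketch of that loop, assuming the multi-step encoding above, the sidecar might look something like this in Go. The state names mirror the transitions above; the setup helpers and the central-store write are placeholders of mine.

```go
// Sketch of a sidecar loop that walks the instance through its setup
// states until it reaches the desired state.
package main

import (
	"log"
	"time"
)

type InstanceState string

const (
	NotSetUp            InstanceState = "NotSetUp"
	DatabaseInitialized InstanceState = "DatabaseInitialized"
	BackupConfigured    InstanceState = "BackupConfigured"
	DoneSettingUp       InstanceState = "DoneSettingUp"
)

// reconcile performs at most one setup step and returns the new current state.
func reconcile(current InstanceState) InstanceState {
	switch current {
	case NotSetUp:
		initializeDatabase() // possibly from a preexisting backup
		return DatabaseInitialized
	case DatabaseInitialized:
		configureContinuousBackup()
		return BackupConfigured
	case BackupConfigured:
		configureStreamingReplication()
		return DoneSettingUp
	default:
		return current
	}
}

func main() {
	current, desired := NotSetUp, DoneSettingUp
	for current != desired {
		current = reconcile(current)
		writeCurrentStateToCentralStore(current) // report progress to the controller
		time.Sleep(time.Second)
	}
	log.Printf("instance reached desired state %q", desired)
}

// Placeholder implementations.
func initializeDatabase()                             {}
func configureContinuousBackup()                      {}
func configureStreamingReplication()                  {}
func writeCurrentStateToCentralStore(s InstanceState) {}
```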

Instance Failure

So far, I’ve claimed that the PostgreSQL instance sidecar is responsible for maintaining the current state encoded in the central store. This isn’t entirely true.

When a PostgreSQL instance Pod fails (because its Node failed, it was evicted, or it simply crashed), the current state in the central store should say that the instance has failed. But the sidecar is part of the Pod, so if the Pod dies, there's no running sidecar left to update the central store.

The solution is to use an external agent to update the central store when an instance fails. One way to do this is by implementing a Pod health check. Then the PostgreSQL controller can inspect the Pod status for each of its instances to see which ones are failing. In this model, Kubernetes and the instance sidecar are jointly responsible for maintaining an instance’s current state in the central store.
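As an illustration of this joint-responsibility model, a controller built on client-go could classify an instance's Pod roughly as follows. The failure criteria here are deliberately simplified and are my own, not a prescribed policy.

```go
// Sketch: decide whether an instance Pod should be considered failed
// by looking at its status. Criteria are simplified for illustration.
package controller

import (
	corev1 "k8s.io/api/core/v1"
)

// instanceFailed reports whether the instance's Pod looks dead to the
// controller. A missing Pod (nil) also counts as failed.
func instanceFailed(pod *corev1.Pod) bool {
	if pod == nil {
		return true
	}
	if pod.Status.Phase == corev1.PodFailed {
		return true
	}
	// A Pod that is running but not Ready (e.g. its health check is
	// failing) is also treated as failed here.
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status != corev1.ConditionTrue
		}
	}
	return true // no Ready condition reported yet
}
```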

PostgreSQL Whole-Application State

The previous section was about instance-level state. Before we dive into application-level state, I want to address a key difference between the controller and an instance sidecar.

When an instance dies, it stays dead. When the controller dies, a new controller takes its place.

Earlier, we had a choice between encoding instance setup as either a single state transition from Not done to Done or multiple transitions from Not done to Step 1 done to Step 2 done to Done. This choice exists because the sidecar doesn’t have to rely on the instance’s current state in the central store. It can simply hold the current state in memory.

On the other hand, the controller does rely on the application’s current state in the central store. If the controller dies, its replacement must be able to continue where it left off. For example, consider the failover process:

1. State: Healthy (1 primary, 2 standby) → Action: None
2. State: Unhealthy (0 primary, 2 standby) → Action: Set one standby’s desired state to “primary”.
3. State: Healthy (1 primary, 1 standby) → Action: Create a new standby.
4. State: Healthy (1 primary, 2 standby) → Action: None

Note that each current state corresponds to a specific action (given a particular desired state).
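Here's a minimal sketch of that state-to-action mapping in Go, assuming a desired state of one primary and two standbys. The types and helper functions are illustrative placeholders, not part of any real controller.

```go
// Sketch of the controller's decision logic: each observed cluster
// state maps to one action, given a desired state of one primary and
// two standbys.
package controller

type ClusterState struct {
	Primaries int
	Standbys  int
}

// Action is something the controller does to move the application
// toward its desired state.
type Action func() error

func decide(state ClusterState) Action {
	switch {
	case state.Primaries == 0 && state.Standbys > 0:
		// Failover: set one standby's desired state to "primary".
		return func() error { return setDesiredRole("standby-1", "primary") }
	case state.Primaries == 1 && state.Standbys < 2:
		// Healthy but short-handed: create a new standby.
		return func() error { return createStandby() }
	default:
		return nil // healthy: nothing to do
	}
}

// Placeholder central-store operations.
func setDesiredRole(instance, role string) error { return nil }
func createStandby() error                       { return nil }
```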

If the controller dies in step 2, what does the replacement controller do? Did the previous controller perform the Action or not? If it’s safe to retry the action (i.e. the action is idempotent), then the replacement controller can simply perform the action regardless of whether the original controller did. Otherwise, the set of possible states must distinguish those two scenarios.

For step 2 above, the Action is simply to write a desired state in the central store. This action can be idempotent, since the replacement controller can write the same desired state as its predecessor. The alternative is to add an intermediate State between steps 2 and 3:

2. State: Unhealthy (0 primary, 2 standby) → Action: Set one standby’s desired state to “primary”.
2.5. State: Unhealthy (0 primary, 2 standby), with a pending promotion for standby #1 → Action: None
3. State: Healthy (1 primary, 1 standby) → Action: Create a new standby.

Note that this approach requires the Action and the state change to happen together atomically. Otherwise, the controller could fail after step 2’s Action and before updating the state to step 2.5. The replacement controller would think that step 2’s Action hadn’t happened yet.

In our example, the Action is to set an instance’s desired state. If the application’s current state includes every instance’s desired state, then the Action and the state change do occur together atomically. For other kinds of Actions, this won’t hold; it’s better to make sure those actions are idempotent.
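For instance, if the pending promotion and the instances' desired roles live in the same stored object, a single write covers both. This sketch assumes a hypothetical Store interface and ignores the write-conflict handling a real central store would need.

```go
// Sketch: when the pending-promotion marker and the instance's desired
// role live in the same stored object, one write updates both
// atomically. The types and the store API are illustrative.
package controller

// ApplicationState is the single object the controller reads and writes.
type ApplicationState struct {
	PendingPromotion string            // instance awaiting promotion, "" if none
	DesiredRoles     map[string]string // desired role per instance
}

// Store is a stand-in for the central store.
type Store interface {
	Get() (*ApplicationState, error)
	Put(*ApplicationState) error
}

// promote records the intermediate "pending promotion" state and the
// instance's new desired role in one update to the central store.
func promote(store Store, instance string) error {
	app, err := store.Get()
	if err != nil {
		return err
	}
	app.DesiredRoles[instance] = "primary"
	app.PendingPromotion = instance
	return store.Put(app) // both changes land together, or not at all
}
```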

What’s left?

So far, we’ve covered in broad strokes how to coordinate the lifecycle of PostgreSQL instances in a highly available PostgreSQL application on Kubernetes. If you’d like more detail in a particular area, let me know in the comments! There’s a lot to cover, so help me choose the next part to write about.

Making sure PostgreSQL is running isn’t the end of our concerns, though. We still have to take care of observability, load balancing, managing the backup archive, and more. Stay tuned.