Loop invariants can give you coding superpowers

yourbasic.org

An invariant is a state­ment about program vari­ables that is true every time the execu­tion of the program reaches the invariant.

When struggling with a tricky 3-way partition algorithm as a student, a friend suggested a well-fitting invariant. It was like getting a secret superpower: suddenly I could write code that used to be impossible.

In this text we’ll look at loop invariants, invariants placed at the beginning of a loop. They are a simple yet powerful tool to help understand iterative code.

A well-chosen loop invariant is useful both when designing, testing, and modifying code. It also serves as documentation and can be the foundation of a correctness proof.

Loop invariant definition

A loop invariant is a statement about program variables that is true before and after each iteration of a loop.

A good loop invariant should satisfy three properties:

Initialization: The loop invariant must be true before the first execution of the loop. Maintenance: If the invariant is true before an iteration of the loop, it should be true also after the iteration. Termination: When the loop is terminated the invariant should tell us something useful, something that helps us understand the algorithm.

Let’s start with a simple piece of code.

Algorithm sum(n) Input: a non-negative integer n Output: the sum 1 + 2 + … + n sum ← 0 i ← 1 while i ≤ n // Invariant: sum = 1 + 2 + … + (i - 1) sum ← sum + i i ← i + 1 return sum

In this example:

The loop invariant holds initially since sum = 0 and i = 1 at this point. (The empty sum is zero.) Assuming the invariant holds before the i th iteration, it will be true also after this iteration since the loop adds i to the sum, and increments i by one. When the loop is just about to terminate, the invariant states that sum = 1 + 2 + … + n , just what’s needed for the algorithm to be correct.

In fact, the three steps above constitute an induction proof, which shows that sum = 1 + 2 + … + n when the program leaves the loop.

Designing with invariants

We can use a loop invariant during the design of an algorithm. Let’s start with this code skeleton.

Algorithm max(A) Input: an array A storing n integers Output: the largest element in A max ← … for i = 0 to n-1 // Invariant: max = max(A[0], A[1], …, A[i-1]) … return max

The invariant seems to be well chosen: it satisfies the termination property since the variable max = max(A[0], A[1], …, A[n-1]) when the loop terminates.

We can try to satisfy the initialization property by assigning a suitable start value to max . However, it’s not clear how to define the maximum value of an empty slice. We’ve found a potential bug even before writing any code! One solution is to change the contract of the function.

Algorithm max(A) Input: an array A storing n integers, where n ≥ 1 Output: the largest element in A max ← … for i = 0 to n-1 // Invariant: max = max(A[0], A[1], …, A[i-1]) … return max

Now it’s easy to solve the problem. If we let max = A[0] and start the loop with i = 1 , the invariant holds when we first enter the loop.

Let’s fill in the code. When doing so we can assume that the invariant holds at the beginning of the loop. The maintenance property states: if the invariant is true before an iteration of the loop, it should be true also after the iteration. Here is the finished code.

Algorithm max(A) Input: an array A storing n integers, where n ≥ 1 Output: the largest element in A max ← A[0] for i = 1 to n-1 // Invariant: max = max(A[0], A[1], …, A[i-1]) if A[i] > max max ← A[i] return max

Sorting

Algorithm sort(A) Input: an array A storing n integers Output: the same array with the elements sorted in ascending order for j = 1 to n-1 // Invariant: A[0..j-1] contains the same elements as // the original subarray A[0..j-1], but in sorted order. key ← A[j] i ← j - 1 while i ≥ 0 and A[i] > key A[i+1] ← A[i] i ← i - 1 A[i+1] ← key

To achieve the termination property, the invariant should state that the array is sorted when the for loop terminates. Check.

Let’s look at the initialization property. We can assume that the slice contains at least two elements since the loop wouldn’t execute otherwise. (Nor is this needed since a slice with at most one element is sorted already.) Before executing the loop, the invariant states “ A[0..0] contains the same elements as the original subarray A[0..0] , but in sorted order”. This is clearly true.

To verify the maintenance property, we need to take a closer look at the code. We can assume that the invariant holds before the loop, i.e. A[0..j-1] is sorted, and only need to check that the code inserts the element A[j] in the correct position. This is achieved by moving all elements A[j-1], A[j-2],… larger than A[j] one step to the right and then inserting A[j] in its proper position.

3-way partition

The 3-way partition used in Quicksort is an intricate algorithm that would be quite hard to implement without a well-chosen loop invariant.

// Partition reorders the elements of v so that: // - all elements in v[:low] are less than p, // - all elements in v[low:high] are equal to p, // - all elements in v[high:] are greater than p. func Partition(v []int, p int) (low, high int) { low, high = 0, len(v) for mid := 0; mid < high; { // Invariant: // - v[:low] < p // - v[low:mid] = p // - v[mid:high] are unknown // - v[high:] > p // // < p = p unknown > p // ----------------------------------------------- // v: | | |a | | // ----------------------------------------------- // ^ ^ ^ // low mid high switch a := v[mid]; { case a < p: v[mid] = v[low] v[low] = a low++ mid++ case a == p: mid++ default: // a > p v[mid] = v[high-1] v[high-1] = a high-- } } return }

Further reading

See Top 3 Quicksort optimizations for a complete implementation of Quicksort.

Share this page: