Previously, we saw how the principle of “making illegal states unrepresentable” allowed LH to easily enforce a key invariant in Joachim Breitner’s library for representing sets of integers as sorted lists of intervals.

However, Hs-to-coq let Breitner specify and verify that his code properly implemented a set library. Today, let’s see how LH’s new “type-level computation” abilities let us reason about the sets of values corresponding to intervals, while using the SMT solver to greatly reduce the overhead of proof.

(Click here to demo)

    {-@ LIQUID "--short-names"    @-}
    {-@ LIQUID "--exact-data-con" @-}
    {-@ LIQUID "--no-adt"         @-}
    {-@ LIQUID "--higherorder"    @-}
    {-@ LIQUID "--diff"           @-}
    {-@ LIQUID "--ple"            @-}

    module RangeSet where

    import Prelude hiding (min, max)
    import Language.Haskell.Liquid.NewProofCombinators

Intervals

Recall that the key idea is to represent sets of integers like

as ordered lists of intervals

where each pair (i, j) represents the set {i, i+1,..., j-1} .
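As a small illustration of this representation, here is a plain-Haskell sketch that expands a list of half-open intervals into the set it denotes. The names `Interval` and `toSet` are hypothetical helpers made up for this example; they are not part of Breitner's library.

```haskell
import qualified Data.Set as S

-- Hypothetical type for illustration: (i, j) stands for {i, i+1, ..., j-1}.
type Interval = (Int, Int)

-- Expand a list of half-open intervals into the set of integers it denotes.
toSet :: [Interval] -> S.Set Int
toSet ivs = S.fromList [ x | (i, j) <- ivs, x <- [i .. j - 1] ]
```

For instance, `toSet [(1, 3), (5, 6)]` contains exactly 1, 2 and 5, while an interval like `(4, 4)` contributes nothing.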

To verify that the implementation correctly implements a set data type, we need a way to

1. Specify the set of values being described, and
2. Establish some key properties of these sets.

Range-Sets: Semantics of Intervals

We can describe the set of values corresponding to (i.e. “the semantics of”) an interval i, j by importing the Data.Set library

    import qualified Data.Set as S

to write a function rng i j that defines the range-set i..j
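A minimal sketch of such a function follows. It is a reconstruction from the description in this post (the base and inductive cases discussed later), and the exact form of the LH annotations, including the termination metric, is an assumption. The `{-@ ... @-}` annotations are ordinary comments to GHC, so the block also compiles without LH.

```haskell
import qualified Data.Set as S

{-@ reflect rng @-}
{-@ rng :: i:Int -> j:Int -> S.Set Int / [j - i] @-}
rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)  -- peel off i, recurse on i+1..j
  | otherwise = S.empty                                   -- i >= j: the range is empty
```

With this definition, `rng 1 3` evaluates to the set containing 1 and 2, and `rng 3 3` is empty.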

The reflect rng tells LH that we are going to want to work with the Haskell function rng at the refinement-type level.

Equational Reasoning

To build up a little intuition about the above definition and how LH reasons about Sets, let’s write some simple unit proofs. For example, let’s check that 2 is indeed in the range-set rng 1 3, by writing a type signature

    {-@ test1 :: () -> { S.member 2 (rng 1 3) } @-}

Any implementation of the above type is a proof that 2 is indeed in rng 1 3. Notice that we can reuse the operators from Data.Set (here, S.member) to talk about set operations in the refinement logic. Let’s write this proof in an equational style:

The “proof” uses two library operators:

e1 === e2 is an implicit equality that checks that e1 is indeed equal to e2 after unfolding functions at most once, and returns a term that equals e1 and e2, and

e *** QED converts any term e into a proof.

The first two steps of test1 simply unfold rng, and the final step uses the SMT solver’s decision procedure for sets to check equalities over set operations like S.union, S.singleton and S.member.
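The shape of such an equational proof can be sketched as follows. This is a reconstruction, not necessarily the exact proof from the original post. To keep the block compilable with plain GHC, `===`, `***` and `QED` are defined locally as stand-ins (an assumption for illustration); in an LH development they would come from the proof-combinator module imported at the top of the file, and LH would check each step.

```haskell
import qualified Data.Set as S

-- Local stand-ins for the LH proof combinators (assumptions for this sketch).
infixl 3 ===
infixl 2 ***

(===) :: a -> a -> a
_ === y = y

data QED = QED

(***) :: a -> QED -> ()
_ *** QED = ()

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

{-@ test1 :: () -> { S.member 2 (rng 1 3) } @-}
test1 :: () -> ()
test1 ()
  =   S.member 2 (rng 1 3)
  === S.member 2 (S.union (S.singleton 1) (rng 2 3))                              -- unfold rng 1 3
  === S.member 2 (S.union (S.singleton 1) (S.union (S.singleton 2) (rng 3 3)))    -- unfold rng 2 3
  === True                                                                        -- set reasoning
  *** QED
```

At runtime the proof is just a function returning `()`; all of the interesting checking happens when LH verifies the refinement type.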

Reusing Proofs

Next, let’s check that:

We could do the proof by unfolding in the equational style. However, test1 already establishes that S.member 2 (rng 1 3) and we can reuse this fact using:

e1 ==? e2 ? pf is an explicit equality that checks that e1 equals e2 because of the extra facts asserted by the Proof named pf (in addition to unfolding functions at most once), and returns a term that equals both e1 and e2.

Proof by Logical Evaluation

Equational proofs like test1 and test2 often have long chains of calculations that can be tedious to spell out. Fortunately, we taught LH a new trick called Proof by Logical Evaluation (PLE) that optionally shifts the burden of performing those calculations onto the machine. For example, PLE completely automates the above proofs:

Be Warned! While automation is cool, it can be very helpful to first write out all the steps of an equational proof, at least while building up intuition.

Proof by Induction

At this point, we have enough tools to start proving some interesting facts about range-sets. For example, if x is outside the range i..j then it does not belong in rng i j :

    {-@ lem_mem :: i:_ -> j:_ -> x:{x < i || j <= x} ->
                     { not (S.member x (rng i j)) } / [j - i]
      @-}

We will prove the above “by induction”. A confession: I always had trouble understanding what exactly proof by induction really meant. Why was it OK to “do” induction on one thing but not another?

Induction is Recursion

Fortunately, with LH, induction is just recursion. That is,

1. We can recursively use the same theorem we are trying to prove, but
2. We must make sure that the recursive function/proof terminates.

The proof makes this clear:

There are two cases.

Base Case: As i >= j , we know rng i j is empty, so x cannot be in it.

Inductive Case: As i < j, we can unfold rng i j and then recursively call lem_mem (i+1) j to obtain the fact that x cannot be in i+1..j, which completes the proof.

LH automatically checks that the proof:

1. Accounts for all cases, as otherwise the function is not total, i.e. it is like the head function, which is only defined on non-empty lists. (Try deleting a case at the demo to see what happens.)
2. Terminates, as otherwise the induction is bogus, or in math-speak, not well-founded. We use the explicit termination metric / [j-i] as a hint to tell LH that in each recursive call, the size of the interval j-i shrinks and is always non-negative. LH checks that this is indeed the case, ensuring that we have a legit proof by induction.
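The two cases described above can be sketched in the equational style as follows. This is a reconstruction from the prose, so the exact intermediate steps are an assumption. The combinators `===`, `==?`, `?`, `***` and `QED` are again local stand-ins so that the block compiles with plain GHC.

```haskell
import qualified Data.Set as S

-- Local stand-ins for the LH proof combinators (assumptions for this sketch).
infixl 3 ===, ==?, ?
infixl 2 ***

(===), (==?) :: a -> a -> a
_ === y = y
_ ==? y = y

(?) :: a -> b -> a
x ? _ = x

data QED = QED

(***) :: a -> QED -> ()
_ *** QED = ()

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

{-@ lem_mem :: i:_ -> j:_ -> x:{x < i || j <= x}
            -> { not (S.member x (rng i j)) } / [j - i] @-}
lem_mem :: Int -> Int -> Int -> ()
lem_mem i j x
  | i >= j                                  -- base case: rng i j is empty
  =   not (S.member x (rng i j))
  === not (S.member x S.empty)
  === True
  *** QED
  | otherwise                               -- inductive case: i < j
  =   not (S.member x (rng i j))
  === not (S.member x (S.union (S.singleton i) (rng (i + 1) j)))
  ==? True ? lem_mem (i + 1) j x            -- appeal to the lemma on i+1..j
  *** QED
```

The recursive call is made on the interval i+1..j, whose size j-(i+1) is smaller than j-i, which is what the termination metric expresses.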

Proof by Evaluation

Once you get the hang of the above style, you get tired of spelling out all the details. Logical evaluation lets us eliminate all the boring calculational steps, leaving the essential bits: the recursive (inductive) skeleton
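A sketch of what that skeleton might look like is below, assuming the `--ple` flag is enabled as in the header of this file. The name `lem_mem_ple` is chosen here only to avoid clashing with `lem_mem`, and whether LH accepts exactly this form is not something the sketch itself can confirm.

```haskell
import qualified Data.Set as S

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

{-@ lem_mem_ple :: i:_ -> j:_ -> x:{x < i || j <= x}
                -> { not (S.member x (rng i j)) } / [j - i] @-}
lem_mem_ple :: Int -> Int -> Int -> ()
lem_mem_ple i j x
  | i >= j    = ()                        -- base case: rng i j is empty
  | otherwise = lem_mem_ple (i + 1) j x   -- inductive case: reuse the lemma on i+1..j
```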

The above is just lem_mem sans the (PLE-synthesized) intermediate equalities.

Disjointness

We say that two sets are disjoint if their intersection is empty :
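One way to write this predicate is sketched below. The `inline` annotation, which asks LH to make the definition available in refinements, is an assumption about how the predicate would be exposed to the logic.

```haskell
import qualified Data.Set as S

{-@ inline disjoint @-}
disjoint :: S.Set Int -> S.Set Int -> Bool
disjoint a b = S.intersection a b == S.empty
```

For example, `disjoint (S.fromList [1, 2]) (S.fromList [3])` holds, whereas the sets `{1, 2}` and `{2, 3}` are not disjoint because they share 2.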

Let’s prove that two intervals are disjoint if the first ends before the second begins:

    {-@ lem_disj :: i1:_ -> j1:_ -> i2:{j1 <= i2} -> j2:_ ->
                      { disjoint (rng i1 j1) (rng i2 j2) } / [j2 - i2]
      @-}

This proof goes “by induction” on the size of the second interval, i.e. j2-i2 :

Here, the operator pf1 &&& pf2 conjoins the two facts asserted by pf1 and pf2 .

Again, we can get PLE to do the boring calculations:
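A sketch of the PLE-style skeleton, reconstructed from the description above, might look like this. It assumes `--ple` is on, and it defines `&&&` locally as a stand-in for the library operator so that the block compiles with plain GHC. The supporting definitions are repeated to keep the block self-contained.

```haskell
import qualified Data.Set as S

-- Local stand-in for LH's proof conjunction (an assumption for this sketch).
(&&&) :: () -> () -> ()
() &&& () = ()

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

disjoint :: S.Set Int -> S.Set Int -> Bool
disjoint a b = S.intersection a b == S.empty

-- x lies outside i..j, so it is not a member of rng i j (see lem_mem above).
lem_mem :: Int -> Int -> Int -> ()
lem_mem i j x
  | i >= j    = ()
  | otherwise = lem_mem (i + 1) j x

{-@ lem_disj :: i1:_ -> j1:_ -> i2:{j1 <= i2} -> j2:_
             -> { disjoint (rng i1 j1) (rng i2 j2) } / [j2 - i2] @-}
lem_disj :: Int -> Int -> Int -> Int -> ()
lem_disj i1 j1 i2 j2
  | i2 >= j2  = ()                                -- second interval is empty
  | otherwise = lem_mem i1 j1 i2                  -- i2 is not in i1..j1, as j1 <= i2
            &&& lem_disj i1 j1 (i2 + 1) j2        -- the rest, i2+1..j2, is disjoint too
```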

Splitting Intervals

Finally, we can establish the splitting property of an interval i..j , that is, given some x that lies between i and j we can split i..j into i..x and x..j . We define a predicate that a set s can be split into a and b as:

We can now state and prove the splitting property as:

(We’re using PLE here quite aggressively, can you work out the equational proof?)
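For readers who want something concrete to start from, here is a reconstruction of the predicate and the lemma. The argument order `i`, `x`, `j`, the termination metric, and the proof body are all assumptions based on the prose, and `&&&` is a local stand-in so the block compiles with plain GHC.

```haskell
import qualified Data.Set as S

-- Local stand-in for LH's proof conjunction (an assumption for this sketch).
(&&&) :: () -> () -> ()
() &&& () = ()

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

-- s can be split into a and b: together they make up s, and they do not overlap.
{-@ inline split @-}
split :: S.Set Int -> S.Set Int -> S.Set Int -> Bool
split s a b = s == S.union a b && S.intersection a b == S.empty

lem_mem :: Int -> Int -> Int -> ()
lem_mem i j x
  | i >= j    = ()
  | otherwise = lem_mem (i + 1) j x

{-@ lem_split :: i:_ -> x:{i <= x} -> j:{x <= j}
              -> { split (rng i j) (rng i x) (rng x j) } / [x - i] @-}
lem_split :: Int -> Int -> Int -> ()
lem_split i x j
  | i >= x    = ()                          -- i == x: the left piece is empty
  | otherwise = lem_split (i + 1) x j       -- split the rest, i+1..j, at x
            &&& lem_mem x j i               -- i < x, so i is not in the right piece x..j
```

As a runtime spot check, `split (rng 1 6) (rng 1 3) (rng 3 6)` evaluates to `True`.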

Set Operations

The splitting abstraction is a wonderful hammer that lets us break higher-level proofs into the bite-sized pieces suitable for the SMT solver’s decision procedures.

Subset

An interval i1..j1 is enclosed by i2..j2 if i2 <= i1 < j1 <= j2. Let’s verify that the range-set of an interval is contained in that of an enclosing one.

    {-@ lem_sub :: i1:_ -> j1:{i1 < j1} ->
                   i2:_ -> j2:{i2 < j2 && i2 <= i1 && j1 <= j2} ->
                     { S.isSubsetOf (rng i1 j1) (rng i2 j2) }
      @-}

Here’s a “proof-by-picture”. We can split the larger interval i2..j2 into smaller pieces, i2..i1, i1..j1 and j1..j2, one of which is i1..j1, thereby completing the proof:





The intuition represented by the picture can be distilled into the following proof, which invokes lem_split to carve i2..j2 into the relevant sub-intervals:
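The block below is not that LH proof. It is a plain-Haskell runtime check of the same decomposition, with the hypothetical helper names `pieces` and `enclosedOk`. The two cuts it makes, at i1 and at j1, correspond to the two places where a proof would appeal to the splitting lemma.

```haskell
import qualified Data.Set as S

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

-- The three pieces from the picture: i2..i1, i1..j1 and j1..j2.
pieces :: Int -> Int -> Int -> Int -> [S.Set Int]
pieces i1 j1 i2 j2 = [rng i2 i1, rng i1 j1, rng j1 j2]

-- Runtime check (not a proof): the pieces rebuild i2..j2,
-- so the middle piece i1..j1 is contained in it.
enclosedOk :: Int -> Int -> Int -> Int -> Bool
enclosedOk i1 j1 i2 j2 =
  S.unions (pieces i1 j1 i2 j2) == rng i2 j2
    && S.isSubsetOf (rng i1 j1) (rng i2 j2)
```

For example, `enclosedOk 3 5 1 8` holds: the pieces 1..3, 3..5 and 5..8 together make up 1..8.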

Union

An interval i1..j1 overlaps i2..j2 if i1 <= j2 <= j1, that is, if the latter ends somewhere inside the former. The same splitting hammer lets us compute the union of two overlapping intervals simply by picking the interval defined by the endpoints.

    {-@ lem_union ::
          i1:_ -> j1:{i1 < j1} ->
          i2:_ -> j2:{i2 < j2 && i1 <= j2 && j2 <= j1} ->
            { rng (min i1 i2) j1 = S.union (rng i1 j1) (rng i2 j2) }
      @-}





The pictorial proof illustrates the two cases:

1. i1..j1 encloses i2..j2; here the union is just i1..j1,
2. i1..j1 only overlaps i2..j2; here the union is i2..j1, which can be split into i2..i1, i1..j2 and j2..j1, which together are exactly the union of the intervals i1..j1 and i2..j2.

Again, we render the picture into a formal proof as:

Intersection

Finally, we check that the intersection of two overlapping intervals is given by their inner-points.

    {-@ lem_intersect ::
          i1:_ -> j1:{i1 < j1} ->
          i2:_ -> j2:{i2 < j2 && i1 <= j2 && j2 <= j1} ->
            { rng (max i1 i2) j2 = S.intersection (rng i1 j1) (rng i2 j2) }
      @-}





We have the same two cases as for lem_union:

1. i1..j1 encloses i2..j2; here the intersection is just i2..j2,
2. i1..j1 only overlaps i2..j2; here the intersection is the middle segment i1..j2, which we obtain by splitting i1..j1 at j2, splitting i2..j2 at i1, and discarding the end segments, which do not belong in the intersection.
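Before attempting the formal proofs, it can help to spot-check the two statements at runtime. The following plain-Haskell sketch, with hypothetical helper names, tests the conclusions of lem_union and lem_intersect by brute force over all small intervals that satisfy their shared precondition. It uses the Prelude's `min` and `max` rather than the ones defined in this module, and it is a sanity check only, not a substitute for the LH proofs.

```haskell
import qualified Data.Set as S

rng :: Int -> Int -> S.Set Int
rng i j
  | i < j     = S.union (S.singleton i) (rng (i + 1) j)
  | otherwise = S.empty

-- Precondition shared by lem_union and lem_intersect.
overlaps :: Int -> Int -> Int -> Int -> Bool
overlaps i1 j1 i2 j2 = i1 < j1 && i2 < j2 && i1 <= j2 && j2 <= j1

-- Conclusion of lem_union.
unionOk :: Int -> Int -> Int -> Int -> Bool
unionOk i1 j1 i2 j2 = rng (min i1 i2) j1 == S.union (rng i1 j1) (rng i2 j2)

-- Conclusion of lem_intersect.
intersectOk :: Int -> Int -> Int -> Int -> Bool
intersectOk i1 j1 i2 j2 = rng (max i1 i2) j2 == S.intersection (rng i1 j1) (rng i2 j2)

-- Check both conclusions for every pair of overlapping intervals with endpoints in 0..n.
checkAll :: Int -> Bool
checkAll n =
  and [ unionOk i1 j1 i2 j2 && intersectOk i1 j1 i2 j2
      | i1 <- [0 .. n], j1 <- [0 .. n], i2 <- [0 .. n], j2 <- [0 .. n]
      , overlaps i1 j1 i2 j2 ]
```

A finite check like `checkAll 6` only covers small endpoints; the point of the lemmas is that LH establishes the property for all integers.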

Conclusions

Whew. That turned out a lot longer than I’d expected!

On the bright side, we saw how to:

1. Specify the semantics of range-sets,
2. Write equational proofs using plain Haskell code,
3. Avoid boring proof steps using PLE,
4. Verify key properties of operations on range-sets.

Next time we’ll finish the series by showing how to use the above lemmas to specify and verify the correctness of Breitner’s implementation.
