21st November 2008, 09:41 pm

Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for partial values and Implementing a type for partial values. Of particular interest is the ability to combine two partial values into one, combining the information present in each one.

More recently, I played with unambiguous choice, described in the previous post.

This post combines these two ideas. It describes how to work with partial values in Haskell natively, i.e., without using any special representation and without the use restrictions of unambiguous choice. I got inspired to try removing those restrictions during stimulating discussions with Thomas Davie, Russell O’Connor others in the #haskell gang.

You can download and play with the library shown described here. There are links and a bit more info on the lub wiki page.

Edits:

2008-11-22: Fixed link: Implementing a type for partial values

2008-11-22: Tidied introduction

Information, more or less

The meanings of programming languages are often defined via a technique called “denotational semantics”. In this style, one specifies a mathematical model, or semantic domain, for the meanings of utterances in the and then writes what looks like a recursive functional program that maps from syntax to semantic domain. That “program” defines the semantic function. Really, there’s one domain and one semantic function each syntactic category. In typed languages like Haskell, every type has an associated semantic domain.

One of the clever ideas of the theory of semantic domains (“domain theory”) is to place a partial ordering on values (domain members), based on information content. Values can not only be equal and unequal, they can also have more or less information content than each other. The value with the least information is at the bottom of this ordering, and so is called “bottom”, often written as “⊥”. In Haskell, ⊥ is the meaning of “ undefined “. For succinctness below, I’ll write “⊥” instead of “ undefined ” in Haskell code.

Many types have corresponding flat domains, meaning that the only values are either completely undefined completely defined. For instance, the Haskell type Integer is (semantically) flat. Its values are all either ⊥ or integers.

Structured types are not flat. For instance, the meaning of (i.e., the domain corresponding to) the Haskell type (Bool,Integer) contains five different kinds of values, as shown in the figure. Each arrow leads from a less-defined (less informative) value to a more-defined value.

To handle the diversity of Haskell types, define a class of types for which we know how to compute lubs.

class HasLub a where (⊔) :: a -> a -> a

The actual lub library, uses “ lub ” instead of “ (⊔) “.

The arguments must be consistent, i.e., must have a common upper bound. This precondition is not checked statically or dynamically, so the programmer must take care.

Flat types

For flat types, (⊔) is equivalent to unamb (see Functional concurrency with unambiguous choice), so

-- Flat types: instance HasLub () where (⊔) = unamb instance HasLub Bool where (⊔) = unamb instance HasLub Char where (⊔) = unamb instance HasLub Integer where (⊔) = unamb instance HasLub Float where (⊔) = unamb ...

Pairs

Too strict

We can handle pairs easily enough:

instance (HasLub a, HasLub b) => HasLub (a,b) where (a,b) ⊔ (a',b') = (a ⊔ a', b ⊔ b')

but we’d be wrong! This definition is too strict, e.g.,

(1,2) ⊔ ⊥ == ⊥

when we’d want (1,2) .

Too lazy

No problem. Just make the patterns lazy:

instance (HasLub a, HasLub b) => HasLub (a,b) where ~(a,b) ⊔ ~(a',b') = (a ⊔ a', b ⊔ b')

Oops — wrong again. This definition is too lazy:

⊥ ⊔ ⊥ == (⊥,⊥)

when we’d want ⊥.

Almost

We can fix the too-lazy version by checking that one of the arguments is non-bottom, which is what seq does. Which one to we check? The one that isn’t ⊥ , or either one if they’re both defined. Our friend unamb can manage this task:

instance (HasLub a, HasLub b) => HasLub (a,b) where ~p@(a,b) ⊔ ~p'@(a',b') = (p `unamb` p') `seq` (a ⊔ a', b ⊔ b')

But there’s a catch (which I realized only now): p and p' may not satisfy the precondition on unamb . (If they did, we could use (⊔) = unamb .)

Just right

To fix this last problem, check whether each pair is defined. We can’t know which to check first, so test concurrently, using unamb .

instance (HasLub a, HasLub b) => HasLub (a,b) where ~p@(a,b) ⊔ ~p'@(a',b') = (definedP p `unamb` definedP p') `seq` (a ⊔ a', b ⊔ b')

where

definedP :: (a,b) -> Bool definedP (_,_) = True

The implicit second case in that definition is definedP ⊥ = ⊥ .

Some examples:

*Data.Lub> (⊥,False) ⊔ (True,⊥) (True,False) *Data.Lub> (⊥,(⊥,False)) ⊔ ((),(⊥,⊥)) ⊔ (⊥,(True,⊥)) ((),(True,False))

Sums

For sums, we’ll discriminate between lefts and rights, with the following assistants:

isL :: Either a b -> Bool isL = either (const True) (const False) outL :: Either a b -> a outL = either id (error "outL on Right") outR :: Either a b -> b outR = either (error "outR on Left") id

The (⊔) method for sums unwraps its arguments as a Left or as a Right , after checking which kind of arguments it gets. We can’t know which one to check first, so check concurrently:

instance (HasLub a, HasLub b) => HasLub (Either a b) where s ⊔ s' = if isL s `unamb` isL s' then Left (outL s ⊔ outL s') else Right (outR s ⊔ outR s')

Exercise: Why is the use of unamb here legal?

Functions

A function f is said to be less (or equally) defined than a function g when f is less (or equally) defined g for every argument value. Consequently,

instance HasLub b => HasLub (a -> b) where f ⊔ g = a -> f a ⊔ g a

More succinctly:

instance HasLub b => HasLub (a -> b) where (⊔) = liftA2 (⊔)

Other types

We’ve already handled the unit type () , other flat types, pairs, sums, and functions. Algebraic data types can be modeled via this standard set, with a technique from generic programming. Define methods that map to and from a type in the standard set.

class HasRepr t r | t -> r where -- Required: unrepr . repr == id repr :: t -> r -- to repr unrepr :: r -> t -- from repr

We can implement (⊔) on a type by performing a (⊔) on that type’s standard representation, as follows:

repLub :: (HasRepr a v, HasLub v) => a -> a -> a a `repLub` a' = unrepr (repr a ⊔ repr a') instance (HasRepr t v, HasLub v) => HasLub t where (⊔) = repLub

However, this rule would overlap with all other HasLub instances, because Haskell instance selection is based only on the head of an instance definition, i.e., the part after the “ => “. Instead, we’ll define a HasLub instance per HasRepr instance.

For instance, here are encodings for Maybe and for lists:

instance HasRepr (Maybe a) (Either () a) where repr Nothing = (Left ()) repr (Just a) = (Right a) unrepr (Left ()) = Nothing unrepr (Right a) = (Just a) instance HasRepr [a] (Either () (a,[a])) where repr [] = (Left ()) repr (a:as) = (Right (a,as)) unrepr (Left ()) = [] unrepr (Right (a,as)) = (a:as)

And corresponding HasLub instances:

instance HasLub a => HasLub (Maybe a) where (⊔) = repLub instance HasLub a => HasLub [a] where (⊔) = repLub

Testing:

*Data.Lub> [1,⊥,2] ⊔ [⊥,3,2] [1,3,2]

Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for partial values and Implementing a type for...