5th December 2008, 12:14 am

The post Sequences, streams, and segments offered an answer to the the question of what’s missing in the following box:

infinite finite discrete Stream Sequence continuous Function ???

I presented a simple type of function segments, whose representation contains a length (duration) and a function. This type implements most of the usual classes: Monoid , Functor , Zip , and Applicative , as well Comonad , but not Monad . It also implements a new type class, Segment , which generalizes the list functions length , take , and drop .

The function type is simple and useful in itself. I believe it can also serve as a semantic foundation for functional reactive programming (FRP), as I’ll explain in another post. However, the type has a serious performance problem that makes it impractical for some purposes, including as implementation of FRP.

Fortunately, we can solve the performance problem by adding a simple layer on top of function segments, to get what I’ll call “signals”. With this new layer, we have an efficient replacement for function segments that implements exactly the same interface with exactly the same semantics. Pleasantly, the class instances are defined fairly simply in terms of the corresponding instances on function segments.

You can download the code for this post.

Edits:

2008-12-06: dup [] = [] near the end (was [mempty] ).

near the end (was ). 2008-12-09: Fixed take and drop default definitions (thanks to sclv) and added point-free variant.

and default definitions (thanks to sclv) and added point-free variant. 2008-12-18: Fixed appl , thanks to sclv.

, thanks to sclv. 2011-08-18: Eliminated accidental emoticon in the definition of dup , thanks to anonymous.

The problem with function segments

The type of function segments is defined as follows:

data t :-># a = FS t (t -> a)

The domain of the function segment is from zero up to but not including the given length.

An efficiency problem becomes apparent when we look at the Monoid instance:

instance (Ord t, Num t) => Monoid (t :-># a) where mempty = FS 0 (error "sampling empty 't :-># a'") FS c f `mappend` FS d g = FS (c + d) ( t -> if t <= c then f t else g (t - c))

Concatenation ( mappend ) creates a new segment that chooses, for every domain value t , whether to use one function or another. If the second, then t must be shifted backward, since the function is being shifted forward.

This implementation would be fine if we use mappend just on simple segments. Once we get started, however, we’ll want to concatenate lots & lots of segments. In FRP, time-varying values go through many phases (segments) as time progresses. Each quantity is described by a single “behavior” (sometimes called a “signal”) with many, and often infinitely many, phases. Imagine an infinite tree of concatenations, which is typical for FRP behaviors. At every moment, one phase is active. Every sampling must recursively discover the active phase and the accumulated domain translation (from successive subtractions) to apply when sampling that phase. Quite commonly, concatenation trees get progressively deeper on the right (larger t values). In that case, sampling will get slower and slower with time.

I like to refer to these progressive slow-downs as “time leaks”. There is also a serious space leak, since all of the durations and functions that go into a composed segment will be retained.

Sequences of segments

The problem above can be solved with a simple representation change. Instead of combining functions into functions, just keep a list of simple function segments.

-- | Signal indexed by t with values of type a. newtype t :-> a = S { unS :: [t :-># a] }

I’ll restrict in function segments to be non-empty, to keep the rest of the implementation simple and efficient.

This new representation allows for efficient monotonic sampling of signals. As old segments are passed up, they can be dropped.

What does it mean?

There’s one central question for me in defining any data type: What does it mean?

The meaning I’ll take for signals is function segments. This interpretation is made precise in a function that maps from the type to the model (meaning). In this case, simply concatenate all of the function segments:

meaning :: (Ord t, Num t) => (t :-> a) -> (t :-># a) meaning = mconcat . unS

Specifying the meaning of a type gives users a working model, and it defines correctness of implementation. It also tells me what class instances to implement and tells users what instances to expect. If a type’s meaning implements a class then I want the type to as well. Moreover, the type’s intances have to agree with the model’s instances. I’ve described this latter principle in Simplifying semantics with type class morphisms and some other posts.

Higher-order wrappers

To keep the code below short and clear, I’ll use some functions for adding and removing the newtype wrappers. These higher-order function apply functions inside of (:->) representations:

inS :: ([s :-># a] -> [t :-># b]) -> ((s :-> a) -> (t :-> b)) inS2 :: ([s :-># a] -> [t :-># b] -> [u :-># c]) -> ((s :-> a) -> (t :-> b) -> (u :-> c))

Using the trick described in Prettier functions for wrapping and wrapping, the definitions are simpler than the types:

inS = result S . argument unS inS2 = result inS . argument unS

Functor

The Functor instance applies a given function inside the function segments inside the lists:

instance Functor ((:->) t) where fmap h (S ss) = S (fmap (fmap h) ss)

Or, in the style of Semantic editor combinators,

instance Functor ((:->) t) where fmap = inS . fmap . fmap

Why this definition? Because it is correct with respect to the semantic model, i.e., the meaning of fmap is fmap of the meaning, i.e.,

meaning . fmap h == fmap h . meaning

which is to say that the following diagram commutes:

Proof:

meaning . fmap h == {- fmap definition -} meaning . S . fmap (fmap h) . unS == {- meaning definition -} mconcat . unS . S . fmap (fmap h) . unS == {- unS and S are inverses -} mconcat . fmap (fmap h) . unS == {- fmap h distributes over mappend -} fmap h . mconcat . unS == {- meaning definition -} fmap . meaning

Applicative and Zip

Again, the meaning functions tells us what the Applicative instance has to mean. We only get to choose how to implement that meaning correctly. The Applicative morphism properties:

meaning (pure a) == pure a meaning (bf <*> bx) == meaning bf <*> meaning bx

Our Applicative instance has a definition similar in simplicity and style to the Functor instance, assuming a worker function appl for (<*>) :

instance (Ord t, Num t, Bounded t) => Applicative ((:->) t) where pure = S . pure . pure (<*>) = inS2 appl appl :: (Ord t, Num t, Bounded t) => [t :-># (a -> b)] -> [t :-># a] -> [t :-># b]

This worker function is somewhat intricate. At least my implementation of it is, and perhaps there’s a simpler one.

Again, the meaning functions tells us what the Applicative instance has to mean. We only get to choose how to implement that meaning correctly.

First, if either segment list runs out, the combination runs out (because the same is true for the meaning of signals).

[] `appl` _ = [] _ `appl` [] = []

If neither segment list is empty, open up the first segment. Split the longer segment into a prefix that matches the shorter segment, and combine the two segments with (<*>) (on function segments). Toss the left-over piece back in its list, and continue.

(fs:fss') `appl` (xs:xss') | fd == xd = (fs <*> xs ) : (fss' `appl` xss' ) | fd < xd = (fs <*> xs') : (fss' `appl` (xs'':xss')) | otherwise = (fs' <*> xs ) : ((fs'':fss') `appl` xss') where fd = length fs xd = length xs (fs',fs'') = splitAt xd fs (xs',xs'') = splitAt fd xs

A Zip instance is easy as always with applicative functors:

instance (Ord t, Num t, Bounded t) => Zip ((:->) t) where zip = liftA2 (,)

Monoid

The Monoid instance:

instance Monoid (t :-> a) where mempty = S [] S xss `mappend` S yss = S (xss ++ yss)

Correctness follows from properties of mconcat (as used in the meaning function).

We’re really just using the Monoid instance for the underlying representation, i.e.,

instance Monoid (t :-> a) where mempty = S mempty mappend = inS2 mappend

Segment

The Segment class has length , take and drop . It’s handy to include also null and splitAt , both modeled after their counterparts on lists.

The new & improved Segment class:

class Segment seg dur | seg -> dur where null :: seg -> Bool length :: seg -> dur take :: dur -> seg -> seg drop :: dur -> seg -> seg splitAt :: dur -> seg -> (seg,seg) -- Defaults: splitAt d s = (take d s, drop d s) take d s = fst (splitAt d s) drop d s = snd (splitAt d s)

If we wanted to require dur to be numeric, we could add a default for null as well. This default could be quite expensive in some cases. (In the style of Semantic editor combinators, take = (result.result) fst splitAt , and similarly for drop .)

The null and length definitions are simple, following from to properties of mconcat .

instance (Ord t, Num t) => Segment (t :-> a) t where null (S xss) = null xss length (S xss) = sum (length <$> xss) ...

The null case says that the signal is empty exactly when there are no function segments. This simple definition relies on our restriction to non-empty function segments. If we drop that restriction, we’d have to check that every segment is empty:

-- Alternative definition null (S xss) = all null xss

The length is just the sum of the lengths.

The tricky piece is splitAt (used to define both take and drop), which must assemble segments to satisfy the requested prefix length. The last segment used might have to get split into two, with one part going into the prefix and one to the suffix.

splitAt _ (S []) = (mempty,mempty) splitAt d b | d <= 0 = (mempty, b) splitAt d (S (xs:xss')) = case (d `compare` xd) of EQ -> (S [xs], S xss') LT -> let (xs',xs'') = splitAt d xs in (S [xs'], S (xs'':xss')) GT -> let (S uss, suffix) = splitAt (d-xd) (S xss') in (S (xs:uss), suffix) where xd = length xs

Copointed and Comonad

To extract an element from a signal, extract an element from its first function segment. Awkwardly, extraction will fail (produce ⊥/error) when the signal is empty.

instance Num t => Copointed ((:->) t) where extract (S []) = error "extract: empty S" extract (S (xs:_)) = extract xs

I’ve exploited our restriction to non-empty function segments. Otherwise, extract would have to skip past the empty ones:

-- Alternative definition instance Num t => Copointed ((:->) t) where extract (S []) = error "extract: empty S" extract (S (xs:xss')) | null xs = extract (S xss') | otherwise = extract xs

The error/⊥ in this definition is dicey, as is the one for function segments. If we allow the same abuse in order to define a list Copointed , we can get an alternative to the first definition that is prettier but gives a less helpful error message:

instance Num t => Copointed ((:->) t) where extract = extract . extract . unS

See the closing remarks for more about this diciness.

Finally, we have Comonad , with its duplicate method.

duplicate :: (t :-> a) -> (t :-> (t :-> a))

I get confused with wrapping and unwrapping, so let’s separate the definition into a packaging part and a content part.

instance (Ord t, Num t) => Comonad ((:->) t) where duplicate = fmap S . inS dup

with content part:

dup :: (Ord t, Num t) => [t :-># a] -> [t :-># [t :-># a]]

The helper function, dup , takes each function segment and prepends each of its tails onto the remaining list of segments.

If the segment list is empty, then it has only one tail, also empty.

dup [] = [] dup (xs:xss') = ((: xss') <$> duplicate xs) : dup xss'

Closing remarks

The definitions above use the function segment type only through its type class interfaces, and so can all of them can be generalized. Several definitions rely on the Segment instance, but otherwise, each method for the composite type relies on the corresponding method for the underlying segment type. (For instance, fmap uses fmap , (<*>) uses (<*>) , Segment uses Segment , etc.) This generality lets us replace function segments with more efficient representations, e.g., doing constant propagation, as in Reactive. We can also generalize from lists in the definitions above.

Even without concatenation, function segments can become expensive when drop is repeatedly applied, because function shifts accumulate (e.g., f . (+ 0.01) . (+ 0.01) .... ). A more drop -friendly representation for function segments would be a function and an offset. Successive drops would add offsets, and extract would always apply the function to its offset. This representation is similar to the FunArg comonad, mentioned in The Essence of Dataflow Programming (Section 5.2).

The list-of-segments representation enables efficient monotonic sampling, simply by dropping after each sample. A variation is to use a list zipper instead of a list. Then non-monotonic sampling will be efficient as long as successive samplings are for nearby domain values. Switching to multi-directional representation would lose the space efficiency of unidirectional representations. The latter work well with lazy evaluation, often running in constant space, because old values are recycled while new values are getting evaluated.

Still another variation is to use a binary tree instead of a list, to avoid the list append in the Monoid instance for (t :-> a) . A tree zipper would allow non-monotonic sampling. Sufficient care in data structure design could perhaps yield efficient random access.

There’s a tension between the Copointed and Monoid interfaces. Copointed has extract , and Monoid has mempty , so what is the value of extract mempty ? Given that Copointed is parameterized over arbitrary types, the only possible answer seems to be ⊥ (as in the use of error above). I don’t know if the comonad laws can all hold for possibly-empty function segments or for possibly-empty signals. I’m grateful to Edward Kmett for helping me understand this conflict. He suggested using two coordinated types, one possibly-empty (a monoid) and the other non-empty (a comonad). I’m curious to see whether that idea works out.

The post Sequences, streams, and segments offered an answer to the the question of what’s missing in the following box: infinitefinite discreteStream Sequence continuousFunction ??? I presented a simple type...