During AusHac, I worked on the container hierarchy I discussed in my previous post, which culminated in the initial release of container-classes. I had initially (and naively) thought I would have been able to whip something like this together on the Friday afternoon and spend the rest of the weekend working on graph libraries; in the end I just managed to release an initial draft version before we had to pack up on Sunday.

Now, I’m not saying this current setup is perfect; it’s basically a direct copy of all list-oriented functions from the Prelude along with a couple of functions from Data.List split into a generic Container class, a Sequence class for containers with a linear structure and Stream for infinite Sequence s (i.e. lists and similar structures: those for which it makes sense to define a function like repeat).

First of all, here are a couple of design decisions I made with this library:

I want to be able to consider types with kind * ; as such, most pre-existing classes are of no use.

; as such, most pre-existing classes are of no use. Even when hacking together support for types of kind * -> * for mapping functions, etc. I couldn’t use Functor as it doesn’t let you constrain the type of the values being stored (for Sets, etc.).

for mapping functions, etc. I couldn’t use as it doesn’t let you constrain the type of the values being stored (for Sets, etc.). To be able to have restrictions, we need to be able to specify the value type as part of the class definition. This means the usage of either MPTCs+fundeps or an Associated Type. I was initially using the latter, but due to the current lack of superclass constraints making the type signatures much uglier and longer, I switched to using MPTCs+fundeps instead.

Type signatures should be as short/nice as possible.

Provide as many default implementations as possible, and make those as efficient as possible.

However, with these design decisions there are some considerations I have to make:

How should I split up the various functions into type-classes? e.g. does it make sense to re-define standard classes like Foldable so that they’ll work with values of kind * (where possible) and if necessary have a constrained value type?

so that they’ll work with values of kind (where possible) and if necessary have a constrained value type? At the moment, the main constraints are all inherited from the Container class; if I have lots of smaller classes, is there a nicer way of abstracting out the constraint without duplicating it everywhere? rmonad has the Suitable, but in practice this seems to mean the addition of extra Suitable f a constraints on every function in addition; maybe this is because Suitable isn’t a superclass of the other classes though.

class; if I have lots of smaller classes, is there a nicer way of abstracting out the constraint without duplicating it everywhere? rmonad has the Suitable, but in practice this seems to mean the addition of extra constraints on every function in addition; maybe this is because isn’t a superclass of the other classes though. I’ve tried to define the default definitions of the various class methods with the eventual goal of implementing and using the foldr/build rule, but I’m not sure how to properly implement such a rule, let alone how well it’s going to work in practice: If someone overrides the defaults to use custom/optimised versions (e.g. all the pre-defined list functions), then the inter-datatype optimisations will no longer be present. As people may use more optimised variants of various class methods, any data type that extends another (using a newtype, etc.) will have to explicitly define each class instance rather than relying on the default definitions (if they want to keep using the optimised variants). Cross-container optimisation could change some fundamental assumptions: e.g. going from a list to a Seq and then back to a list will typically preserve the value ordering; however if we replace the Seq with a Set then we’d expect the ordering in the final list to have changed (and be sorted); if I implement the foldr/build rule would it interfere with this ordering by removing the intermediate Set and the fact that it will insert values in sorted order?

Benchmarking: is there a nice way of doing per-class benchmarking to be able to compare the performance of different data structures? For example, being able to compare how long it takes to insert 100 random values into a set (by consing) compared to inserting those same values into a Set .

So, that seems to be the battle I’ve taken upon myself. I’d greatly appreciate any pointers people can give me either as comments here or by emailing me.

Oh, and a reminder: I’m going to stop collecting responses for my survey on what to call the “new FGL” library at about 12 PM UTC this Friday. I’ve already got about 60 votes; more are welcome.