As part of recent type-refactoring efforts in Haskell, a discussion about adding Semigroup as a parent class of Monoid has been bouncing around the mailing list. From a theoretical point of view, this is a great idea: it is more flexible than the current approach that would allow for greater expressibility.

From a practical point of view, however, I am inclined to oppose it. Not because it is in itself a bad change—it's a very reasonable change that has advantages for new code—but because I have, in the past, had to update large systems written in Haskell after GHC updates, and therefore I know that this kind of change has a cost. The Applicative-Monad changes seem to have made way for the Foldable-Traversable Prelude, but those have in turn inspired a number of suggestions for modifications to the Haskell standard library, each one of which, taken on its own, is reasonable, but taken as a mass, mean much more work for maintainers. This is especially true if we continue on this path of making minor refactorings at each release: each year a project will need changes, or it will quickly bit-rot beyond utility.

Default Superclass Instances

There is, however, an alternative I would like to discuss—one which has also been brought up on the mailing list, but which I'd like to provide more concrete motivation for. Some of these changes—especially the Semigroup / Monoid split—seem to involve taking the functionality of a class and splitting it into multiple smaller classes. For example, we can think of the Semigroup / Monoid change as converting

class Monoid m where mempty :: m mappend :: m -> m -> m

into

class Semigroup m where mappend :: m -> m -> m class Semigroup m => Monoid m where mempty :: m

Something that has been proposed before (in a few different forms) and which I suggest be more actively considered if changes like these are to become common is to allow superclass instances to be declared within a subclass declaration. This would allow you to write a single instance declaration for a class, and in that body also include implementations for methods belong to a superclass of that class. As an example, perhaps we could be able to write :

newtype MyId a = MyId a instance Monad MyId where -- Functor method fmap f (MyId x) = MyId (f x) -- Applicative methods pure = MyId MyId f <*> MyId x = MyId (f x) -- Monad methods return = MyId MyId x >>= f = f x

For the Monoid / Semigroup proposal, this would mean that any Monoid instances that exist would continue to work unchanged, but new instances could (optionally) split apart their declarations. Under this proposal, either of these would be acceptable:

class Semigroup m where mappend :: m -> m -> m class Semigroup m => Monoid m where mempty :: m -- combined `instance` declarations: instance Monoid [a] where mempty = [] mappend = (++)

or, equivalently,

class Semigroup m where mappend :: m -> m -> m class Semigroup m => Monoid m where mempty :: m -- split apart `instance` declarations instance Semigroup [a] where mappend = (++) instance Monoid [a] where mempty = []

And because the Monoid declaration for [] is already written like the former, we can make the Semigroup / Monoid split without having to rewrite the instance declarations!

Exciting Additional Opportunities

Because this lowers the cost of updating for new versions, various other useful changes might be considered that would otherwise involve far too much breakage. For example, we could consider splitting Num apart into small constituent parts. With Default Superclass Instances, we could refactor Num into the following without changing any instance declarations:

class Add a where (+) :: a -> a -> a class Sub a where (-) :: a -> a -> a class Add a => Zero a where zero :: a class Mul a where (*) :: a -> a -> a class Mul a => One a where one :: a class (Zero a, One a) => FromInteger a where fromInteger :: Integer -> a instance Zero a where zero = fromInteger 0 instance One a where one = fromInteger 1 class Signed a where negate :: a -> a abs :: a -> a signum :: a -> a class ( Eq a, Show a, Add a, Sub a , Mul a, FromInteger a, Signed a) => Num a where

which would allow certain numeric types to only implement a subset of the relevant operations:

data Nat = Zero | Succ Nat instance Add Nat where Z + y = y S x + y = S (x + y) {- et cetera --- but no implementations for e.g. Signed, - which is not meaningful for `Nat`! -}

and also allow current Num functions to have a looser set of constraints than they do at present:

sum :: (Zero a, Add a) => [a] -> a sum (x:xs) = x + sum xs sum [] = zero prod :: (One a, Mul a) => [a] -> a prod (x:xs) = x * prod xs prod [] = one

We could also consider splitting Arrow into distinct components, again without having to change any instance declarations:

class Category a => Pairwise a where first :: a b c -> a (b, d) (c, d) second :: a b c -> a (d, b) (d, c) (***) :: a b c -> a b' c' -> a (b, b') (c, c') (&&&) :: a b c -> a b c' -> a b (c, c') class Pairwise a => Arrow a where arr :: (b -> c) -> a b c

or even (dare I dream) splitting Bits into something that is not a 22-method monstrosity!

Potential Drawbacks

On the other hand, this proposal does have some down-sides:

Grepping for Instance Declarations

Right now, I can often find an instance declaration for a type T by grepping for instance C T (modulo some line noise) whereas with this change, it's possible that there is no declaration for instance C T , because all of C 's methods are declared by a subclass C' instead. The compiler ought to be able to deal with this without problem, which means that tools like Haddock documentation should somewhat alleviate this problem, but a user might be confused.

Introduces New Possible Errors

The declarations below are of course nonsensical, and would be rejected by the compiler—-but the fact that this change would introduce new failure conditions at all is a drawback of the proposal.

instance Semigroup Int where mappend = (+) instance Monoid Int where -- this conflicts with an existing declaration mappend = (*) mempty = 1

A Pragma-Less Extension

In order to be really useful, we'd want to use this without a LANGUAGE pragma. After all, we're arguing for it on the basis of preserving backwards-compatibility, but that argument is much less compelling if we still have to change the source files to make use of it! On the other hand, that means we'd have included a GHC extension that takes effect despite not being enabled, which is also a worrying precedent!

It still might be a useful extension even if it had to be enabled by a LANGUAGE pragma, as it is easier to add said pragma to a source file than to manually break apart dozens of instance declarations, but it makes this feature less compelling in general.

In Conclusion

As I said before, my primary objection to typeclass changes like those above—-even if they are good changes for newly written code—-is that they break existing code. I don't want to have to modify a handful of miscellaneous instance declarations on a yearly basis as people discover new levels of abstraction or become dissatisfied with current choices, especially as those changes will grow more extensive as I build more projects in Haskell! But with an extension like this, we could grow the typeclass ecosystem gradually and fix what we see as past warts while maintaining backwards-compatibility, which would be a very powerful tool to have.