Stop (ab)using CPP in Haskell sources



Yuras Shumovich February 1, 2015

What is wrong with CPP

CPP is the C preprocessor, yet it is commonly used in Haskell. That leads to a number of issues.

It can mess with Haskell code.

CPP doesn’t understand Haskell code; it assumes it is processing C. It is free to remove whitespace that is insignificant for C but significant for Haskell, expand macros inside Haskell comments and strings, or mess with identifiers that contain ' or #.

It leads to unnecessary recompilation.

Every time you change your .cabal file, e.g. add a new module or update dependencies, cabal regenerates the cabal_macros.h file. The recompilation checker then pessimistically decides to recompile all modules that have CPP enabled.

It makes automatic code analysis and transformation harder.

If you use hlint or HaRe, you probably know what I mean.

When abused, it makes code harder to read.

It is not rare to see code interleaved with ifdefs that specify different behaviour for different platforms or library versions. Sometimes that is unavoidable, though.

Most of the time CPP can be avoided or minimized. The most important tool here is abstraction.

Abstract over specific details

This is not Haskell specific: abstraction is widely used to minimize CPP in C too. When you need different behaviour depending on the current platform or the version of some dependency, try to abstract over the difference instead of inlining platform specific code.

At first glance it may look impossible to do. In such cases I usually simply duplicate the code and then refactor it to reduce the duplication.

Sometimes it is convenient to start with an umbrella module that provides a unified interface for the rest of the program, plus a number of platform specific implementations. Note: you don’t need CPP to select a specific module; cabal lets you conditionally include modules based on the platform or other conditions.
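To make the idea of an interface with several implementations a bit more concrete, here is a minimal sketch. It is not taken from any real package: the Platform record, its fields and joinPath are hypothetical names invented for illustration. It also swaps in a different selection mechanism, a runtime check of System.Info.os from base, which only works when every implementation compiles on every platform. When an implementation depends on platform specific packages, the selection has to happen at build time instead, e.g. in the cabal file.

```haskell
import Data.List (intercalate)
import System.Info (os)

-- | Hypothetical interface: everything the rest of the program needs
-- to know about the platform lives in this record.
data Platform = Platform
  { platformName  :: String
  , pathSeparator :: Char
  }

-- | One implementation per platform, written as plain values.
posixPlatform :: Platform
posixPlatform = Platform { platformName = "posix", pathSeparator = '/' }

windowsPlatform :: Platform
windowsPlatform = Platform { platformName = "windows", pathSeparator = '\\' }

-- | The single place that decides which implementation is used.
-- GHC reports Windows as "mingw32" in System.Info.os.
currentPlatform :: Platform
currentPlatform
  | os == "mingw32" = windowsPlatform
  | otherwise       = posixPlatform

-- | Common code is written once, against the interface only.
joinPath :: Platform -> [String] -> String
joinPath p = intercalate [pathSeparator p]
```

The rest of the program would call joinPath currentPlatform and never mention a platform name, so the platform specific details stay in one small place.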

Example: fsnotify

An excellent example of such an approach is the fsnotify package. It defines specific implementations for linux, osx and win32, and one umbrella module. A number of other modules contain common code, so duplication is really minimal.

Note that CPP is enabled only in the umbrella module, for two reasons. First of all, it is used to import the specific implementation:

#ifdef OS_Linux
import System.FSNotify.Linux
#else
#  ifdef OS_Win32
import System.FSNotify.Win32
#  else
#    ifdef OS_Mac
import System.FSNotify.OSX
#    else
type NativeManager = PollManager
#    endif
#  endif
#endif

That can be avoided too. To do that, we can give the platform specific modules the same name but move them into separate directories: linux, osx and win32. Then we manipulate the hs-source-dirs field in the cabal file to select the correct implementation. (Add the other implementations to extra-source-files so that cabal sdist copies them into the tarball.)

-- in System.FSNotify:
import System.FSNotify.Platform

-- in fsnotify.cabal:
extra-source-files:
  linux/System/FSNotify/Platform.hs
  osx/System/FSNotify/Platform.hs
  win32/System/FSNotify/Platform.hs

hs-source-dirs: src
if os(linux)
  hs-source-dirs: linux
if os(darwin)
  hs-source-dirs: osx
if os(windows)
  hs-source-dirs: win32

The other use of CPP is to define forkFinally, which is missing from older versions of base:

#if !MIN_VERSION_base(4,6,0)
forkFinally :: IO a -> (Either SomeException a -> IO ()) -> IO ThreadId
forkFinally action and_then =
  mask $ \restore ->
    forkIO $ try (restore action) >>= and_then
#endif

The same technique can be used to avoid CPP here. Personally, I prefer to hide such snippets in a custom prelude and not bother with hs-source-dirs.
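As a rough sketch of what such a custom prelude could contain, the module below always defines its own copy of the function and imports everything else through explicit import lists, so it compiles the same way whether or not base already exports forkFinally. This is only one possible reading of the idea, not how fsnotify does it. The names forkFinallyCompat and runInThread are hypothetical; the copy is renamed to avoid clashing with the base version in modules that import both.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, mask, try)

-- | Local copy of forkFinally under a hypothetical name. The explicit
-- import lists above never mention base's forkFinally, so no version
-- check (and no CPP) is needed.
forkFinallyCompat :: IO a -> (Either SomeException a -> IO ()) -> IO ThreadId
forkFinallyCompat action andThen =
  mask $ \restore ->
    forkIO $ try (restore action) >>= andThen

-- | Hypothetical helper for illustration: run an action in a new
-- thread and wait for its result or its exception.
runInThread :: IO a -> IO (Either SomeException a)
runInThread action = do
  done <- newEmptyMVar
  _ <- forkFinallyCompat action (putMVar done)
  takeMVar done
```

The price of this variant is a little duplication of code that newer versions of base already provide, which may or may not be acceptable for a given project.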

(I don’t think CPP should be avoided at all costs. I think the amount of CPP used in fsnotify is a good compromise; I just used it as a real world example of how CPP can be avoided.)
