[Haskell-cafe] Combining Regions and Iteratees

Regions is an automatic resource management technique that statically ensures that all allocated resources are freed and a freed resource cannot be used. Regions also promote efficiency by helping to structure the computation so that resources will be freed soon, and en masse. Therefore, regions are particularly suitable for scarce resources such as file handles, database or network connections, etc. A lightweight monadic region library is available on Hackage. Iteratee IO also aims, among other things, to encapsulate resources such as file handles and network connections, ensuring their safe use and prompt disposal. One may wonder how much Iteratees and Regions have in common and if that commonality can be factored out. There seem to be several ways to combine regions and iteratees. This message describes the most straightforward attempt, combining a monadic region library (mostly as it is) with an Iteratee IO library, also mostly as it is. We use monadic regions to manage file handles or file descriptors, ensuring that file handles are always closed even in case of IO and other asynchronous exceptions. An enumerator like enumFile provided similar guarantees for its handles. (Since an enumerator keeps its handles to itself, there is no danger of iteratees' misusing them.) With the monadic region library, the enumerator code becomes simpler: we no longer have to worry about exceptions. The main benefit of monadic region library is to manage files opened by iteratees. The latter being passed around, and so their resources are harder to keep track. We thus demonstrate enumFile and iterFile for incremental file reading and writing, with the same safety guarantees. All opened files are *always* closed, regardless of any (asynchronous) exceptions that may arise during opening, reading, writing or transforming. The code has many examples of throwing errors from various stages of the pipeline and at various times. All files are closed. The commented code is available at http://okmij.org/ftp/Haskell/Iteratee/IterReg.hs which uses the lightweight monadic regions library from http://okmij.org/ftp/Computation/resource-aware-prog/ Since lightweight monadic library needs rank-2 types (now standard), it seemed appropriate to avail ourselves to common GHC extensions. We can clearly see that enumerators and enumeratees unify, both being instances of a general type forall a. Iteratee e mi a -> mo (Iteratee e mi a) which is a Monoid. To compose enumerators or enumeratees, we use the standard mappend. An alias in the code type R e m = Iteratee e m suggests that Iteratees are the view from the right -- the System.IO, getChar-like view. From that point of view, Iteratee IO is hardly different from System.IO (getChar, getLine, peekChar, etc). The dual newtype L e mi mo = L{unL :: forall a. R e mi a -> mo (R e mi a)} is the view from the left. Here are a few examples from the IterReg code. The first simply copies one file to another, block-by-clock. tIF1 = runSIO $ run =<< unL (enumFile "/etc/motd") (iterFile "/tmp/x") According to the trace opened file /etc/motd iterFile: opening /tmp/x Closing {handle: /etc/motd} Closing {handle: /tmp/x} the files are indeed closed, but _not_ in the LIFO order. That is important, so to let an iteratee write data coming from several sources. For example: tIF3 = runSIO $ run =<< unL (enumFile "/etc/motd" `mappend` enumFile "/usr/share/dict/words") (iterFile "/tmp/x") opened file /etc/motd iterFile: opening /tmp/x Closing {handle: /etc/motd} opened file /usr/share/dict/words Closing {handle: /usr/share/dict/words} Closing {handle: /tmp/x} The files will be closed even in case of exceptions: tIF4 = runSIO $ run =<< unL (enumFile "/etc/motd" `mappend` enumFile "/nonexistent") (iterFile "/tmp/x") opened file /etc/motd iterFile: opening /tmp/x Closing {handle: /etc/motd} opened file /nonexistent Closing {handle: /tmp/x} *** Exception: /nonexistent: openFile: does not exist All monadic region monads all support shCatch, so we can write our own exception-handling code. Other examples in IterReg.hs raise errors during data transformation. Monadic regions plus GHC extensions simplify code. For example, here are iterFile and enumFile (the signatures could be omitted; they will be inferred) iterFile :: (SMonad1IO m, m ~ (IORT s' m')) => FilePath -> R ByteString m () iterFile fname = lift (newSHandle fname WriteMode) >>= loop where loop h = getChunk >>= check h check h (Chunk s) = lift (mapM (shPut h) s) >> loop h check h e = return () enumFile :: (SMonadIO m) => FilePath -> L ByteString m m enumFile filepath = L $ \iterv -> do newRgn $ do h <- newSHandle filepath ReadMode unL (enumHandle h) iterv