May 8, 2014

Purely functional, elegant, correct, incremental and composable stream processing that is CPU and memory efficient. This is our (worthy) goal, but where do we start?

This problem space is being extensively explored across a variety of languages and libraries, each with subtly different trade-offs and not-so-subtly different APIs and terminology. However, these libraries share common goals, and most share a common ancestry in Oleg Kiselyov’s original Iteratee work or its free-monad-based derivatives.

This talk aims to build an intuition for stream processing in general by first establishing its core concepts and language, and then grounding them in a careful examination of the trade-offs and internals of several productionised implementations. Of particular interest are the pipes and conduit libraries from the Haskell community, and scalaz-stream from the Scala community.

This talk was presented at YOW! LambdaJam 2014. It was accompanied by a programming workshop that gave participants an opportunity to test-drive the libraries in question.

[ deck ] [ code ]

References:

[1] Repository for workshop code:

[2] Oleg Kiselyov’s collection of Iteratee related resources:

[3] The “pipes” library in Haskell - emphasis on principled abstractions and composition:

[4] The “conduit” library in Haskell - emphasis on speed and resource management:

[5] The “scalaz-stream” library in Scala - emphasis on pure (non-monadic) processors and a clean API, based on machines:

[6] The “machines” library in Haskell:

[7] The “machines” library in Scala:

[8] The “iteratee” library in Haskell - Oleg Kiselyov’s direct implementation:

[9] The “enumerator” library in Haskell - first attempt at improvements and better library support based on Oleg Kiselyov’s original work:

[10] Swierstra, W. “Data types à la carte”: