19th December 2008, 12:11 am

I realized in the shower this morning that there’s a serious flaw in my unamb implementation as described in Functional concurrency with unambiguous choice. Here’s the code for racing two computations:

race :: IO a -> IO a -> IO a a `race` b = do v <- newEmptyMVar ta <- forkPut a v tb <- forkPut b v x <- takeMVar v killThread ta killThread tb return x forkPut :: IO a -> MVar a -> IO ThreadId forkPut act v = forkIO ((act >>= putMVar v) `catch` uhandler `catch` bhandler) where uhandler (ErrorCall "Prelude.undefined") = return () uhandler err = throw err bhandler BlockedOnDeadMVar = return ()

The problem is that each of the threads ta and tb may have spawned other threads, directly or indirectly. When I kill them, they don’t get a chance to kill their sub-threads. If the parent thread does get killed, it will most likely happen during the takeMVar .

My first thought was to use some form of garbage collection of threads, perhaps akin to Henry Baker’s paper The Incremental Garbage Collection of Processes. As with memory GC, dropping one consumer would sometimes result is cascading de-allocations. That cascade is missing from my implementation above.

Or maybe there’s a simple and dependable manual solution, enhancing the method above.

I posted a note asking for ideas, and got the following suggestion from Peter Verswyvelen:

I thought that killing a thread was basically done by throwing a ThreadKilled exception using throwTo. Can’t these exception be caught? In C#/F# I usually use a similar technique: catch the exception that kills the thread, and perform cleanup.

Playing with Peter’s suggestion works out very nicely, as described in this post.

There is a function that takes a clean-up action to be executed even if the main computation is killed:

finally :: IO a -> IO b -> IO a

Using this function, the race definition becomes a little shorter and more descriptive:

a `race` b = do v <- newEmptyMVar ta <- forkPut a v tb <- forkPut b v takeMVar v `finally` (killThread ta >> killThread tb)

This code is vulnerable to being killed after the first forkPut and before the second one, which would then leave the first thread running. The following variation is a bit safer:

a `race` b = do v <- newEmptyMVar ta <- forkPut a v (do tb <- forkPut b v takeMVar v `finally` killThread tb) `finally` killThread ta

Though I guess it’s still possible for the thread to get killed after the first fork and before the next statement begins. Also, this code difficult to write and read. The general pattern here is to fork a thread, do something else, and kill the thread. Give that pattern a name:

forking :: IO () -> IO b -> IO b forking act k = do tid <- forkIO act k `finally` killThread tid

The post-fork action in both cases is to execute another action ( a or b ) and put the result into the mvar v . Removing the forkIO from forkPut , leaves putCatch :

putCatch :: IO a -> MVar a -> IO () putCatch act v = (act >>= putMVar v) `catch` uhandler `catch` bhandler where uhandler (ErrorCall "Prelude.undefined") = return () uhandler err = throw err bhandler BlockedOnDeadMVar = return ()

Combe forking and putCatch for convenience:

forkingPut :: IO a -> MVar a -> IO b -> IO b forkingPut act v k = forking (putCatch act v) k

Or, in the style of Semantic editor combinators,

forkingPut = (result.result) forking putCatch

Now the code is tidy and safe:

a `race` b = do v <- newEmptyMVar forkingPut a v $ forkingPut b v $ takeMVar v

Recall that there’s a very slim chance of the parent thread getting killed after spinning a child and before getting ready to kill the sub-thread (i.e., the finally ). If this case happens, we will not get an incorrect result. Instead, an unnecessary thread will continue to run and write its result into an mvar that no one is reading.

I realized in the shower this morning that there’s a serious flaw in my unamb implementation as described in Functional concurrency with unambiguous choice. Here’s the code for racing two...