I spent some time today trying to talk to a MySQL database server from a piece of middleware I'm writing in Haskell. You might think that talking to a database server would be easy, but it turned out to be quite a bother.

Both of the major MySQL bindings, HDBC-mysql and HDBC-odbc, use the libmysqlclient C library behind the scenes. With GHC's unthreaded runtime, which is still the default, an application using either will work fine. However, my middleware app is highly concurrent and uses software transactional memory (STM) to manage some shared state, and I have to use the threaded runtime. This is where my troubles began.

The symptom I observed was that I couldn't even connect to a database:

SqlError { seState = "", seNativeError = 2003, seErrorMsg = "Can't connect to MySQL server on 'xxxxx' (4)"}

After enough years of dealing with MySQL, you pick up some useful nuggets such as "the number in parentheses at the end of certain kinds of error message is a Unix errno value" (the library doesn't provide any other way to see what errno caused a failure, amusingly enough). The number 4 is EINTR, indicating that a system call was being interrupted.
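
To make that nugget concrete, here is a small C sketch that pulls the trailing number out of a message and asks the C library what it means. The helper names trailing_errno and explain are made up for this illustration; nothing like them exists in the MySQL client library.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the number inside a trailing "(N)" of a
   client error message, or -1 if the message does not end that way. */
int trailing_errno(const char *msg)
{
    size_t len = strlen(msg);
    if (len < 3 || msg[len - 1] != ')')
        return -1;

    const char *open = strrchr(msg, '(');
    if (open == NULL || !isdigit((unsigned char) open[1]))
        return -1;

    int value = 0;
    const char *p = open + 1;
    while (isdigit((unsigned char) *p))
        value = value * 10 + (*p++ - '0');

    /* Only accept "(digits)" at the very end of the message. */
    return (*p == ')' && p[1] == '\0') ? value : -1;
}

/* Print what the embedded errno means, flagging EINTR explicitly. */
void explain(const char *msg)
{
    int err = trailing_errno(msg);
    if (err < 0)
        printf("no errno found\n");
    else
        printf("errno %d: %s%s\n", err, strerror(err),
               err == EINTR ? " (EINTR)" : "");
}
```

Calling explain on the error text above should report errno 4 together with the platform's description of an interrupted system call.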

I split my development time between a Mac and a Linux laptop, and today's hacking was on a Mac, so I fired up dtruss to see what was wrong:

dtruss -b128m myapp

(I'd much rather have been using Linux here. dtruss is vastly inferior to strace, and in fact in its default configuration, it doesn't work at all! That -b128m is necessary to give its kernel component enough of a scratchpad that it won't run out of space while sampling.)

The interrupted system call was connect, and sure enough, reading the library source code, we can see that the problem lies in the my_connect function:

/*
  If they passed us a timeout of zero, we should behave
  exactly like the normal connect() call does.
*/

if (timeout == 0)
  return connect(fd, (struct sockaddr*) name, namelen);

The comment is more or less accurate, but the library should be more careful in its use of the connect function: the caller of my_connect doesn't check for EINTR, and so the connection will fail if the thread receives a signal.

Why is the thread receiving a signal in the first place, though? GHC's threaded RTS sets up either a SIGALRM or SIGVTALRM signal to perform some internal book-keeping at a fairly high frequency, and it's the arrival of this signal that interrupts connect. Failure to check for EINTR and retry is a widespread problem in C code that uses system calls directly.

To work around this, I wrote a simple module that masks the RTS signals that the MySQL client library fails to handle, then performs an action. It ensures that it's running in a bound thread (GHC terminology for a lightweight thread that's tied to a heavyweight system thread) for the duration of the action.