Fast Random Strings Generation in Haskell

Posted in: fp, haskell.

Yesterday morning there was an interesting question on Haskell Cafe:





The first thing I thought of was switching to ByteString: since we don’t need to deal with UTF-8 encoding (the strings to generate are plain ASCII strings in the range a-z), we can avoid paying the overhead Data.Text introduces. After that, the code was slightly faster, but nothing thrilling (about 7 seconds on my machine).

The crucial hint came from Gregory Collins, who suggested that System.Random was very slow. He proposed the mwc-random package as a better alternative, and the fact that Bos is its author was a guarantee of quality. After a bit of struggling I ended up with the following code:
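The listing itself is not reproduced here, so what follows is only a minimal sketch of the approach described, not the original code. It assumes the mwc-random and bytestring packages. The names genStrings and lenRange are hypothetical, and so is the choice to create the generator once and reuse it for every string, which follows the advice in the mwc-random documentation and may differ from the original listing.

```haskell
import Control.Monad (replicateM)
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.Random.MWC (GenIO, asGenIO, uniformR, withSystemRandom)

-- | Convert an ASCII character to its byte value.
c2w8 :: Char -> Word8
c2w8 = fromIntegral . fromEnum

-- | Generate one random string of lowercase ASCII letters.
-- The inclusive length bounds (lenRange) are a hypothetical parameter.
genString :: (Int, Int) -> GenIO -> IO B.ByteString
genString lenRange gen = do
  len   <- uniformR lenRange gen
  bytes <- replicateM len (uniformR (c2w8 'a', c2w8 'z') gen)
  return (B.pack bytes)

-- | Generate n strings. The generator is seeded from the system once
-- (an assumption of this sketch) and then shared by every genString call.
genStrings :: Int -> (Int, Int) -> IO [B.ByteString]
genStrings n lenRange =
  withSystemRandom . asGenIO $ \gen ->
    replicateM n (genString lenRange gen)
```

In GHCi, something like `genStrings 10000 (5, 20) >>= mapM_ B.putStr . take 3` would print the first few generated strings; the length bounds here are arbitrary.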

There are a couple of interesting points:

As stated in the docs, withSystemRandom is a somewhat expensive function, and is intended to be called only occasionally (e.g. once per thread). You should use the Gen it creates to generate many random numbers. What we are doing is confining the creation of the Gen to the “inside loop”. This guarantees that we get a brand new Gen to pass to each invocation of genString.

I’ve borrowed c2w8 from John Goerzen’s MissingH library. We need it because uniformR expects a type which is an instance of Variate, which is defined for Word8 but not for Char.

As noted on Reddit and in the comments, unfoldrN would have been a great alternative to replicateM to build our random string. Unfortunately, the signature of unfoldrN reveals a pure nature: unfoldrN :: Int -> (a -> Maybe (Word8, a)) -> a -> (ByteString, Maybe a), whereas uniformR returns a monadic computation. In my opinion, albeit slower, replicateM allows the creation of a new string without any visual clutter, keeping our code clean.
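To make the mismatch concrete, here is a minimal sketch of unfoldrN driven by a pure step function. The names lcgStep, nextLetter and pureString are hypothetical, and the toy linear congruential generator only stands in for a pure random source: it is not mwc-random and is not suitable for real use. The point is that the step function has to thread its state by hand, which is exactly what uniformR, living in a monad, does not let us do directly.

```haskell
import Data.Bits (shiftR)
import qualified Data.ByteString as B
import Data.Word (Word32, Word8)

-- | One step of a toy linear congruential generator (hypothetical stand-in
-- for a pure random source; arithmetic wraps modulo 2^32).
lcgStep :: Word32 -> Word32
lcgStep s = s * 1664525 + 1013904223

-- | Produce one byte in the range 'a'..'z' (97..122) and the next state.
-- This has the shape unfoldrN expects: a -> Maybe (Word8, a).
nextLetter :: Word32 -> Maybe (Word8, Word32)
nextLetter s = Just (97 + fromIntegral ((s' `shiftR` 16) `mod` 26), s')
  where
    s' = lcgStep s

-- | Build a string of exactly n letters from a seed, with no monad involved.
-- unfoldrN also returns the leftover state, which is discarded here.
pureString :: Int -> Word32 -> B.ByteString
pureString n seed = fst (B.unfoldrN n nextLetter seed)
```

Because everything is pure, the same seed always yields the same string, so a caller would have to carry the leftover state forward to get a different string on the next call.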

Not only is our code cleaner, but it is also a great deal faster! On my machine, generating 10000 words takes more or less half a second!

Well, a lot faster than PHP, as expected.
