Over the weekend I was playing around with a text processing script I wrote in Perl (and then ported to Ruby). This script is CPU bound and runs as a single thread. Interestingly, the Ruby version is quite a bit faster than the Perl version, and I suspect the difference comes down to string handling (utf8 vs. non-utf8) and library dependencies.

So I decided it would be interesting to add thread support to the scripts and compare how the two languages handle threading. I used the native threading support that ships with each language; no external libraries were used.

Along the way I discovered some interesting things about how Ruby and Perl handle threads.

The Perl thread scheduler is pretty basic, and it is easy for a busy thread to crowd out other threads, in some cases effectively stopping them altogether. So it is a good idea to have threads yield to each other (which is expensive) or call sleep() (which turns out to be cheaper). I did not check this on Ruby.
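For anyone who wants to run the same yield-versus-sleep check on Ruby, a minimal sketch might look like the following. It is not taken from the script described in this post: busy_worker and run_workers are hypothetical names, the counting loop is only a stand-in for real work, and Thread.pass is used as Ruby's rough counterpart to yielding in Perl. The timings it prints are not something I measured, so no results are claimed here.

```ruby
require 'benchmark'

# Hypothetical busy worker: counts to `iterations`, giving way to other
# threads every `every` steps using the chosen strategy.
def busy_worker(iterations, every, strategy)
  count = 0
  iterations.times do |i|
    count += 1
    next unless (i % every).zero?

    case strategy
    when :pass  then Thread.pass   # hint to the scheduler to run another thread
    when :sleep then sleep(0.001)  # block briefly so other threads can run
    end                            # any other strategy: never give way
  end
  count
end

# Run `thread_count` workers side by side and return each worker's final count.
def run_workers(thread_count, iterations, every, strategy)
  threads = Array.new(thread_count) do
    Thread.new { busy_worker(iterations, every, strategy) }
  end
  # Thread#value waits for the thread and returns the block's result.
  threads.map(&:value)
end

if __FILE__ == $PROGRAM_NAME
  %i[none pass sleep].each do |strategy|
    elapsed = Benchmark.realtime { run_workers(4, 200_000, 10_000, strategy) }
    puts format('%-5s %.3fs', strategy, elapsed)
  end
end
```

Comparing the three timings gives a rough idea of what each strategy costs; how fairly the threads actually share the CPU would need a separate measurement.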

Most interesting is that Perl is able to take advantage of multiple CPUs/cores when threading, while Ruby limits itself to one CPU/core. The script I am playing around with lends itself nicely to threading, so I got a nice performance improvement with Perl (scaling more or less linearly with the number of threads I ran, allowing for some overhead), while I got no performance improvement with Ruby.
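The Ruby side of this comparison can be sketched roughly as below. This is not the actual script: process_line is a hypothetical stand-in for the CPU-bound text processing, and run_serial and run_threaded are names made up for the illustration. Assuming the standard Ruby interpreter (MRI), only one thread executes Ruby code at a time, so the threaded run would be expected to take about as long as the serial one.

```ruby
require 'benchmark'

# Hypothetical stand-in for the per-line text processing; CPU bound on purpose.
def process_line(line)
  line.each_char.reduce(0) { |sum, ch| (sum * 31 + ch.ord) % 1_000_003 }
end

# Process every line on the calling thread.
def run_serial(lines)
  lines.map { |line| process_line(line) }
end

# Split the lines into roughly equal slices and process each slice in its
# own thread. Results come back in the original line order.
def run_threaded(lines, thread_count)
  return [] if lines.empty?

  slice_size = (lines.size + thread_count - 1) / thread_count
  threads = lines.each_slice(slice_size).map do |slice|
    Thread.new { slice.map { |line| process_line(line) } }
  end
  # Thread#value waits for the thread and returns the block's result.
  threads.flat_map(&:value)
end

if __FILE__ == $PROGRAM_NAME
  lines = Array.new(5_000) { |i| "line #{i} " * 20 }
  serial   = Benchmark.realtime { run_serial(lines) }
  threaded = Benchmark.realtime { run_threaded(lines, 4) }
  puts format('serial: %.2fs, 4 threads: %.2fs', serial, threaded)
end
```

Splitting the input into one slice per thread keeps the threads independent of each other, which is the property that makes a script like this a good candidate for threading in the first place.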

It would be interesting to see how this script fares in Python, next weekend perhaps.