This year I took an information retrieval course. I’m not a fanboy of Haskell or functional programming, and when the time came to write a program on a short deadline I decided I’d do better with Python. I know the language pretty well and had done some text processing in it before.

There were some significant advantages to choosing Python. For one thing, it handles strings pretty well: there are good (read: written in C) procedures that are well designed and suitable for many types of processing. Well… turns out not that many. Pretty quickly I had to resort to regular expressions, which may be fast, but felt like overkill.

In theory I knew all the pieces that would make up my program. Putting them together shouldn’t have been too hard, and it wasn’t. The problem was getting this thing to work on my machine with the intended workload. Enter the garbage collector nightmare. Turns out Python (or at least CPython, I didn’t try the others) has serious problems with it. A program that should do fine in constant memory can, after a while, start swapping like mad and, in practice, never finish the job. I lost countless hours trying to pin down the bugs in my code that led to this. I still wonder where they are, and whether the fault was really mine. I doubt it now. I wrote a lot of code that should have solved the problem many times over, if the problem had been in my code.

I ended up writing all the code in such a way that I could do batch processing. Then I had to write additional code to merge the results. Lots and lots of boilerplate code (with its own share of bugs to squash). I was pretty frustrated and behind schedule. The final product was below my expectations in terms of both functionality and code quality, which surely found its way into my final score for the course.
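To give an idea of what that batch-and-merge boilerplate looks like, here is a minimal sketch. It is not the original course code: the term-counting task, the function names (`batches`, `process_batch`, `run_in_batches`, `merge`) and the `part-NNNN.pkl` file naming are all made up for illustration. The only point is the shape: process a bounded chunk, pickle the partial result to disk, and combine the parts afterwards.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def process_batch(docs):
    """Count term occurrences in one batch (hypothetical stand-in task)."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def run_in_batches(docs, size, workdir):
    """Process each batch on its own and pickle its partial result to disk."""
    paths = []
    for i, batch in enumerate(batches(docs, size)):
        path = Path(workdir) / f"part-{i:04d}.pkl"  # hypothetical naming scheme
        with open(path, "wb") as f:
            pickle.dump(process_batch(batch), f, protocol=pickle.HIGHEST_PROTOCOL)
        paths.append(path)
    return paths


def merge(paths):
    """Load the partial results one at a time and add them up."""
    total = Counter()
    for path in paths:
        with open(path, "rb") as f:
            total.update(pickle.load(f))
    return total


if __name__ == "__main__":
    docs = ["a b", "b c", "c c d"]
    with tempfile.TemporaryDirectory() as workdir:
        parts = run_in_batches(docs, size=2, workdir=workdir)
        print(merge(parts))
```

Only one batch and one partial result need to be in memory at a time, which is the whole reason for going this way. The price is exactly the extra splitting and merging code described above.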

Now you can tell me it’s always the same story: too little time, too much to do, hidden problems I didn’t foresee.

So I’ll tell you about my experience with Haskell.

Some time later I had to extend my previous task. While I had most of the code in place, it was a complete mess. I had designed it to be a single script of 500 lines at most. I ended up with 1200 lines of code.

So I decided to start from scratch in Haskell. It was refreshing: the GC worked well, Binary worked much better than Python’s Pickle (yes, I did use the C version…), and overall I could easily reuse most of the code. All in all, the resulting code worked much better and had fewer bugs. I also spent half the time, and didn’t have any unpleasant discoveries.

Haskell has its own set of pitfalls and shortcomings. But then, every useful programming language has them. The real question is: where are the limits of what can easily be done with it? I haven’t answered this question for Haskell yet. But clearly Haskell lets you do more than Python.


Posted in haskell