

Subject: [ANN] hpricot 0.7

From: _why

Date: Wed, 18 Mar 2009 03:08:39 +0900

Please enjoy a succulent, new Hpricot. A bit faster, some Ruby 1.9 support, and assorted fixes.

  gem install hpricot --source http://code.whytheluckystiff.net

It should show up at Rubyforge in a bit.

I'm sure you're wondering what's the reason for Hpricot updates, in the face of heated competition from the Nokogiri and LibXML libraries. Remember that Hpricot has no dependencies and is smaller than either of those libs. Hpricot uses its own Ragel-based parser, so you have the freedom to hack the parser itself; the code is dwarven by comparison.

Best of all, Hpricot has run on JRuby in the past. And I am in the process of merging some IronRuby code[1] and porting 0.7 to JRuby. This means your code will run on a variety of Ruby platforms without alteration. That alone makes it worthwhile, wouldn't you agree?

Clearly, the benchmarks you see on Ruby Inside are skewed to favor Nokogiri. They parse XML through Hpricot without using Hpricot.XML(), which is not only wrong, but puts XML through needless HTML cleanup operations.

I am sure that Hpricot 0.7 still fares slower on large documents. However, try testing a large number of small documents (a much more common scenario) with this latest version.

You have to question a benchmark that is entirely based on two XML documents. What about HTML fixups? What about various platforms and CPUs? Why not treat Hpricot fairly and use it properly in the benchmarks? It reeks of something.

_why

[1] http://github.com/nrk/ironruby-hpricot/tree/master
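
For anyone who wants to try the comparison suggested above, here is a minimal sketch of such a test. It assumes the hpricot gem is installed and simply skips the run if it is not. The helper name time_parser and the sample documents are made up for illustration; they are not part of Hpricot or of the Ruby Inside benchmark. Only Hpricot() and Hpricot.XML() come from the library itself.

```ruby
require 'benchmark'

# Hypothetical helper (not part of Hpricot): returns the wall-clock seconds
# spent calling `parser` on every document in `docs`, repeated `rounds` times.
def time_parser(parser, docs, rounds = 1)
  Benchmark.realtime do
    rounds.times { docs.each { |doc| parser.call(doc) } }
  end
end

# Made-up sample data: many small XML documents, plus one large document
# built by concatenating them under a single root element.
small_docs = Array.new(1_000) do |i|
  "<item id=\"#{i}\"><name>doc #{i}</name></item>"
end
large_doc = "<items>#{small_docs.join}</items>"

begin
  require 'hpricot'
  parsers = {
    # HTML mode: applies HTML cleanup, which XML input does not need.
    'Hpricot()'     => lambda { |s| Hpricot(s) },
    # XML mode: the entry point the message above recommends for XML.
    'Hpricot.XML()' => lambda { |s| Hpricot.XML(s) }
  }
rescue LoadError
  # Assumption: hpricot may not be installed; in that case nothing is timed.
  parsers = {}
end

parsers.each do |name, parser|
  many = time_parser(parser, small_docs)
  one  = time_parser(parser, [large_doc])
  puts format('%-14s many small: %.4fs   one large: %.4fs', name, many, one)
end
```

Timings will vary by platform and CPU, so treat any single run as a rough indication only. Other parsers can be compared by adding another lambda to the hash.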