As my regular readers might remember, I finished my August assignment on the 28th of September at 3 AM. I sent the email to Neil Bowers, noting it was probably too late to get a proper September assignment. Surprisingly, Neil replied with:

Well, you’re on an unbeaten run so far, so if you want a September one, with 4 days left, I’ll assign you one. Want one? :-)

I imagined 4 days (I could only count 3, but hey) with maybe another two weeks of “sticking” with the assignment, and replied with Yes.

Being Busy

The same day at 3 PM, I got my assignment: HTML::Element::Replacer. It was a public holiday here and we went to our friends’ son’s birthday party, so I didn’t have enough time to even have a look. The next day, there was Prague.pm’s emergency social meeting with people from Brno.pm. I returned home around midnight, but despite having drunk many beers, I didn’t want to go to bed. You probably already know — I started hacking.

Generated Files

The first thing I noticed after cloning my fork of the GitHub repository was the presence of generated files in it. The Makefile was there, as was a file generated by the tests, and even the whole blib/ directory! “Don’t keep generated files in version control,” I said to myself, and created my first commit.

Test Failure

But I felt like I could do better. I checked the testers’ reports and discovered failures on Perl 5.18 and later. The failing test compared two XML files as strings, but the order of a node’s attributes was different:

    # Failed test 'HTML'
    # at t/01-replacer.t line 28.
    # got: '<table>
    # <tr scla="top" /="/"></tr>
    # <tr scla="mid">
    # <td kmap="brand">schlitz</td>
    # <td kmap="age">young</td>
    # </tr>
    # <tr scla="mid">
    # <td kmap="brand">lowenbrau</td>
    # <td kmap="age">24</td>
    # </tr>
    # <tr scla="mid">
    # <td kmap="brand">miller</td>
    # <td kmap="age">17</td>
    # </tr>
    # <tr /="/" scla="bot"></tr>
    # </table>
    # '
    # expected: '<table>
    # <tr scla="top" /="/"></tr>
    # <tr scla="mid">
    # <td kmap="brand">schlitz</td>
    # <td kmap="age">young</td>
    # </tr>
    # <tr scla="mid">
    # <td kmap="brand">lowenbrau</td>
    # <td kmap="age">24</td>
    # </tr>
    # <tr scla="mid">
    # <td kmap="brand">miller</td>
    # <td kmap="age">17</td>
    # </tr>
    # <tr scla="bot" /="/"></tr>
    # </table>
    # '

Do you see it? No? diff can help you:

    < # <tr /="/" scla="bot"></tr>
    ---
    > # <tr scla="bot" /="/"></tr>

Wait, wait… Is a slash a valid attribute name in HTML at all? HTML::TreeBuilder seems to have its own problems here with parsing XML-like self-closing tags. But without that quirk, each element would have had only one attribute, and the ordering bug would never have surfaced.

It was clear to me that the changed ordering of attributes was a consequence of the hash order randomisation introduced in Perl 5.18. HTML::PrettyPrinter has no option to specify how to order attributes, so I decided to drop the dependency and use the HTML::Element::as_HTML method from the HTML::Tree distribution, which was already used for other stuff.
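To illustrate why hash ordering matters here, this is a minimal sketch and not the actual code of HTML::PrettyPrinter or HTML::Element. The start_tag helper is a made-up name for this example. It assumes attributes live in a plain hash, and shows that sorting the keys before serialising gives output that can safely be compared as a string, whatever order the hash happens to return them in.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any CPAN module): serialise a start
# tag with its attributes in sorted key order, so the result does not
# depend on Perl's internal hash ordering.
sub start_tag {
    my ($name, %attr) = @_;
    my $out = "<$name";
    for my $key (sort keys %attr) {
        $out .= qq{ $key="$attr{$key}"};
    }
    return "$out>";
}

# The same two attributes as in the failing test, passed in both orders.
my $first  = start_tag('tr', scla => 'bot', '/'  => '/');
my $second = start_tag('tr', '/'  => '/',   scla => 'bot');

print "$first\n";     # <tr /="/" scla="bot">
print "same output\n" if $first eq $second;
```

Iterating over plain keys %attr instead of sort keys %attr would reintroduce the problem: since 5.18 the order may differ from one run to the next, so a string comparison could pass or fail at random.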

After the second commit, I finally got tired, so I created a pull request from the commits (I should probably have created two separate PRs, but it was too late). Two days after the assignment, I was ready for October!

Afterthought

I usually use XML::XSH2 for XML handling. Here’s how you can mimic 01-replacer.t in it: