Manual syntax conversion Fixing bugs caused by conversion errors Converting code to idiomatic target language

gtkdoc-mkdb

Post scriptum

There are tools that automatically convert from one language to another. They can help, but only in step one. Steps two and three are still there, and could take more work than manually converted code because usually manual conversion produces more human-understandable code. Sadly Turing-completeness says that we can't have nice things.

Whenever a new programming language becomes popular its fanpersons start evangelizing its virtues by going to existing projects and filing bug reports that look like this.When put like that this seems like a no-brainer. Being better atis good so surely everyone should just port their projects toRecently there has been movement to convert tooling used by various software projects in the Gnome stack from a mishmash of shell, Awk and Perl into Python 3. The main reasoning for this is that having only one "scripting" dependency to a modern, well maintained project makes it simple to compile applications using Gnome technologies on platforms such as Windows. Moving between projects also becomes easier.One tool undergoing this transition is GTK-doc . which is a documentation generation tool written mostly in Perl. I have been working together with upstream to convert it to Python 3. This has been an educational experience in many ways. One of the first things learned is that converting between any two languages usually breaks down to three distinct phases.A Perl to Python conversion is relatively straightforward in the case of gtk-doc. It mostly deals with regular expressions, arrays and dictionaries. All three of these behave pretty much the same in both languages so step one is mostly manual work. Step two consists of fixing all the bugs and behavioural changes introduced in phase one (many caused by typos and lapses of concentration during step one). This phase is basically debugging. The third step is then a question of converting regular expressions and global variables into objects and other sane and readable constructs.When doing the conversion I have been mostly focusing on step one, while the gtk-doc maintainer has agreed to finalize the work with steps two and three. While converting the 6000+ lines file, I did some measurements and it turns out that I could do the conversion at a peak rate of 500 lines an hour, meaning roughly 7 seconds per line of code.This was achieved only on code that was easy to convert and was basically an exercise in Emacs fingering. Every now and then the code used "fancy" Perl features. Converting those parts was 10x, 100x, and sometimes up to 1000x slower. If the port had required architectural rework (which might happen when converting a lax language to one that has a lifetime checker in the compiler, for example) it would have slowed things down even more.I don't know how much work steps 2 and 3 entail, but based on comments posted on certain IRC channels, it is probably quite a lot. Let's be generous and say overall these three items come to 250 lines of converted code per hour.Now comes the truly sad part. This speed of conversion is not sustainable. Manually converting code from one format to another is the most boring, draining and soul-crushing work you can imagine. I could only do this for a maximum of a few hours per day and then I had to stop because all I could see was a flurry of dollar signs, semicolons and curly braces. Based on this we can estimate that a sustained rate of conversion one person can maintain is around 100 lines of code per hour (it is unlikely that this speed can be maintained if the project goes on for weeks but since there are no measurements let's ignore it for now).The cURL project consists of roughly 100 thousand lines of C code according to Ohloh. If we assume that converting it to a some other language is just as easy as converting simple Perl to Python (which seems unlikely), the conversion would take 1000 person hours. At 8 hours per day that comes to around 5 months of full time work. Once that is done you get to port all the changes made in trunk since starting the conversion. Halting the entire project while converting it from one language to another is not an option.This gives us a clear answer on why people don't just convert their projects from one language to another: