By fred on Saturday, July 24 2010, 17:15

So I'm finally done with my Master's Project in Quantum Computing at Århus. My final report on The Hidden Subgroup Problem is now available in XHTML and pdf formats. There are also the slides I've made for the oral defense, although as always it is not really useful without the comments. This time, I've used the W3C's slidy tool and so my work is 100% based on Web formats :-)

The Hidden Subgroup Problem still remains difficult with our current knowledge, but I've very appreciated thinking on this challenging topic and I hope my work will help research in this area. As an advocate of the Web as a publishing medium for scientists, I've also found a meta-interest in seeing whether it is possible to use Web formats to produce documents readable both on the screen and on the paper (and in this latter case with the same quality as the more traditional methods). One way to do that is to use TeX-like documents and export them in Web/Printing formats using tools such that Tex4ht or GELLMU. Since my personal preference is to write Web contents by a hybrid method (mixing WYSIWYG and simple-syntax-parsers), I've rather considered how one can print these Web formats. One tool to do that is Prince but I'm not sure it is that good with SVG/MathML (there were some issues last time I tried it). Hence I've preferred to use a free layout engine and thus printed my report with Gecko :-).

One of the difficulty is to handle numbering (pages, sections, formulas, theorems...) and the corresponding references (HTML links or page numbers) either in the table of contents or inside the body of the report itself. Moreover, the report has to be split into several small pieces for otherwise it is too large to be edited easily (in my case, 1.1Mb on a single XHTML page for an amount of 99 pages). Hence at the beginning I had to quickly write once and for all an automated system. I'm not really prude of it - it is truly an usine à gaz involving CSS counters, Flex lexical analyzer and a command line print extension for Firefox - but at least it works almost as I expect. Ideally, it should be possible to print each piece separately before grouping them into one single pdf document. However, because of the choice of CSS counters and the lack of control on page numbering in Mozilla I could only print the whole large XHTML document. This was really not convenient and was one of the main annoying issue. The other one being that there is no access to the page numbers and so for the table of contents, I had to write them manually :-(. It seems that a way to overcome the page number issues would be to implement some CSS rules for printing although I don't know if it helps working on separate pieces.

The other problems are various bugs in Gecko. Thanks to the recent improvements in MathML (related to stretchy and fonts) , the mathematical formulas are now displayed with a good quality, or at least one of which I'm satisfied. One issue is the incorrect computation of dimension in mtable which is slightly visible for some split equations. I've also discovered a wrong thickness of bars which seems to be the only difference between print and screen renderings. However, I could workaround it and Karl gave me a hint that would hopefully allow to fix the bug. There are still some annoying bugs with linebreaking within mathematical expressions and around them. For the former I can avoid the problem by adding <mrow/> 's but the latter makes particularly weird effects. Typically, when some comma or period is placed just after an inline formula, a linebreak may happen and move the punctuation symbol to the beginning of the next line... Regarding quantum circuits and other schemas, the SVG code I use is very elementary. Hence I don't have any complaint to formulate, even if I'm among those who are expecting the possibility to use SVG images in the standard <img/> tag. Finally, I have some mysterious bugs with HTML tables printed on several pages and more issues with CSS page breaking, requiring me to do some small manual tweaking at two or three places.

As a conclusion, using Web formats to print a scientific report is not yet so easy. However, I'm quite satisfied of the result and I expect the issues above to be fixed in the near future...