New online man pages for Debian


The announcement of a modernized version of the online Debian man pages was met with well-deserved acclaim, but also with some concerns about the development tools being used. The man pages themselves are handy and are organized by Debian release; each page provides navigation links to the same page in other releases and in languages other than the one set in the browser. But the use of GitHub for development and bug tracking was questioned by some.

Michael Stapelberg posted the announcement of the updated service in mid-January to the debian-devel mailing list. Instead of a CGI script, the man pages are statically generated "and therefore blazingly fast". Stapelberg implemented the debiman tool that is used to create the static web pages from the man pages in Debian packages. He also worked with the Debian system administrators to deploy the service. As described on the About page, others were instrumental in getting the service up and running as well.

The CGI-based version ran aground in August 2016 due to excessive load that was at least partly caused by traffic from robots and web spiders. Default Apache installs on Debian linked to manpages.debian.org, which apparently exacerbated the problem. That led to the creation of debiman and the new site, which can withstand much more traffic because of its static nature.

The debiman program runs regularly to pick up new packages or changes to the man pages. It takes less than ten minutes for debiman to create the whole set of pages for Debian unstable and less than fifteen seconds to do an incremental update, according to the project page. It tracks multiple Debian repositories; as Stapelberg put it:

Much like the Debian package tracker, manpages.debian.org includes packages from Debian oldstable, oldstable-backports, stable, stable-backports, testing and unstable. New manpages should make their way onto manpages.debian.org within a few hours.

The debiman page notes that the crontab(5) man page is a particularly good test case because it "is present in multiple Debian versions, multiple languages, multiple sections and multiple conflicting packages". The page for the "Jessie" (Debian 8.x) version has links to man pages for three different packages (cron, bcron-run, and systemd-cron), several different versions ("Wheezy", testing, unstable, and experimental), the crontab(1) man page, and the page in five different languages.
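
To make those dimensions concrete, here is a minimal sketch in Go of a key that identifies one rendered page by release, package, name, section, and language. The ManPage type, its field names, and the path layout produced by Path are all assumptions made for illustration; none of them is taken from debiman's source or documentation.

```go
package main

import "fmt"

// ManPage identifies one rendered page. The fields mirror the dimensions
// listed in the article; the type itself is hypothetical.
type ManPage struct {
	Suite   string // e.g. "jessie", "testing", "unstable"
	Package string // e.g. "cron", "bcron-run", "systemd-cron"
	Name    string // e.g. "crontab"
	Section string // e.g. "5" or "1"
	Lang    string // e.g. "en"
}

// Path builds an illustrative path for the page. This layout is an
// assumption made for the example, not a documented debiman format.
func (p ManPage) Path() string {
	return fmt.Sprintf("/%s/%s/%s.%s.%s.html",
		p.Suite, p.Package, p.Name, p.Section, p.Lang)
}

func main() {
	// The three conflicting packages that ship crontab(5) in Jessie.
	for _, pkg := range []string{"cron", "bcron-run", "systemd-cron"} {
		p := ManPage{Suite: "jessie", Package: pkg, Name: "crontab", Section: "5", Lang: "en"}
		fmt.Println(p.Path())
	}
}
```

Because every dimension is part of the key, the three conflicting packages each get a distinct page, and the navigation links described above amount to varying one field while holding the others fixed.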

The response was generally quite positive. Various folks replied to offer congratulations and thanks. There were, of course, some bug reports and feature requests as well. Henrique de Moraes Holschuh asked that man pages from the contrib repository be added, since those are all freely licensed (unlike non-free). Stapelberg added an issue to the GitHub tracker, which was closed as fixed shortly thereafter.

Paul Wise posted a laundry list of suggestions and bugs, most of which were either already addressed or had new GitHub issues opened for them. But Debian has its own bug tracker, of course. Ian Jackson was concerned about the use of GitHub both for hosting the code and for bug tracking. He suggested creating a pseudopackage in the Debian bug tracker so that users can report bugs that way. He also would like to see a way to automatically get the source code from the debiman program:

Also, I think the exact running version of Debian services should be publicly available. And, unless this is made so easy that the service operators don't have to think about it, it will always fall behind. So I think this should be done automatically. Would you accept a patch to make debiman copy its own source code, including git history, to its output ?

Stapelberg pointed out that each of the pages has the Git revision used to generate it in the page footer ("debiman c17f615, see github.com/Debian/debiman"). That should allow anyone to check out the exact code that was run to create the page. There are some other ancillary files that need to be checked into Git, but that was already on his to-do list. He said that while he agreed with Jackson's concerns about using a proprietary service like GitHub, he was being pragmatic about it:
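
As a rough sketch of how the footer can be used, the Go program below extracts the revision and repository from the footer text quoted above and prints the Git commands that would fetch that revision. The parseFooter helper and its regular expression are hypothetical; the footer format is inferred from the single example in this article and may differ on the live site.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// footerRe matches the footer text quoted in the article. The format is
// an assumption based on that one example.
var footerRe = regexp.MustCompile(`debiman ([0-9a-f]{7,40}), see (\S+)`)

// parseFooter is a hypothetical helper that returns the abbreviated Git
// revision and the repository location named in a page footer.
func parseFooter(footer string) (rev, repo string, ok bool) {
	m := footerRe.FindStringSubmatch(footer)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	rev, repo, ok := parseFooter("debiman c17f615, see github.com/Debian/debiman")
	if !ok {
		fmt.Println("no revision found in footer")
		return
	}
	// Print the commands a reader could run to obtain that exact revision.
	fmt.Printf("git clone https://%s\n", repo)
	fmt.Printf("git -C %s checkout %s\n", path.Base(repo), rev)
}
```

The program only prints commands rather than running them, so it has no dependency on network access or on Git being installed.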

I prioritize "free software needs people who work on it", and by using GitHub, contributions are made significantly easier for a large number of people. In my personal experience, I can say that I would not be able to spend _nearly_ as much of my time on FOSS if it weren't for GitHub's convenient web interface.

Even though the commit ID is available on each page, Jackson said, he would "like to be able to check out the running version without interacting with a proprietary online service". To that end, Stapelberg created a mirror of the GitHub repository for debiman and a repository for the Debian-specific pieces on Debian's infrastructure. That way, anyone who wants the source, but doesn't want to deal with GitHub, can get it.

Creating a pseudopackage for the Debian bug tracker would allow users to report bugs in debiman using that system. Stapelberg is not entirely happy with that plan, but is willing to go along:

I personally find the Debian bug system very uncomfortable to use. I will begrudgingly accept reports made via the BTS, as I do for the Debian packages I maintain. I don't want to give up using GitHub's issue tracker, though, for my convenience and the convenience of our users.

The underlying problem is that there is no free alternative to GitHub that is as full-featured. Various posters in the thread noted that problem (including Jackson in his original response). The ease of use of GitHub, along with its widespread adoption, simply makes it far easier for new contributors to get involved. It is not at all surprising that Debian developers, in particular, would be disinclined toward GitHub, but alternatives need to arise. As Alec Leamas put it:

That said, the very idea with debian is about free software; free as in Open Source. And github is certainly not free. So, we should try hard to push for the free alternatives. If we had applied the github thinking "use what works, free or not" we shouldn't be where we are. But, we cannot just say "our tools are as good as github". Because they are not. We need to understand it, and see what can be done. It's an uphill battle, but also uphill battles can be won.

There was a suggestion to look at using a free version of GitLab for hosting Debian projects. The fact that there is a free version may be a sticking point, however. GitLab the company has two separate versions of its code, a community edition and a closed-source enterprise edition, which is the classic "open core" model. Lars Wirzenius was not particularly interested in using GitLab because of that:

I personally don't find "open core" projects to be fully free software, even if they follow current DFSG, OSI, and FSF criteria. I may be in a minority with that view, of course.

So it would seem that things have worked out amicably at this point, though it is unlikely that the GitHub question has gone away. But there are much nicer, faster man pages online and Debian users can get the source and file bugs using Debian infrastructure. The "move" to GitHub is one we are seeing recur with some frequency (e.g. for Python) and will likely see more of in the coming years. A free alternative that eventually attracts a large following is what is needed, but nothing has really reared its head at this point.