How Tridge reverse engineered BitKeeper

Andrew Tridgell delivered the first linux.conf.au keynote on Thursday morning. The bulk of the talk covered software engineering techniques and how the free software community is taking a leading role in adopting those techniques. It was a good talk, and your editor will attempt to write it up later on.

At the end, however, Tridge touched on his role in the separation of the kernel project and BitKeeper. He couldn't talk about much, and he did not announce the release of his BitKeeper client. But he noted that there has been quite a bit of confusion and misinformation regarding what he actually did. It was not, he says, an act of wizardly reverse engineering. Getting a handle on the BitKeeper network protocol turned out to be rather easier than that.

He started by noting that a BitKeeper repository has an identifier like bk://thunk.org:5000/. So, he asked, what happens if you connect to the BitKeeper server port using telnet? A quick demonstration sufficed:

    telnet thunk.org 5000
    Trying 69.25.196.29...
    Connected to thunk.org.
    Escape character is '^]'.

Once connected, why not type a command at it?

    help
    ? - print this help
    abort - abort resolve
    check - check repository
    clone - clone the current repository
    help - print this help
    httpget - http get command
    [...]

Tridge noted that this sort of output made the "reverse engineering" process rather easier. What, he wondered, was the help command there for? Did the BitKeeper client occasionally get confused and have to ask for guidance?

Anyway, given that output, Tridge concluded that perhaps the clone command could be utilized to obtain a clone of a repository. Sure enough, it returned a large volume of output. Even better, that output was a simple series of SCCS files. At that point, the "reverse engineering" task was essentially complete. There was not a whole lot to it.
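
For readers who would rather not use telnet, the same exchange (connect to the port, send a line of text, read the reply) can be sketched in a few lines of Python. This is only an illustration, not Tridge's tool: the function name send_line is made up for this example, and the assumptions that commands end with a newline and that the reply is plain ASCII come from the telnet session shown above, not from any protocol documentation.

```python
import socket


def send_line(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send one text command to a TCP service and return whatever it replies.

    Hypothetical helper: it mimics typing a command into a telnet session.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Assumption: the server expects a newline-terminated command.
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; return what arrived
    return b"".join(chunks).decode("ascii", errors="replace")
```

Calling send_line("thunk.org", 5000, "help") would correspond to the demonstration above, assuming a server is still listening at that address.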

Now we know about the work which brought about an end to the BitKeeper era.

