Date Sat, 8 Mar 2003 16:05:14 -0800 From Larry McVoy <> Subject Re: BitBucket: GPL-ed KitBeeper clone > > Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that

> > way indefinitely since most people don't seem to even understand _why_

> > it is superior.

>

> You make it sound like no one is even interested ;-). But it's not true! A

> lot of people currently working on alternative version control systems would

> like very much to know what it would take to satisfy the needs of kernel

> development. Maybe, being on the inside of the process and well aware of

> your own needs, you don't realize how difficult it is to figure these things

> out from the outside. I think only very few people (perhaps only one) really

> understand this issue, and they aren't communicating with the horde of people

> who really want to help, if only they knew how.



[Long rant, summary: it's harder than you think, read on for the details]



There are parts of BitKeeper which required multiple years of thought by

people a lot smarter than me. You guys are under the mistaken impression

that BitKeeper is my doing; it's not. There are a lot of people who

work here and they have some amazing brains. To create something like

BK is actually more difficult than creating a kernel.



To understand why, think of BK as a distributed, replicated, version

controlled user level file system with no limits on any of the file system

events which may happened in parallel. Now put the changes back together,

correctly, no matter how much parallelism there has been. Pavel hasn't

understood anything but a tiny fraction of the problem space yet, he

just doesn't realize it. Even Linus doesn't know how BitKeeper works,

we haven't told him and I can tell from his explanations that he gets

part of it but not most of it. That's not a slam on Linus or Pavel or

anyone else. I'm just trying to tell you guys that this stuff is a lot

harder than you think. I've told people that before, like the SVN and

OpenCM guys, and the leaders of both those efforts showed up later and

said "yup, you're right, it is a hell of a lot harder than it looks".

And they are nowhere near being able to do what BK does. Ask them if

you have doubts about what I am saying.



Merging is just one of the complex areas. It gets all the attention

because it is hard enough but easy enough that people like to work on it.

It's actually fun to work on merging. Ditto for the graph structure,

that's trivial. The other parts aren't fun and they are more difficult

so they don't get talked about. But they are more important because

the user has no idea how to deal with them and users do know how to deal

with merge problems, lots of you understand patch rejects.



Rename handling in a distributed system is actually much harder than

getting the merging done. It doesn't seem like it is, but we've rewritten

how we do it 3 times and are working on a 4th all because we've been

forced to learn about all the different ways that people move things

around. CVS doesn't have any of the rename problems because it doesn't

do them, and SVN doesn't have 1/1000th of the problems we do because it

is centralized. Centralized means that there is never any confusion

about where something should go, you can only create one file in one

directory entry because there is only one directory entry available.

In BK's case, there can be an infinite number of different files which

all want to be src/foo.c.



Symbolic tags are really hard. What?!? What could be easier than adding

a symbolic label on a revision? Well, in a centralized system it is

trivial but in a distributed system you have to handle the fact that

the same symbol can be put on multiple revs. It's the same problem as

the file names, just a variation. Add to that the fact that time can

march forward or backwards in a distributed system, even if all the

events were marching forward, and the fun really starts. I personally

have redone the tags support about 6 times and it still isn't right.



Security semantics are hard in a distributed system. Where do you

put them, how do you integrate them into the system, what happens when

people try and work around them? In CVS or SVN you can simply lock down

the server and not worry about it, but in BK, the user has the revision

history and they are root, they can do whatever they want.



Time semantics are the hardest of all. You simply can't depend on time

being correct. It goes forwards, backwards, and sideways on you and

if you think you can use time you don't have the slightest idea of the

scope of the problem. Again, not a problem for CVS/SVN/whatever, all the

deltas are made against the same clock. Not true in a distributed system.



That's a taste of what it is like. You have to get all of those right

and the many other ones that I didn't tell you about or you might as

well not bother. Why? Because the problems are very subtle and there

isn't any hope of getting an end user to figure out a subtle problem,

they don't have the time or the inclination. We've seen users throw away

weeks of work just because they didn't understand the merge conflict so

they start over on an updated tree. And those people will understand

the rename corner cases? Not a chance.



The main point here is that if you think that BK happened quickly,

by one guy, you are nuts. It started in May of 1997, that's almost 6

years ago, not the 2 years that Pavel thinks, and I had already written

a complete version control system prior to that, so this was round two.

Even with that knowledge, I wasn't near enough to get BK to where it is

today, there is more than 40 man years of effort in BK so far. A bunch

of people, working 60-90 hour weeks, for almost 6 years. Not average

people, either, any one of these people would be a staff engineer or

better at Sun (salaries for those people are in the $115K - $140K range).



The disbelievers think that I'm out here waving the "it's too hard"

flag so you'll go away. And the arrogant people think that they are

smarter than us and can do it quicker. I doubt it but by all means go

for it and see what you can do. Just file away a copy of this and let

me know what you think three or four years from now.



Oh, by the way, you'll need a business model, I found that out 2 or 3

years into it when my savings ran out. Oh, my, you might not be able

to GPL it! Why it might even end up being just like BitKeeper with

an evil corporate dude named Pavel running the show. Believe me, if

that happens, I'll be here to rake him over the coals on a daily basis

for being such an evil person who doesn't understand the point of free

software. I can't wait.

--

---

Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

-

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

the body of a message to majordomo@vger.kernel.org

More majordomo info at http://vger.kernel.org/majordomo-info.html

Please read the FAQ at http://www.tux.org/lkml/



