I was honored to give the opening keynote of the first Linux Plumbers Conference this year in Portland, Oregon. Here are the slides and text of my talk (well, the text is what I intended to say; the actual words that came out probably sounded a bit different).

I'll comment later on a few things that I've noticed people bringing up, but I figured it would be good to get the text and slides out for everyone to be able to see first.

The talk was recorded, and I'll provide a link to it when it is available so you can compare it to what I have below.

If you want to link directly to this talk, please use this link.

Update: I've responded to some of the reactions to this talk here.

Update 2: The video has now been published on Google Video.

The Linux Ecosystem, what it is and where do you fit in it?

A few months ago I gave a talk at Google about the Linux kernel development process. During that talk, someone asked me about Canonical's kernel contributions as they did not show up on the list that I was showing.

I offhandedly remarked that they did not show up as they had only contributed 5-6 patches in the past few years. Now this comment didn't go over very well with the Ubuntu developers, and they called me out on it as they felt it was wrong.

They were right, I was wrong, so here is my public apology.

In the past 3 years, from the 2.6.15 kernel to 2.6.27-rc6, Canonical has had 100 patches in the Linux kernel.

I apologize for my previous statement and would like the world to know the correct number here.

But as the Canonical employees seemed so eager for me to get the number correct, let's look a bit closer at it. What does 100 patches really mean?

From the 2.6.15 kernel release to the present, there have been 99324 patches made to the Linux kernel.

So, to place Canonical's contribution into perspective, that means they did 0.10068% of all of the kernel development for the past 3 years.
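For anyone who wants to check that figure, here is a minimal sketch of the arithmetic in Python. The two numbers are the ones quoted above; the function name `share_percent` is made up for this illustration.

```python
def share_percent(part, total):
    """Return part as a percentage of total."""
    return part / total * 100


# Figures quoted in the talk (2.6.15 through 2.6.27-rc6).
canonical_patches = 100
total_patches = 99324

print(f"{share_percent(canonical_patches, total_patches):.5f}%")  # prints 0.10068%
```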

They are ranked 79th of all companies doing kernel development, with such prominent, notable Linux supporters as nVidia just barely beating them out.

If Canonical were an individual contributor to the kernel, it would be in 195th place.

Their individual contributors end up placing in the following locations, based on their number of contributions: 251, 714, 1103, 1327, 1691, 1691, 2171, 2171, 2171, 2171.

Now to be fair, this is only basing things on quantity, not quality, so those 100 patches might be major contributions to the kernel, advancing the state of the art and fixing major bugs that affect thousands of people. I'll let all of you make that call.

And finally, lest anyone think I'm picking on Canonical for some reason, here's how they rank within all of the different Linux distros.

Hm, wait, I forgot one non-profit distro that I like a lot, Gentoo, let's add them into the list:

I tried tracking Debian kernel contributions, but it is hard, as most of the Debian developers don't use a debian.org email address. So I guessed: I looked at who they list as their kernel team, combined that with a few email addresses that do use a debian.org address, and came up with the following best guess, which is probably still not properly counting everything the Debian developers do:
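To show the kind of counting involved, here is a rough sketch in Python that tallies patches by the domain of the author's email address. It is not the script used for the numbers in this talk. The function name `count_by_domain` and the sample addresses are hypothetical; real input would be one author email per patch, for example from something like `git log --pretty=format:%ae v2.6.15..v2.6.27-rc6` (also an assumption about how you would gather them).

```python
from collections import Counter


def count_by_domain(author_emails):
    """Count patches per email domain, given one author email per patch."""
    counts = Counter()
    for email in author_emails:
        # Split on the last "@"; sep is empty when there is no "@" at all.
        _, sep, domain = email.strip().rpartition("@")
        if sep:
            counts[domain.lower()] += 1
    return counts


# Hypothetical sample input, not real kernel data.
sample = ["alice@debian.org", "bob@example.com", "carol@debian.org"]
for domain, patches in count_by_domain(sample).most_common():
    print(domain, patches)
```

A count like this misses every developer who sends patches from a personal or employer address, which is exactly why the Debian number is only a guess.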

Ok, that should set the record straight for how many patches Canonical has allowed their engineers to contribute back to the kernel community.

So, back to the Linux Ecosystem.

Wait, what do we mean here by "Linux"?

When we first discussed having a Linux conference composed of kernel developers and the developers of the surrounding "base system" of programs, we had to come up with a name for all of that. Someone proposed the term "Plumbing", so we named the conference, the "Linux Plumbers Conference".

Here's a diagram of what I consider to mean the basic description of a "Linux" system:

I've left some things out here, such as the scripting languages that we all use to bootstrap some of these programs when building them, or to run startup scripts, but these programs make up the core of what a "Linux" system is. The majority of them only run on Linux systems, while a few (gcc, binutils, make) run on all operating systems, and also make up the base of the BSDs and even OpenSolaris.

Let's look at the size of these different programs, based on lines of code as measured by SLOCCount from David Wheeler:

The largest is the kernel, making up 40% overall. That's followed by gcc, and then X11. Then binutils, glibc, ALSA, and then man-pages (we can't forget documentation!).

So who is sponsoring this work?

Note, I now switched to drawing on a whiteboard, so there are no "slides" for this section, sorry, I'll try to get a copy of the end result of the drawing in here soon.

So back to the ecosystem, what does it look like...

Let's start with the developer.

They create code in patch form and contribute it to the project they are working on.

Every so often the project makes a release, and there are a few users that run that individual release and help test it out. Not that many, not as many as the developers would probably like, but there are some.

Now there are lots of these individual projects as we've just seen, and every so often a distro comes along and bundles all of these individual projects up into one unified piece, and makes a release.

There are two types of releases these days, "enterprise" and "normal".

The "enterprise" release lasts a long time. The packages that were taken are a snapshot in time, one that is slowly maintained and updated with bug and security fixes by the developers working for the distros. Those changes and fixes flow back into the original projects for inclusion in their main releases when possible, but for the most part, the majority of changes that go into these releases come from newer releases from the projects, so changes flow predominantly one way, into the distro.

The developers working on these distros usually don't have time for anything else. They are tasked with maintaining these distros and are buried under packaging and support tasks.

Then there are the community-based distros: Gentoo, Debian, Fedora, Mandriva, Slackware, and openSUSE. These move much quicker, pulling in the packages on a 6-8 month cycle and integrating things together. Usually fixes flow much quicker back into the upstream packages, as the developers doing this kind of work know that they need to push things upstream, otherwise they will just be seeing the same problems and issues in another 6 months.

Then there are the distros that base themselves off of other distros, like Ubuntu and CentOS. These distros have yet another layer between them and the original developers. Patches rarely, if ever, flow backwards into an upstream distro, and the developers are very unlikely to push their changes into the upstream packages, as they don't feel the need, or don't realize the issues involved because they rely on the upstream distro so tightly.

So what does this mean for you?

If you are working for a company that allows you to work directly upstream on the projects that you want to, great, you're happy, the companies are happy you are doing great work, and there is no problem.

What happens if you are out here, working for a company that doesn't allow you to contribute upstream, whether due to internal company issues or just a lack of time because of other work? That can be a problem for a developer who wants to contribute upstream, but can't.

The solution: quit and go work for one of the companies that allow you to do this!

Seriously, I mean it. Right now there is a huge scarcity of developers who know how to do this kind of work. Lots and lots of companies are trying madly to train their internal engineers on how to work with the community, in this very different development model. They are looking for and need people who already know how to do this kind of thing. So don't sell yourself short at all. Your skills are wanted and needed, so either tell your current employer that you need to have the time to do this kind of upstream work, or change to a company that knows this already.

In conclusion:

posted Wed, 17 Sep 2008 in [/linux]