David Wheeler is a long-time leader in advising and working with the U.S. government on issues related to open source software. His personal webpage is a frequently cited source on open standards, open source software, and computer security. David is leading a new project, the CII Best Practices Badging project, which is part of the Linux Foundation's Core Infrastructure Initiative (CII) for strengthening the security of open source software. In this interview he talks about what it means for both government and other users.

Let's start with some basics. What is the badging project and how did it come about?

In 2014, the Heartbleed vulnerability was found in the OpenSSL cryptographic library. In response, the Linux Foundation created the Core Infrastructure Initiative (CII) to fund and support open source software (OSS) that are critical elements of the global information infrastructure. The CII has identified and funded specific important projects, but it cannot fund all OSS projects. So the CII is also funding some approaches to generally improve the security of OSS.

The latest CII project, which focuses on improving security in general, is the best practices badge project. I'm the technical lead of this project. We think that OSS projects that follow best practices are more likely to be healthy and to produce better software, including having better security. But that requires that people know what those best practices are, and whether or not a given project is following them.

To solve this, we first examined a lot of literature and OSS projects to identify a set of widely accepted OSS best practices. We then developed a website where OSS projects can report whether or not they apply those best practices. In some cases we can examine the website of a project and automatically fill in some information. We use that information (automatic and not) to determine whether the project is adequately following best practices. If the project is adequately following best practices, the project gets a "passing" badge.

I should note here that the badging project is itself an OSS project. You can see its project site. We'd love to have even more participation, so please join us! To answer the obvious question, yes, it does earn its own badge. Other OSS projects that have earned a badge include the Linux kernel, curl, Node.js, GitLab, OpenBlox, OpenSSL, and Zephyr. In contrast, OpenSSL before Heartbleed fails many criteria, which suggests that the criteria are capturing important factors about projects.

If you participate in an OSS project, I encourage you to get a badge. If you're thinking about using an OSS project, I encourage you to see if that project has a badge. You can see more information at CII Best Practices Badge Program.

Can you tell me more about the badge criteria?

Sure. Of course, no set of practices can guarantee that software will never have defects or vulnerabilities. But some practices encourage good development, which in turn increases the likelihood of good quality, and that's what we focused on.

To create the criteria we reviewed many documents about what FLOSS projects should do; the single most influential source, was probably Karl Fogel's book Producing Open Source Software. We also looked at a number of existing successful projects. We also got very helpful critiques of the draft criteria from a large number of people including Karl Fogel, Greg Kroah-Hartman (the Linux kernel), Rich Salz (OpenSSL), and Daniel Stenberg (curl).

Once the initial criteria were identified, they were grouped into the following categories: basics, change control, reporting, quality, security, and analysis. We ended up with 66 criteria. Each criterion has an identifier and text. For example, criterion floss_license requires that, "the software MUST be released as FLOSS." Criterion know_common_errors requires that, "At least one of the primary developers MUST know of common kinds of errors that lead to vulnerabilities in this kind of software, as well as at least one method to counter or mitigate each of them." The criteria are available online, and an LWN.net article discusses the criteria in more detail. It only takes about an hour for a typical OSS project to fill in the form, in part because we automatically fill in some information. More help in automating this would be appreciated.

The criteria will change slowly, probably annually, as the badging project gets more feedback and the set of best practices in use changes. We also intend to add higher badge levels beyond the current "passing" level, tentatively named the "gold" and "platinum" levels. However, the project team decided to create the criteria in stages; we're more likely to create good criteria once we have experience with the current set. A list of some proposed higher-level criteria is already posted; if you have thoughts, please post an issue or join the badging project mailing list.

In digging into this area, what surprised you the most in your findings?

Perhaps most surprising is that many OSS projects do not describe how to report security vulnerabilities. Many do, of course. Many projects, like Mozilla Firefox, describe how to report vulnerabilities to a dedicated email address and provide a PGP key for sending encrypted email. Some projects, like Cygwin, have a clear policy that all defect reports must be publicly reported (in their case via a mailing list), and they expressly forbid reporting via private email. I'm not a fan of "everything public" policies (because they can give attackers a head start), but when everyone knows the rules it's easier to handle vulnerabilities. However, many projects don't tell people how to report vulnerabilities at all. That leads to a lot of questions. Should a security researcher report a vulnerability using a public issue tracker or mailing list, or do something else? If the project wants to receive private reports using encryption, how should the information be encrypted? What keys should be used? Vulnerability reporting and handling can be slowed down if a project hasn't thought about it ahead of time.

There are several reasons to be surprised about this. First, it should be surprising that developers don't anticipate that there are vulnerabilities in their code and that they should be prepared to address them. The news is filled with vulnerability reports! Second, it's surprisingly easy to solve this problem ahead of time; just explain on a project website how to report vulnerabilities. It only takes one to three sentences! Of course, writing that down requires project members to think about how they will handle vulnerabilities, and that's valuable. It's much better to decide how to handle vulnerabilities before they are reported.

What didn't surprise you, but might surprise others?

Some things that might surprise you are:

1. There are still projects (mostly older ones) that don't support a public version-controlled repository (e.g., using Git), making it difficult for others to track changes or collaborate.

2. There are still projects (mostly older ones) that don't have a (useful) automated build or automated test suite. This makes it much harder to make improvements, and much easier for defects to slip through undetected. It also makes it hard to upgrade dependencies when a vulnerability is found in them, because there's no easy to way to verify that the change is almost certainly harmless.

3. There are many projects (mostly newer ones) that many people think are OSS, but they aren't legal to use because they have no license at all. Software without a license or other marking isn't normally legal to use, distribute, or modify in most countries. This is especially a problem on GitHub. Ben Balter's 2015 presentation Open source licensing by the numbers found that on GitHub, 23% of the projects with 1,000 or more stars had no license at all. It's true that before 1976, works in the U.S. without affixed copyright markings were not restricted by copyright, but that was a long time ago. Today, law in almost all countries requires that developers license software before it can be legally used by others, and that is unlikely to change. Even if you think the law doesn't apply to you, omitting a FLOSS license tends to inhibit project use, co-development, and review (including security reviews).

4. Most developers still don't know the basic principles of developing secure software, or what the common mistakes that lead to vulnerabilities are.

We're hoping that the badging program will help counter problems like these.

For governments struggling with the use of open source software, are there lessons from this initiative?

There are two kinds of large organizations: those who know they're using OSS, and those who don't know they're using OSS. There is no third kind. If you pay for proprietary software licenses, if you look inside, you'll find a lot of OSS in almost all proprietary software. Custom software is usually built on top of, and by, OSS.

Many managers struggle because they don't understand the change that has already happened in the computer industry, namely, that OSS is a major force in computing and here to stay. Instead of embracing change, there are still managers who want to pretend that the change hasn't happened, or are trying to return to some mythical "good old days" that really weren't that good. For those trying to prevent all use of OSS, I recommend a simple tonic: Stop trying to turn back the clock, and instead, embrace the change that's already happened.

A different struggle is people who partly accept this reality, but want to manage risk and don't know how. The goal is right, but their approaches are often misguided. I repeatedly hear "but you don't know who developed the OSS," for example. In most cases this is false, since who wrote each line is often a matter of public record (it's included in the version control system data). Absurdly, these same people are happy to take proprietary software even though they do not have any idea who wrote the code. A related problem is that many people want to use a company's location registration as the sole risk measure for malicious code, and that's a mistake. For example, a company may be registered in the U.S., but that doesn't mean that its developers and testers are in the U.S. or are U.S. citizens. Company registration is like ship registration; it's a legal fiction unrelated to reality. Besides, you can be a citizen of a country and still create malicious code to attack that country's government. Simple tests like "is this from a company in my country" aren't very helpful for managing risk; we need better ways.

For people who are looking at bringing in OSS, but also want to manage their risk, this initiative should definitely provide some help. We instead focus on various best practices that are more likely to produce good software, instead of focusing on legal fictions like country registration. Potential users can see, through the badging project, the best practices and how well a particular project is doing.

Of course, the best practices badging project can't solve all problems. We can't tell you if some given OSS will solve the problem you have; only you can understand your needs. There are many different OSS support models; in some cases a simple download and depending on community is a good approach, while in others, you really should be using a company that provides professional support. We're working to provide some of the information people need to make better decisions.

On a different note, when we last spoke, you indicated that the risk of the U.S. government forking projects was still a big problem. What's your take on the state of this today?

First, a quick clarification. I use the term "fork" to mean any copying of a project with the intent to modify it. Forking is great if you plan to submit the modification back to the main project. The problem is a "project fork," which is a fork with the intent to create a new independent project. Project forks inhibit collaboration, instead of encouraging it, and that's why project forks are a problem.

The U.S. government doesn't track project forks it funds, so I can't answer that question with confidence. Anecdotally, I think the problem has slightly reduced, but it's still a big problem. One thing that's helped reduce project forking in government-funded projects is the increasing use of public repositories, particularly on GitHub. Public repositories, and the expectation that one will be used, greatly increase transparency. If a company knows that it will be compelled to make its project fork publicly visible, it will correctly worry that it might be asked to defend the indefensible. To paraphrase Justice Louis Dembitz Brandeis, transparency is often the best disinfectant. In addition, if the government forces a company to make their project fork available as OSS, some of the economic incentives for project forking disappear, and its improvements might be merged back into the main project (a process called "healing" a fork).

That said, there are still too many perverse incentives in the government. An obvious one example is that the contractors get significantly higher profits if they develop software from scratch instead of reusing existing software. Many projects reinvent the wheel, because they're paid to do so, and taxpayers pick up the bill.

Even more seriously, suppliers know that they can extort orders of magnitude more money if they can lock their customers into their products. If there's no way to escape the product or service, the supplier can just keep raising prices without restraint, and the supplier has no incentive to provide new technology or good support. The government's entire acquisition system presumes that there's open competition for work and future work, and in theory that provides a countervailing force. In practice, it often doesn't.

In computing, data rights (aka intellectual rights or technical data rights) are a synonym for power. You can't change software, or have a competitive bid for an alternative future maintainer, unless you have the data rights , even if you originally paid for the development of that software. Yet many government employees don't understand that. Contractors do understand that, and are incentivized to exploit the naïve government employees who don't. Data rights are arguably much more important than requirements, because requirements constantly change, while data rights determine who can modify the software to respond to the new requirements. Contractors will ask for exclusive rights to software even though they didn't pay for most of its development, or use tricks like creating proprietary "communication buses" whose primary purpose is to lock the government into that contractor. If a contractor gets exclusive rights, then the government loses forever its ability to have effective competition for its maintenance. Similarly, third-party software suppliers often have proprietary interfaces, or create proprietary extensions to standard interfaces, to inhibit users from changing to a competing supplier. That's not to say that all contractors or third-party suppliers are evil, or that all government employees are clueless; far from it. The government needs these people! But perverse incentives, combined with naiveté on the government side, often leads to situations in which the government is locked into a contractor or third-party supplier, who can then charge arbitrary prices for poor results. My guess is that if the government could actually have full and open competition on software, it would save at least tens of billions of dollars a year. The exact number matters less than the problem; we clearly have a problem that needs fixing.

I think a big help to this would be to require software developed using government funding to be released to the public as open source software, unless there's a compelling need to do otherwise. That would greatly reduce the incentive to recreate everything from scratch, or trying to lock the government into a single contractor. We might also want to impose financial penalties on those who fail to do market research to find alternatives; this is required by law and contract, but often not done. I'm sure there are other things that should be done; the key is that the current system has many problems.

What are other things that people involved with the U.S. government should know?

As I mentioned last time, one of the biggest problems is that most acquisition personnel don't understand that practically all OSS is commercial software. Under U.S. law, the Federal Acquisition Regulation (FAR), and the DoD FAR Supplement (DFARS), software is commercial if it has a non-government use and is licensed to the public. In particular, see U.S. Code Title 41 Section 103 which defines the term "commercial item."

Because nearly all OSS meets the definition for commercial software, nearly all OSS is commercial software. Once you understand this, you understand that to use OSS you just follow the rules for commercial software, and that the U.S. law's preference for commercial items (10 USC 2377) includes OSS. This understanding makes a lot of decisions very straightforward, but it's still not widely understood.

David A. Wheeler works for the Institute for Defense Analyses (IDA), but in this interview he is not speaking for IDA, the U.S. government, or the Linux Foundation.