There are dozens, or even hundreds, of open source licenses available. Broadly speaking, these licenses fall into two groups: permissive licenses and restrictive licenses. In the permissive group, the MIT license is perhaps one of the most common. In the restrictive group, GPL is one of the essential licenses.

Which one is the most commonly used in open source projects?

Although the question is easy to pose, it is not necessarily easy to answer, mainly because of the several pitfalls hidden in any analysis of license usage in open source projects.

First, no single code hosting website hosts all open source projects. Although GitHub is one of the most common alternatives, many open source projects are hosted in a community's own git repository, or in other repositories such as the Debian package archive and the GNU Project FTP archive.

Second, to understand open source license usage at scale, we have to rely on tools that infer the license used. This process is not always precise because developers state open source licenses in different ways. For example, while some developers declare their license as a comment in the header of every source code file, others cite the license for the whole project in the README file. Moreover, some developers copy and paste the full text of the license, whereas others only mention its name.
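To make this imprecision concrete, here is a minimal sketch of how such a tool might guess a license from a piece of text, whether that text is a file header or a README. The function name `guess_license` and the small pattern table are hypothetical and only for illustration; real detectors know far more licenses and usually compare full license texts. The only real-world convention assumed here is the `SPDX-License-Identifier:` tag that some projects place in file headers.

```python
import re

# Hypothetical, deliberately tiny pattern table. More specific names come
# first so that "Lesser" or "Affero" variants are not reported as plain GPL.
LICENSE_PATTERNS = [
    ("AGPL-3.0", re.compile(r"GNU Affero General Public License", re.I)),
    ("LGPL-3.0", re.compile(r"GNU Lesser General Public License", re.I)),
    ("GPL", re.compile(r"GNU General Public License|\bGPL\b", re.I)),
    ("Apache-2.0", re.compile(r"Apache License,? (Version )?2\.0", re.I)),
    ("MIT", re.compile(
        r"\bMIT License\b|Permission is hereby granted, free of charge", re.I)),
]

# Machine-readable tag that some developers put in source file headers.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def guess_license(text):
    """Return a best-guess license name for the text, or None if unknown."""
    tag = SPDX_TAG.search(text)
    if tag:
        # An explicit tag is the most reliable signal, so trust it first.
        return tag.group(1)
    for name, pattern in LICENSE_PATTERNS:
        if pattern.search(text):
            return name
    return None
```

Even this toy version shows the problem: a README that says "MIT-style" or a header that paraphrases the license matches none of the patterns, so the project would be counted as having no known license.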

Given these limitations, we can still try to infer open source license usage in practice by, for instance, using the GitHub data stored on Libraries.io (a service that curates metadata from open source projects hosted in several package managers). Libraries.io makes its data available on Zenodo. Using this data, we inferred the license usage shown next.

Open source license usage using Libraries.io data

As one can see, permissive licenses are by far the most used. Indeed, MIT is the most used license, appearing in more than 812k open source projects. Apache 2.0 comes second, appearing in 465k projects. BSD-3, also a permissive license, comes third, licensing 71k projects. Together, these three licenses (MIT, Apache 2.0, and BSD-3) account for about 70% of overall open source license usage.

At the other end of the spectrum, one can see that the GPL family of licenses (GPL-2, GPL-3, AGPL-3, LGPL-3) accounts for only around 5% of overall license usage.
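Counts and shares like the ones above come down to a tally of the license each project declares. The sketch below shows one way to compute such a tally from a CSV export. It is not the exact analysis behind the figure: the column name `Licenses` and the comma-separated format for projects with several licenses are assumptions, and both function names are hypothetical.

```python
import csv
from collections import Counter


def count_licenses(csv_file, column="Licenses"):
    """Count how many projects (rows) declare each license."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        declared = (row.get(column) or "").strip()
        if not declared:
            counts["(none declared)"] += 1
            continue
        # Assumption: multi-licensed projects list names separated by commas.
        for name in declared.split(","):
            counts[name.strip()] += 1
    return counts


def usage_shares(counts):
    """Turn raw counts into fractions of all license declarations."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.most_common()}
```

Called with an open file, `count_licenses` returns a `Counter`, and `usage_shares` converts it into fractions that can be read as percentages. Note that a project with two licenses contributes two declarations, which is one more reason why figures of this kind should be read as estimates.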

Why is this happening?

According to a 2016 survey with about 3,400 participants, 67% of the surveyed companies actively encourage developers to engage in and contribute to open source software. This shows a clear commercial interest in open source projects. However, some researchers also believe that GPL is not the most appropriate choice for a business that relies on open source.

Even though software licensed under GPL can be used (and also modified) in corporate environments, software companies should be aware of the characteristics of GPL. In particular, the key feature of GPL is that it restricts the terms under which derived works can be distributed. If a software company incorporates GPL-licensed source code into a product that it distributes, the company must license that product under GPL as well.

GPL, created by the Free Software Foundation (FSF), is the principal copyleft license. Its ultimate goal is to make software 100% free for everyone. Going 100% free is a real challenge for businesses that are not always willing to make 100% of their code public (for instance, have you ever seen the actual Google implementation of its search engine?). As a consequence, some businesses might not be comfortable using a license that is so closely aligned with such goals.

But what could happen if a company misuses an open source license? Many things can happen, and I cover many examples in the book I’m writing. This issue is so serious that some companies have decided to remove all GPL-licensed code from their codebases.

For now, consider the case of Acme Inc, a startup that was urgently looking to be acquired. Acme received an acquisition offer from Shockwave, a bigger company in the same segment. During the inspection by the acquirer, Shockwave noticed that the Acme development team was misusing open source licenses. “Shockwave ultimately backed out of the deal and the Acme technology was put on the shelf without a financial return to their employees or investors.”