This week The Linux Foundation announced the new Community Data License Agreement licenses. I have not yet read the licenses themselves, as I’m still digging out from attending yet another amazing All Things Open, but even without knowing the contents of the licenses I have some questions and concerns.

Jim Zemlin, Executive Director of The Linux Foundation, is quoted in the CDLA press release as saying:

“An open data license is essential for the frictionless sharing of the data that powers both critical technologies and societal benefits…”

Well stated! Open licences for software, data, and other creative works are more important than ever for powering the innovation and invention that move us forward as an expanding civil society. They enable collaboration, without which we’d each be stuck in our own silos, continually reinventing the wheel.

What confuses me, however, is why The Linux Foundation decided to go its own direction rather than support and enhance the well established and understood Creative Commons licenses. There is no mention in the press release that other open data license options already exist, let alone whether they may have some shortcomings that the CDLA is intended to address. But those licenses do exist already, and their use is well documented and accepted by institutions worldwide. Does The Linux Foundation object to these licenses in some way that they created their own? Why fracture the data licensing space in this way?

In fact, Andy Updegrove, in a blog post welcoming CDLA into the world, stated that part of the motivation for its creation was specifically to avoid the license proliferation that plagued the early years of free and open source software and that was (and is) admirably curtailed by Open Source Initiative, its Open Source Definition, and its list of OSI-Approved licenses. By creating their own open data licenses instead of throwing their considerable weight behind the existing Creative Commons licenses, isn’t The Linux Foundation contributing to the license proliferation Updegrove says they’re trying to avoid?

That’s one concern I have with this announcement. The other is related to the source and warden of this new initiative.

The Linux Foundation is registered as a 501(c)(6) non-profit organisation. Colloquially known as a c6, these non-profit organisations are formed to promote shared business interests and are beholden to their members (typically businesses or business people themselves) to do just that. In the case of The Linux Foundation, it was originally formed to promote Linux in the business and enterprise space. It has done that brilliantly, and Linux is now the default operating system in servers, automotive, embedded devices, and just about anything else you can think of (save desktops, where it still has a long way to go).

The CDLA, however, claims it is not solely for businesses:

The CDLA licenses are designed to help governments, academic institutions, businesses and other organizations open up and share data, with the goal of creating communities that curate and share data openly.

While I wholeheartedly support the mission of helping governments, academic institutions, and other organisations open up and share data, I believe these institutions would be better served were the initiative brought by an organisation that is accountable to more than simply its members and their business interests. The Linux Foundation undoubtedly has the best intentions here. I do not question that. I do, however, question having the open data of governments and academic institutions being promoted by a group that by definition must dedicate itself to business interests. While undoubtedly The Linux Foundation would not intend to do so, as a c6 there is a risk that the business interests they are pledged to promote might influence the evolution and application of the CDLA licenses.

The optics of the CDLA initiative would be greatly improved were it instead managed by a 501(c)(3) non-profit. c3 non-profits have no accountability to business interests and instead operate solely for the public good (for, admittedly, very large values of both “public” and “good”). For instance, Creative Commons is a c3. Its mission and vision directly support the open sharing of data from government and academic institutions, as well as businesses, independent researchers, and any other data source:

Creative Commons develops, supports, and stewards legal and technical infrastructure that maximizes digital creativity, sharing, and innovation. Our vision is nothing less than realizing the full potential of the Internet — universal access to research and education, full participation in culture — to drive a new era of development, growth, and productivity.

Again, I find myself confused why The Linux Foundation felt it necessary to proliferate open data licenses in this way. Hopefully more information and statements are forthcoming to explain their fractionalising move.