Full disclosure: I’m one of the primary authors and editors of the JSON-LD specification. I am also the chair of the group that created JSON-LD and have been an active participant in a number of Linked Data initiatives: RDFa (chair, author, editor), JSON-LD (chair, co-creator), Microdata (primary opponent), and Microformats (member, haudio and hvideo microformat editor). I’m biased, but also well informed.

JSON-LD has been getting a great deal of good press lately. It was adopted by Google, Yahoo, Yandex, and Microsoft for use in schema.org. The PaySwarm universal payment protocol is based on it. It was integrated with Google’s Gmail service, and the open social networking folks have started integrating it into the Activity Streams 2.0 work.

That all of these positive adoption stories exist is precisely why Shane Becker’s post on why JSON-LD is an Unneeded Spec was so surprising. If you haven’t read it yet, you may want to, as the rest of this post dissects the arguments he makes (it’s a pretty quick 5 minute read). The post is a broad brush opinion piece based on a number of factual errors and misinformed opinion. I’d like to clear up these errors in this blog post and underscore some of the reasons JSON-LD exists and how it has been developed.

A theatrical interpretation of the “JSON-LD is Unneeded” blog post

Shane starts with this claim:

Today I learned about a proposed spec called JSON-LD. The “LD” is for linked data (Linked Data™ in the Uppercase “S” Semantic Web sense).

When I started writing the original JSON-LD specification, one of the goals was to try and merge lessons learned in the Microformats community with lessons learned during the development of RDFa and Microdata. This meant figuring out a way to marry the lowercase semantic web with the uppercase Semantic Web in a way that was friendly to developers. For developers that didn’t care about the uppercase Semantic Web, JSON-LD would still provide a very useful data structure to program against. In fact, Microformats, which are the poster-child for the lowercase semantic web, were supported by JSON-LD from day one.

Shane’s article is misinformed with respect to the assertion that JSON-LD is solely for the uppercase Semantic Web. JSON-LD is mostly for the lowercase semantic web, the one that developers can use to make their applications exchange and merge data with other applications more easily. JSON-LD is also for the uppercase Semantic Web, the one that researchers and large enterprises are using to build systems like IBM’s Watson supercomputer, search crawlers, Gmail, and open social networking systems.

Linked data. Web sites. Standards. Machine readable.

Cool. All of those sound good to me. But they all sound familiar, like we’ve already done this before. In fact, we have.

We haven’t done something like JSON-LD before. I wish we had because we wouldn’t have had to spend all that time doing research and development to create the technology. When writing about technology, it is important to understand the basics of a technology stack before claiming that we’ve “done this before”. An astute reader will notice that at no point in Shane’s article is any text from the JSON-LD specification quoted, just the very basic introductory material on the landing page of the website. More on this below.

Linked data

That’s just the web, right? I mean, we’ve had the <a href> tag since literally the beginning of HTML / The Web. It’s for linking documents. Documents are a representation of data.

Speaking as someone that has been very involved in the Microformats and RDFa communities, yes, it’s true that the document-based Web can be used to publish Linked Data. The problem is that the standard way of expressing a link to another piece of data that can be followed did not carry over to the data-based Web. That is, most JSON-based APIs don’t have a standard way of encoding a hyperlink.

The other implied assertion with the statement above is that the document-based Web is all we need. If this were true, sending HTML documents to Web applications would be all we needed. Web developers know that this isn’t the case today for a number of obvious reasons. We send JSON data back and forth on the Web when we need to program against things like Facebook, Google, or Twitter’s services. JSON is a very useful data format for machine-to-machine data exchange. The problem is that JSON data has no standard way of doing a variety of things we do on the document-based Web, like expressing links, expressing the types of data (like times and dates), and a variety of other very useful features for the data-based Web. This is one of the problems that JSON-LD addresses.
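To make that gap concrete, here is a minimal Python sketch. It is not a JSON-LD processor: the helper `terms_of_type` and the `example.org` vocabulary URLs are hypothetical, invented for illustration, and only the keywords `@context`, `@id`, and `@type` come from JSON-LD itself. It shows that a plain JSON record gives a consumer nothing to distinguish a link or a date from an ordinary string, while a context can declare both.

```python
# Datatype IRI for calendar dates, from XML Schema.
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# Plain JSON: every value is just a string. Nothing marks "spouse" as a
# link or "born" as a date.
plain = {
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

# The same record with a hypothetical inline JSON-LD context.
# The example.org vocabulary URLs are assumptions made for this sketch.
linked = {
    "@context": {
        "name": "http://example.org/vocab#name",
        "born": {"@id": "http://example.org/vocab#born", "@type": XSD_DATE},
        "spouse": {"@id": "http://example.org/vocab#spouse", "@type": "@id"},
    },
    "@id": "http://dbpedia.org/resource/John_Lennon",
    **plain,
}


def terms_of_type(doc, type_iri):
    """Return {term: value} for terms whose context entry declares type_iri.

    Hypothetical helper: it only inspects an inline context and ignores
    remote contexts, nesting, and everything else a real processor handles.
    """
    context = doc.get("@context", {})
    return {
        term: doc[term]
        for term, definition in context.items()
        if isinstance(definition, dict)
        and definition.get("@type") == type_iri
        and term in doc
    }


links = terms_of_type(linked, "@id")     # values that are followable links
dates = terms_of_type(linked, XSD_DATE)  # values that are dates
```

Running the same helper on `plain` finds nothing, because the plain record carries no statement about what its values mean.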

Web sites

If it’s not wrapped in HTML and viewable in a browser, is it really a website? JSON isn’t very useful in the browser by itself. It’s not style-able. It’s not very human-readable. And worst of all, it’s not clickable.

Websites are composed of many parts. It’s a weak argument to say that if a site is mainly composed of data that isn’t in HTML, and isn’t viewable in a browser, that it’s not a real website. The vast majority of websites like Twitter and Facebook are composed of data and API calls with a relatively thin varnish of HTML on top. JSON is the primary way that applications interact with these and other data-driven websites. It’s almost guaranteed these days that any company that has a popular API uses JSON in their Web service protocol.

Shane’s argument here is pretty confused. It assumes that the primary use of JSON-LD is to express data in an HTML page. Sure, JSON-LD can do that, but focusing on that brush stroke is missing the big picture. The big picture is that JSON-LD allows applications that use it to share data and interoperate in a way that is not possible with regular JSON, and it’s especially useful when used in conjunction with a Web service or a document-based database like MongoDB or CouchDB.
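A rough sketch of what that interoperability looks like, in Python. The two application payloads, their key names, and the `example.org` vocabulary are hypothetical, and `expand` is a drastically simplified stand-in for the real JSON-LD expansion algorithm (which also handles nested contexts, typed values, arrays, and much more). The point it illustrates is that two services can use different local key names and still agree on meaning, because each key is mapped to a shared identifier.

```python
# Shared identifiers both applications map their keys onto.
# The example.org vocabulary is an assumption made for this sketch.
NAME = "http://example.org/vocab#name"
BORN = "http://example.org/vocab#born"
LENNON = "http://dbpedia.org/resource/John_Lennon"

# Application A calls the property "name".
app_a = {
    "@context": {"name": NAME},
    "@id": LENNON,
    "name": "John Lennon",
}

# Application B calls the same property "fullName" and adds a birthday.
app_b = {
    "@context": {"fullName": NAME, "birthday": BORN},
    "@id": LENNON,
    "fullName": "John Lennon",
    "birthday": "1940-10-09",
}


def expand(doc):
    """Rename each key to the identifier its context maps it to.

    Simplified: supports only an inline context whose values are plain
    strings. Keys without a mapping, such as "@id", pass through unchanged.
    """
    context = doc.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }


# After expansion the local key names no longer matter, so records that
# describe the same "@id" can be merged with an ordinary dict update.
merged = {**expand(app_a), **expand(app_b)}
```

With regular JSON, the code doing the merge would need out-of-band knowledge that `name` and `fullName` mean the same thing; here that knowledge travels with the data.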

Standards based

To their credit, JSON-LD did license their website content Creative Commons CC0 Public Domain. But, the spec itself isn’t. It’s using (what seems to be) a W3C boilerplate copyright / license. Copyright © 2010-2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.

Nope. The JSON-LD specification has been released under a Creative Commons Attribution 3.0 license multiple times in the past, and it will be released under a Creative Commons license again, most probably CC0. The JSON-LD specification was developed in a W3C Community Group using a Creative Commons license and then released to be published as a Web standard via W3C using their W3C Community Final Specification Agreement (FSA), which allows the community to fork the specification at any point in time and publish it under a different license.

When you publish a document through the W3C, they have their own copyright, license, and patent policy associated with the document being published. There is a legal process in place at W3C that asserts that companies can implement W3C published standards in a patent and royalty-free way. You don’t get that with CC0. In fact, you don’t get any such vetting of the technology or any level of patent and royalty protection.

What we have with JSON-LD is better than what is proposed in Shane’s blog post. You get all of the benefits of having W3C member companies vet the technology for technical and patent issues while also being able to fork the specification at any point in the future and publish it under a license of your choosing as long as you state where the spec came from.

Machine readable

Ah… “machine readable”. Every couple of years the current trend of what machine readable data should look like changes (XML/JSON, RSS/Atom, xml-rpc/SOAP, rest/WS-*). Every time, there are the same promises. This will solve our problems. It won’t change. It’ll be supported forever. Interoperability. And every time, they break their promises. Today’s empires, tomorrow’s ashes.

At no point has any core designer of JSON-LD claimed 1) that JSON-LD will “solve our problems” (or even your particular problem), 2) that it won’t change, or 3) that it will be supported forever. These are straw-man arguments. The current consensus of the group is that JSON-LD is best suited to a particular class of problems and that some developers will have no need for it. JSON-LD is guaranteed to change in the future to keep pace with what we learn in the field, and we will strive for backward compatibility for features that are widely used. Without modification, standardized technologies have a shelf life of around 10 years, 20-30 if they’re great. The designers of JSON-LD understand that, like the Web, JSON-LD is just another grand experiment. If it’s useful, it’ll stick around for a while; if it isn’t, it’ll fade into history. I know of no great software developer or systems designer that has ever made these three claims and been serious about it.

We do think that JSON-LD will help Web applications interoperate better than they do with plain ol’ JSON. For an explanation of how, there is a nice video introducing JSON-LD.

With respect to the “Today’s empires, tomorrow’s ashes” cynicism, we’ve already seen a preview of the sort of advances that Web-based machine-readable data can unleash. Google, Yahoo!, Microsoft, Yandex, and Facebook all use a variety of machine-readable data technologies that have only recently been standardized. These technologies allow for faster, more accurate, and richer search results. They are also the driving technology for software systems like Watson. These systems exist because there are people plugging away at the hard problem of machine readable data in spite of cynicism directed at past failures. Those failures aren’t ashes, they’re the bedrock of tomorrow’s breakthroughs.

Instead of reinventing the everything (over and over again), let’s use what’s already there and what already works. In the case of linked data on the web, that’s html web pages with clickable links between them.

Microformats, Microdata, and RDFa do not work well for data-based Web services. Using Linked Data with data-based Web services is one of the primary reasons that JSON-LD was created.

For open standards, open license are a deal breaker. No license is more open than Creative Commons CC0 Public Domain + OWFa. (See also the Mozilla wiki about standards/license, for more.) There’s a growing list of standards that are already using CC0+OWFa.

I think there might be a typo here, but if not, I don’t understand why open licenses are a deal breaker for open standards, especially things like the W3C FSA or the Creative Commons licenses we’ve published the JSON-LD spec under. Additionally, CC0 + OWFa might be neat. Shane’s article was the first time that I had heard of OWFa and I’d be a proponent for pushing it in the group if it granted more freedom to the people using and developing JSON-LD than the current set of agreements we have in place. After skimming the legal text of the OWFa, I can’t see what CC0 + OWFa buys us over CC0 + W3C patent attribution. If someone would like to make these benefits clear, I could take a proposal to switch to CC0 + OWFa to the JSON-LD Community Group and see if there is interest in using that license in the future.

No process is more open than a publicly editable wiki.

A counter-point to publicly accessible forums

Publicly editable wikis are notorious for edit wars; they are not a panacea. Just because you have a wiki does not mean you have an open community. For example, the Microformats community was notorious for having a different class of unelected admins that would meet in San Francisco and make decisions about the operation of the community. This seemingly innocuous practice would creep its way into the culture and technical discussion on a regular basis, leading to community members being banned from time to time. Similarly, Wikipedia has had numerous issues with publicly editable wikis and the behavior of their admins.

Depending on how you define “open”, there are a number of processes that are far more open than a publicly editable wiki. For example, the JSON-LD specification development process is completely open to the public, based on meritocracy, and consensus-driven. The mailing list is open. The bug tracker is open. We have weekly design teleconferences where all the audio is recorded and minuted. We have these teleconferences to this day and will continue to have them into the future because we make transparency a priority. As far as I know, JSON-LD is the first specification in the world to be developed with all of the previously described operating guidelines as standard practice.

(Mailing lists are toxic.)

A community is as toxic as its organizational structure enables it to be. The JSON-LD community is based on meritocracy and consensus, and it has operated in a very transparent manner since the beginning (open meetings, all calls are recorded and minuted, anyone can contribute to the spec, etc.). This has, unsurprisingly, resulted in a very pleasant and supportive community. That said, there is no perfect communication medium. They’re all lossy and they all have their benefits and drawbacks. Sometimes, when you combine multiple communication channels as a part of how your community operates, you get better outcomes.

Finally, for machine readable data, nothing has been more widely adopted by publishers and consumers than microformats. As of June 2012, microformats represents about 70% of all of the structured data on the web. And of that ~70%, the vast majority was h-card and xfn. (All RDFa is about 25% and microdata is a distant third.)

Microformats are good if all you need to do is publish your basic contact and social information on the Web. If you want to publish detailed product information, financial data, medical data, or address other more complex scenarios, Microformats won’t help you. There have been no new Microformats released in the last 5 years and the mailing list traffic has been almost non-existent for around 5 years. From what I can tell, most everyone has moved on to RDFa, Microdata, or JSON-LD.

There are a few that are working on Microformats 2, but I haven’t seen anything that it provides that is not already provided by existing solutions that also have the added benefit of being W3C standards or backed by major companies like Google, Facebook, Yahoo!, Microsoft, and Yandex.

Maybe it’s because of the ease of publishing microformats. Maybe it’s the open process for developing the standards. Maybe it’s because microformats don’t require any additions to HTML. (Both RDFa and microdata required the use of additional attributes or XML namespaces.) Whatever the reason, microformats has the most uptake. So, why do people keep trying to reinvent what microformats is already doing well?

People aren’t reinventing what Microformats are already doing well. They’re attempting to address problems that Microformats do not solve.

For example, one of the reasons that Google adopted JSON-LD is because markup was much easier in JSON-LD than it was in Microformats, as evidenced by the example below:

Back to JSON-LD. The “Simple Example” listed on the homepage is a person object representing John Lennon. His birthday and wife are also listed on the object.

    {
      "@context": "http://json-ld.org/contexts/person.jsonld",
      "@id": "http://dbpedia.org/resource/John_Lennon",
      "name": "John Lennon",
      "born": "1940-10-09",
      "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
    }

I look at this and see what should have been HTML with microformats (h-card and xfn). This is actually a perfect use case for h-card and xfn: a person and their relationship to another person. Here’s how it could’ve been marked up instead.

    <div class="h-card">
      <a href="http://dbpedia.org/resource/John_Lennon" class="u-url u-uid p-name">John Lennon</a>
      <time class="dt-bday" datetime="1940-10-09">October 9<sup>th</sup>, 1940</time>
      <a rel="spouse" href="http://dbpedia.org/resource/Cynthia_Lennon">Cynthia Lennon</a>.
    </div>

I’m willing to bet that most people familiar with JSON will find the JSON-LD markup far easier to understand and get right than the Microformats-based equivalent. In addition, sending the Microformats markup to a REST-based Web service would be very strange. Alternatively, sending the JSON-LD markup to a REST-based Web service would be far more natural for a modern day Web developer.
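As a sketch of how natural that is, the following Python builds (but does not send) a POST request carrying the John Lennon document. The endpoint `https://api.example.com/people` is hypothetical; `application/ld+json` is the media type registered for JSON-LD. Only the standard library is used.

```python
import json
import urllib.request

# The "Simple Example" from the JSON-LD homepage, as a Python dict.
document = {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

# Build the request object. The URL is a made-up endpoint for illustration.
request = urllib.request.Request(
    "https://api.example.com/people",
    data=json.dumps(document).encode("utf-8"),
    headers={"Content-Type": "application/ld+json"},
    method="POST",
)

# urllib.request.urlopen(request) would send it. That call is left out
# here because the endpoint does not exist.
```

The body is ordinary JSON, so any existing JSON tooling on the receiving side can parse it even if it ignores the `@context` entirely.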

This HTML can be easily understood by machine parsers and humans parsers. Microformats 2 parsers already exists for: JavaScript (in the browser), Node.js, PHP and Ruby. HTML + microformats2 means that machines can read your linked data from your website and so can humans. It means that you don’t need an “API” that is something other than your website.

You have been able to do the same thing, and much more, using RDFa and Microdata for far longer (since 2006) than you have been able to do it in Microformats 2. Let’s be clear, there is no significant advantage to using Microformats 2 over RDFa or Microdata. In fact, there are a number of disadvantages for using Microformats 2 at this point, like little to no support from the search companies, very little software tooling, and an anemic community (of which I am a member) for starters. Additionally, HTML + Microformats 2 does not address the Web service API issue at all.

Please don’t waste time and energy reinventing all of the wheels. Instead, please use what already works and what works the webby way.

Do not miss the irony of this statement. RDFa has been doing what Microformats 2 does today since 2006, and it’s a Web standard. Even if you don’t like RDFa 1.0, note that RDFa 1.1, RDFa Lite 1.1, and Microdata all came before Microformats 2. To assert that wheels should not be reinvented and then promote Microformats 2, which was created long after a number of well-established solutions already existed, is quite a strange position to take.

Conclusion

JSON-LD was created by people that have been directly involved in the Linked Data, lowercase semantic web, uppercase Semantic Web, Microformats, Microdata, and RDFa work. It has proven to be useful to them. There are a number of very large technology companies that have adopted JSON-LD, further underscoring its utility. Expect more big announcements in the next six months. The JSON-LD specifications have been developed in a radically open and transparent way, and the document copyright and licensing provisions are equally open. I hope that this blog post has helped clarify most of the misinformed opinion in Shane Becker’s blog post.

Most importantly, cynicism will not solve the problems that we face on the Web today. Hard work will, and there are very few communities that I know of that work harder and more harmoniously than the excellent volunteers in the JSON-LD community.

If you would like to learn more about Linked Data, a good video introduction exists. If you want to learn more about JSON-LD, there is a good video introduction to that as well.