I’ve been following WebRTC (Real Time Communications) because (1) it is probably the most significant addition to the web in terms of enabling a new class of applications at least since the introduction of Ajax (1998, standardized by 2006), and perhaps since the introduction of JavaScript (1995, standardized by 1997). The IETF working group charter puts it well (another part of the work is at W3C):

There are a number of proprietary implementations that provide direct interactive rich communication using audio, video, collaboration, games, etc. between two peers’ web-browsers. These are not interoperable, as they require non-standard extensions or plugins to work. There is a desire to standardize the basis for such communication so that interoperable communication can be established between any compatible browsers. The goal is to enable innovation on top of a set of basic components. One core component is to enable real-time media like audio and video, a second is to enable data transfer directly between clients.

(See pad.textb.org (source) for one simple application; SimpleWebRTC seems to be a popular library for building WebRTC applications.)

And (2) because WebRTC is the scene of the latest fight to protect open web standards from rent seekers.

The IETF working group is choosing between H.264 Constrained Baseline Profile Level 1.2 and VP8 as the Mandatory To Implement (MTI) video codec for WebRTC (meaning all applications can count on that codec being available). Because of their respective patent situations, H.264 cannot be included in free and open source software, while VP8 can. (For audio-only WebRTC applications, the free Opus codec seems to be a non-controversial requirement.)

Cisco has recently promised that in 2014 they will make available a binary implementation of H.264 for which they will pay license fees for all comers (there is an annual cap on fees, allowing them to do this). That’s nice of them, but the offer is far from ideal for any software (a binary must be downloaded from Cisco servers for each user), and a nonstarter for applications without some kind of plugin system, and for free and open source software distributions, which must be able to modify source code.

Last week I remotely attended a meeting on the MTI video codec choice. No consensus was reached; discussion continues on the mailing list. One interesting thing about the non-consensus was the split between physical attendees (50% for H.264 and 30% for VP8) and remote attendees (20% for H.264, 80% for VP8). A point mentioned several times was the differing interests of “big players” (mostly fine with paying H.264 fees, and already using it in various other products) and “little players” (for whom fees are significant, e.g. startups, or impossible, e.g. free and open source projects); depending on one’s perspective, the split between physical and remote attendees shows how venue biases participation in one or both directions.

Jonathan Rosenberg, the main presenter for H.264, at about 22 minutes into a recording segment:

I would love it if all patents evaporated, if all the stuff was open source in ways that we could use, and we didn’t have to deal with any of this mess.

The argument for why H.264 is the best choice for dealing with “this mess” boils down to H.264 having a longer history and broader adoption than VP8 (in other applications; the two implementations of WebRTC so far, in recent versions of Chrome and Firefox, exclusively use VP8).

Harald Alvestrand, the main presenter for VP8, at about 48 minutes into another recording segment:

Development of codecs has been massively hampered and held back by the fact that it has been done in a fashion that has served to maximize the patent encumbrances on codecs. Sooner or later, we should see a way forward to abandon the dependence on encumbered codecs also for video software. My question, at this juncture, is if not now, when?

Unsurprisingly, I find this (along with the unworkability of H.264 for free and open source software) a much more compelling argument. The first step toward making patents evaporate (or at least irrelevant for digital video) is to select a codec which has been developed to maximize freedom, rather than developed to maximize encumbrances and rent collection.

What are individuals and entities pushing H.264 as the best codec for now, given the mess, doing for the longer term? Are they working on H.265, in order to bake in rents for the next generation? Or are they contributing to VP9, the next-next generation Daala, and the elimination of software patents?

Addendum: Version of this post sent to rtcweb@ietf.org (and any followups).