[whatwg] Codecs for <audio> and <video>

[This message is bcc'ed to around 100 people who at some point or other sent comments to the WHATWG list on this topic.]

After an inordinate amount of discussion, both in public and privately, on the situation regarding codecs for <video> and <audio> in HTML5, I have reluctantly come to the conclusion that there is no suitable codec that all vendors are willing to implement and ship.

I have therefore removed the two subsections in the HTML5 spec in which codecs would have been required, and have instead left the matter undefined, as has in the past been done with other features like <img> and image formats, <embed> and plugin APIs, or Web fonts and font formats.

The current situation is as follows:

   Apple refuses to implement Ogg Theora in Quicktime by default (as used by Safari), citing lack of hardware support and an uncertain patent landscape.

   Google has implemented H.264 and Ogg Theora in Chrome, but cannot provide the H.264 codec license to third-party distributors of Chromium, and has indicated a belief that Ogg Theora's quality-per-bit is not yet suitable for the volume handled by YouTube.

   Opera refuses to implement H.264, citing the obscene cost of the relevant patent licenses.

   Mozilla refuses to implement H.264, as they would not be able to obtain a license that covers their downstream distributors.

   Microsoft has not commented on their intent to support <video> at all.

(Sorry if I've mischaracterised any positions above; the positions are relatively subtle and so it's likely that I have oversimplified matters.)

I considered requiring Ogg Theora support in the spec, since we do have three implementations that are willing to implement it, but it wouldn't help get us true interoperability, since the people who are willing to implement it are willing to do so regardless of the spec, and the people who aren't are not going to be swayed by what the spec says.

Going forward, I see several (not mutually exclusive) possibilities, all of which will take several years:

 1. Ogg Theora encoders continue to improve. Off-the-shelf hardware Ogg Theora decoder chips become available. Google ships support for the codec for long enough without getting sued that Apple's concern regarding submarine patents is reduced. => Theora becomes the de facto codec for the Web.

 2. The remaining H.264 baseline patents owned by companies who are not willing to license them royalty-free expire, leading to H.264 support being available without license fees. => H.264 becomes the de facto codec for the Web.

When either of these happens, I will reconsider updating HTML5 accordingly.

The situation for audio codecs is similar, but less critical as there are more formats. Since audio has a much lower profile than video, I propose to observe the audio feature and see if any common codecs surface, instead of specifically requiring any. I will revisit this particular topic in the future when common codecs emerge.

I would encourage proponents of particular codecs to attempt to address the points listed above, as eventually I expect one codec will emerge as the common codec, but not before it fulfills all these points:

 - is implementable without cost and distributable by anyone
 - has off-the-shelf decoder hardware chips available
 - is used widely enough to justify the extra patent exposure
 - has a quality-per-bit high enough for large volume sites

This topic received hundreds of e-mails. Most covered the same points, and I have not replied to each one individually. I include below a small sample of some of the more interesting e-mails that were sent on this topic, along with some comments.

On Wed, 21 Mar 2007, Asbjørn Ulsberg wrote:
>
> I think that specifying a mandatory baseline codec is so valuable that
> it will be more gained than lost from doing it.
It will enable authors > to use one baseline format in all of their videos without thinking about > browser support. Only if they choose another codec will they have to > test for support in browsers, because its support isn't required by the > HTML specification. I agree in principle. Sadly it seems that we are unable to force the issue through the spec. On Thu, 22 Mar 2007, Thomas Davies wrote: > > Having been pointed at this discussion by Christian, I thought I'd let > you know a bit more about where Dirac is as a royalty-free open source > codec. We're certainly very keen for Dirac to be considered as one of > the supported video formats. It's unclear to me why Dirac hasn't received as close investigation as Theora and H.264. I encourage you to approach the browser vendors directly and discuss it with them. I expect, however, that the situation is basically the same as with Theora (some UAs would be happy to support it; others would cite lack of off-the-shelf hardware decoders and an unclear patent landscape). > We have been developing Dirac hardware as well. Hardware for the > professional applications will be on sale in a very few weeks, and we're > developing a prototype hardware HDTV encoder too. Is the hardware support something that could be used by Apple in iPods? On Sat, 31 Mar 2007, Martin Atkins wrote: > > If there is no baseline codec in the specification, I firmly believe > that one of the following will happen: > > * Everyone will end up implementing whatever Microsoft implements. > * Microsoft won't implement <video> anyway, so no-one will use it. > > In practice, everyone's just mimicking whatever Microsoft does. At least > when they violate the spec they can be called on it; if what they do is > allowable by the spec, then everyone will have to copy it or they'll > have a useless browser. I think this gives the spec more power than it actually has. 
On Tue, 11 Dec 2007, Christian Montoya wrote: > On 12/11/07, ryan <ryan at theryanking.com> wrote: > > On Dec 11, 2007, at 11:28 AM, Christian Montoya wrote: > > > If even just 3 browsers, IE, Firefox, and Opera, supported OGG as a > > > de facto HTML standard, and Safari did its own thing, that would > > > still be a thousand times better than the crap we web developers > > > deal with now. > > > > Even though the spec doesn't require these vendors to support OGG, > > they can still do so. > > Yes, but if it is not required, then there is no way of telling whether > or not that support will be permanent. Sadly, even if the spec does require something, it's no guarantee that it'll remain implemented. The spec doesn't force browsers to do anything. Implementors only do the parts they want to do. > > How do you propose that the WHATWG help web developers without browser > > makers? > > By making OGG part of the spec. Unfortunately, it seems that this would not force Apple to implement it. On Tue, 11 Dec 2007, Dan Dorman wrote: > On Dec 11, 2007 9:06 AM, Joseph Harry <jharry at lapcat.org> wrote: > > One thing to remember, HTML is created by people who can be bought, > > and it is clearly what has happened here. > > Hey, let's not get carried away. Ian et al. have been working tirelessly > and scrupulously on this spec; there's no reason to cast aspersions on > anyone's character. Joseph is right that I can be bought... but sadly for me this has never happened with HTML5. :-( I guess I picked the wrong area to work in if I wanted to make money through bribes! On Tue, 11 Dec 2007, Fernando wrote: > > Please reconsider the decision to exclude the recommendation of the > Theora/OGG Vorbis codec in HTML 5 guidelines. 
> > I expect that in a sophisticated group such as this one: > > * skepticism with how well the interests of powerful corporations match > those of individuals that are not their employees or shareholders; > > * an understanding of the economic and civil rights damage being done to > the rights of individuals by proprietary formats; and > > * an understanding of the wisdom behind the original wording of this > portion of the document; > > Will enable you to see the need to readmit common sense and wisdom into > HTML 5 by including OGG. The problem is that at this point whether the spec requires Ogg or not won't affect which browsers support it. All the browsers that would support it do support it; the other browsers would just ignore that part of the spec, which seems like a bad precedent to set. On Tue, 11 Dec 2007, Jeff McAdams wrote: > > Wait...Apple and Nokia posit an potential patent threat as justification > to remove the text, but patent and other "Intellectual Property" reasons > aren't justification for putting it back? The text was removed not because of any specific reason Nokia or Apple gave, but because they won't implement the requirement. It actually doesn't matter what the reason is in terms of editing the spec. Mozilla could say "we don't want to support H.264 because numerology says that 264 is an evil number", the end result would be the same -- if a browser refuses to implement something, then we can't require it. (The reasons are relevant when trying to convince them to change their mind, of course, or when trying to find a solution that they would agree to instead -- I'm not saying we should ignore the reason altogether.) On Tue, 11 Dec 2007, Manuel Amador (Rudd-O) wrote: > > > > Actually those are pretty much the only reasons being taken into > > account here. 
Sadly, Ogg doesn't keep the Web free of IP licensing > > horrors, due to the submarine patent issue -- as Microsoft experienced > > with MP3 and with the Eolas patent over the past few years, for > > instance, even things that seem to have well-understood patent > > landscapes can be unexpectedly attacked by patent trolls. > > > > This does suggest we need patent reform, but in practice this is out > > of scope for HTML5's development. We can't design our spec on the > > assumption that the patent system will be reformed. > > Interesting. Finally patents have brought free multimedia innovation to > a standstill. Two quite long paragraphs to say "we admit defeat". No, that was just a tactical withdrawal. This e-mail here is the one that admits defeat. :-) > > In the absence of IP constraints, there are strong technical reasons > > to prefer H.264 over [Theora]. For a company like Apple, where the > > MPEG-LA licensing fee cap for H.264 is easily reached, the technical > > reasons are very compelling. > > [...] Sure, Theora simply can't compress as good as 264. But Theora is > free and its related patents have been irrevocably granted to the world. "In the absence of IP constraints" was a very important phrase. :-) > > The problem is that if the big players don't follow the spec, even the > > SHOULD requirements, then the spec is basically pointless. What we > > want isn't that some people support Ogg, what we fundamentally want is > > that _everyone_ support the same codec, whatever that may be. > > Therefore, put Ogg Vorbis/Theora in the spec, and let everyone implement > it. Putting Ogg Theora in the spec doesn't lead to Apple implementing it, it just leads to them ignoring that part of the spec. > The two bullies that don't want to implement it simply don't get > the content delivered to their machines YouTube isn't going to not support the iPhone just because Apple doesn't follow HTML5. 
They're just going to send the iPhone H.264 content (as they do now), leading to Apple's products using less bandwidth or having higher quality video than the implementations that did follow the spec. That doesn't sound like a particularly good win for the spec.

> OR authors who would like to cater to bullies could use the JavaScript
> posted in the News section of Ogg Theora that automatically turns
> standards-conformant VIDEO into legacy crap. Brilliant and gracefully
> degradable.

<video> itself supports multiple sources, so there's no need for JavaScript to do this. But it does mean we end up with exactly the situation we're in now, with different implementations supporting different codecs and the spec not having any power over this. I would rather the spec not say anything than say something that will be ignored.

> > I don't see how this affects Apple's stance here. Today they can get
> > significant traction with just H.264 (for example, Google is also
> > moving to H.264 and Apple can therefore implement YouTube applications
> > on iPhone without using anything but H.264). With Ogg, they get very
> > little traction, yet significant financial risk.
>
> That's no reason to NOT SUGGEST Ogg Vorbis / Theora. No one here is
> saying that HTML5 should forbid proprietary codecs -- all we're claiming
> for is the judicious and well-deserved mention of two free technologies
> in a document that will be read by MILLIONS of people to come. And you
> just killed that.

HTML5 is not an advertising platform for free codecs. It's a description of what browsers (and other user agent classes) implement.

> > Small companies aren't targetted by patent trolls. Only big (really
> > big) companies are.
>
> And therefore they're deserving of more protection? Sounds like a
> racket to me.

I believe patent trolls pretty much are the definition of a racket, yes.

> > I am sorry you perceive them this way.
>
> Be honest, don't tell me you're sorry because you are not.
I am incredibly sorry about the state of video codecs in HTML5. Truly, I am. This is a terrible situation for the spec to be in. I wish we had good answers instead of this quagmirish deadlock. > You're sorry when something personally sad happens to someone you know, > not when there's a perfectly valid disagreement on an action you took. I really am sorry in this particular case. Possibly, having worked on HTML5 for the past few years, it has become like a person I know. :-) > [...snip text about fear and bad-faith tactics...] > > > If we require Ogg, then what will happen is the big players will > > support something else, then that will become the de-facto standard, > > and you will get screwed. What we _want_ is for everyone to support > > the same codec. We don't get that by having a SHOULD-level requirement > > for Ogg. > > Well, tough luck, you can't. That is indeed my conclusion also, at this time. > The next-best option is Ogg, that favors small independent content > producers. That seems to be what Opera, Mozilla, and Chrome are implementing. > But no-siree, we can't have that, can we? Well we can have the implementations, but there's not much point having the spec require it if it's not going to be followed by everyone. > > At the end of the day, the browser vendors have a very effective > > absolute veto on anything in the browser specs, > > You mean they have the power to derail a spec? They have the power to not implement the spec, turning the spec from a useful description of implementations into a work of fiction. > That's something I would have considered before the advent of Mozilla > Firefox. Mozilla also has the power of veto here. For example, if we required that the browsers implement H.264, and Mozilla did not, then the spec would be just as equally fictional as it would be if today we required Theora. 
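
The remark above that <video> itself supports multiple sources can be modeled with a minimal sketch. It assumes a simplified rule in which the user agent walks the listed sources in order and plays the first one whose MIME type it supports; real resource selection in a browser is more involved. The file names, the `pick_source` helper, and the supported-type sets are hypothetical and exist only for illustration.

```python
from typing import Optional, Sequence, Set, Tuple

# A source is modeled as (url, MIME type). Both values are hypothetical.
Source = Tuple[str, str]


def pick_source(sources: Sequence[Source], supported: Set[str]) -> Optional[str]:
    """Return the URL of the first source whose MIME type is supported.

    Simplified model: sources are tried in document order, and None is
    returned when nothing is playable.
    """
    for url, mime in sources:
        if mime in supported:
            return url
    return None


# One author-supplied list, offered to every user agent.
SOURCES = [("clip.ogv", "video/ogg"), ("clip.mp4", "video/mp4")]

# Hypothetical user agents with different codec support.
theora_only = {"video/ogg"}
h264_only = {"video/mp4"}

print(pick_source(SOURCES, theora_only))  # clip.ogv
print(pick_source(SOURCES, h264_only))    # clip.mp4
print(pick_source(SOURCES, set()))        # None
```

The sketch shows why the fallback works without script, and also why it leaves authors encoding every video twice: each user agent simply picks a different entry from the same list.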
On Thu, 13 Dec 2007, Shannon wrote: > > Ian, are you saying that not implementing a SHOULD statement in the spec would > make a browser non-compliant with HTML5? > Are you saying that if a vendor does not implement the OPTIONAL Ogg support > then they would not use HTML5 at all? No, I'm just saying that there's not much point requiring a codec unless everyone implements it. We don't gain anything saying "you can do Theora, or you can do something else, you know, whatever you feel like". Generally speaking, we don't specify what other formats are to be supported by an HTML implementation, the only reason to make an exception would be if we could get uniform support across all implementations. > What will it take to get this (apparently unilateral) change revoked? I would be happy to change the spec as soon as all the implementors are willing to implement a common codec. > Finally, what is Google/YouTube's official position on this? As I understand it, based on other posts to this mailing list in recent days: Google ships both H.264 and Theora support in Chrome; YouTube only supports H.264, and is unlikely to use Theora until the codec improves substantially from its current quality-per-bit. On Mon, 31 Mar 2008, Robert J Crisler wrote: > > I notice that HTML5's video section is incomplete and lacking. > > The text under 3.12.7.1 could have been written ten years ago: > > "It would be helpful for interoperability if all browsers could support > the same codecs. However, there are no known codecs that satisfy all the > current players: we need a codec that is known to not require per-unit > or per-distributor licensing, that is compatible with the open source > development model, that is of sufficient quality as to be usable, and > that is not an additional submarine patent risk for large companies. > This is an ongoing issue and this section will be updated once more > information is available." 
> > The time has come for the W3C to swallow a bit of pride and cede this > control, this area, to the Motion Picture Experts Group. While MPEG does > not produce a codec that is free of any licensing constraints, the > organization has produced a codec, actually several, that are world > standards. You may have a digital cable or satellite service (that's > MPEG-2 or MPEG-4). You may have a DVD player (MPEG-2), or a Blu-Ray > player (MPEG-4). You may have an iPod (MPEG-4). And you may have heard > of MP3. > > The time has come for the W3C, despite misgivings, to support an ISO/IEC > organization that is charged with the development of video and audio > encoding standards. We can't have a separate set of standards for web > distribution. It simply complicates workflows and stunts any potential > transition to the web as the dominant distribution mechanism for such > media. > > Whatever the misgivings, it's time to say that the ISO/IEC standards are > preferable to proprietary codecs (Windows Media, Flash), and that MPEG-4 > AVC is recommended over other codecs for video. It would be really great > if an intrepid group of smart people were to come up with something > technically superior to MPEG-4, make it a world standard for encoding > audio and video, and make it available without any patent or royalty > constraints. That has not happened, despite some strong efforts > particularly from the OGG people, and it's time to acknowledge that fact > and stop holding out. > > Again, the W3C should cede these issues to the ISO/IEC standards > organization set up for the purpose of defining world standards in video > and audio compression and decompression. Unfortunately, the organisation to which you refer does not create a standard that can be implemented in and distributed by free and open source software projects, so it doesn't really solve the problem at all. 
On Tue, 1 Apr 2008, David Gerard wrote: > > The actual solution is a large amount of compelling content in Theora or > similar. Wikimedia is working on this, though we're presently hampered > by a severe lack of money for infrastructure and are unlikely to have > enough in time for FF3/Webkit/HTML5. Having significant content using Theora would definitely be one way to address this logjam, in that it would encourage hardware manufacturers to support Theora, and would encourage companies like Apple to support Theora despite the increased patent exposure. On Fri, 4 Apr 2008, Robert J Crisler wrote: > > The W3C, by offering no actionable advice on standards support in this > area, is implying by omission that any of the existing formats is just > as good for interoperability as any other. I think in general principle > that it would be better to "bless" (great word, and that's just it) > MPEG-4 AVC for the present, despite its legal encumbrances, and to > continue to press for a technically-excellent format that does not have > those encumbrances. At this point, if a video publisher wants his video to work with existing <video> implementations, Theora and H.264 are the two codecs that are the most effective in each implementation. I don't think we need to provide much guidance at this point. People aren't going to use, say, RealPlayer's format, because it's not supported and so wouldn't work. > The W3C is not only about web standards. It's also the road map. Right > now, that road map, where video is concerned, says the following: "User > agents may support any video and audio codecs and container formats." It > might as well say "Here be dragons." I think it's time, at the very > least, to say goodbye to single-company proprietary dreck. To say both > that existing international standards are OK for now, but the ideal as > currently expressed in the boxed copy under 3.12.7.1 is still not met. Why is this the case for video but not images? 
We don't require a particular image format for <img> either, but people know you can just use PNG and JPEG.

On Fri, 29 May 2009 jjcogliati-whatwg at yahoo.com wrote:
>
> I propose that a MPEG-1 subset should be considered as the required
> codec for the HTML-5 video tag.
>
> == MPEG-1 Background ==
>
> MPEG-1 was published as the ISO standard ISO 11172 in August 1993. It
> is a widely used standard for audio and video compression. Both
> Windows Media and Apple Quicktime support playing MPEG-1 videos using
> Audio Layer 2. MPEG-1 provides three different audio layers. The
> simplest is Audio Layer 1 and the most complicated is Audio Layer 3,
> usually known as MP3. Since MPEG-1 includes MP3, a full implementation
> of a MPEG-1 decoder would not be royalty free until either all the
> essential MP3 patents expire, or a royalty free license is granted for
> all the essential MP3 patents.
>
> == MPEG-1 PRF ==
>
> I propose the following subset of MPEG-1 as the MPEG-1 Potentially
> royalty free subset (MPEG-1 PRF):
>
> MPEG-1 Video without:
>    forward and backward prediction frames (B-frames)
>    dc-pictures (D-frames)
>
> MPEG-1 Audio Layers 1 and 2 only (no Layer 3 audio)
>
> This subset eliminates the currently patented MP3 portion of the
> MPEG-1 Audio. It also eliminates the non-needed B-frames and D-frames
> because there is less prior art for them and this has the side effect
> of simplifying MPEG-1 PRF decoding.
>
> == Patents ==
>
> To the best of my knowledge, there are no essential patents on this
> MPEG-1 PRF subset. I have discussed this on a kuro5hin article, a
> post on the gstreamer mailing list and the MPEG-1 discussion page at
> Wikipedia, and no-one has been able to definitively list any patents on
> this subset.
> > http://www.kuro5hin.org/story/2008/7/18/232618/312 > http://sourceforge.net/mailarchive/message.php?msg_id=257198.16969.qm%40web62405.mail.re1.yahoo.com > http://en.wikipedia.org/wiki/Talk:MPEG-1#Can_MPEG-1_be_used_without_Licensing_Fees.3F > > That said, "absence of evidence is not evidence of absence". There > still may certainly be patents on MPEG-1 PRF. Next I will discuss > some prior art that exists for this subset. > > == Prior Art for MPEG-1 PRF == > > The H.261 (12/90) specification contains most of the elements that > appear in MPEG-1 video with the exception of the B-Frames and > D-frames. H.261 however only allows 352 x 288 and 176 x 144 sized > video. H.261 is generally considered to be royalty free (such as by > the OMS video project). There are no unexpired US patents listed for it on > the ITU patent database. > > http://www.itu.int/rec/T-REC-H.261 > http://www.itu.int/ipr/IPRSearch.aspx?iprtype=PS > http://blogs.sun.com/openmediacommons/entry/oms_video_a_project_of > > As for MPEG-1 Audio Layer 2, it is very close to MASCAM, which was > described in "Low bit-rate coding of high-quality audio signals. An > introduction to the MASCAM system" by G. Thiele, G. Stoll and M. Link, > published in EBU Technical Review, no. 230, pp. 158-181, August 1988 > > The Pseudo-QMF filter bank used by Layer 2 is similar to that > described in H. J. Nussbaumer. "Pseudo-QMF Filter Bank", IBM technical > disclosure bulletin., Vol 24. pp 3081-3087, November 1981. > > The MPEG-1 committee draft was publicly available as ISO CD 11172 by > December 6, 1991. There is only a few year window for patents to have > been filed before this counts as prior art, and not have expired. > > This list of prior art is by no means complete, in that there > certainly could be patents that are essential for a MPEG-1 PRF > implementation, but can not be invalided by this list of prior art. 
> > In the US, patents filed before 1995 last the longer of 20 years after > they are filed or 17 years after they are granted. They also have to > be filed within a year of the first publication of the method. This > means that for US patents, most (that is all that took less than three > years to be granted) patents that could apply to MPEG-1 will be > expired by December 2012 (21 years after the committee draft was > published.). > > > == Brief comparison to other video codecs == > > Motion JPEG with PCM audio is the only codec that I know of that can > be played in a stock Windows, Linux and Mac OS X setup. On the other > hand, since it is basically a series of JPEG images and a 'WAV' file, > the compression is much poorer than MPEG-1 PRF. > > Ogg Theora and Ogg Vorbis are newer standards than MPEG-1. My guess > is that they can do substantially better at compression than MPEG-1. > Assuming there are no submarine patents, I think the OGG codecs would > be a better choice than MPEG-1. If you think that MPEG-1 PRF is not > royalty free, but Ogg Theora and Ogg Vorbis are, you may find that > comparing Theora to H.261 or Theora and Vorbis to MPEG-1 PRF is an > enlightening exercise. Much of what is in MPEG-1 PRF is also in Ogg > Theora and Ogg Vorbis. > > MPEG-2 is the next MPEG standard. It mainly adds error correction and > interlacing. Neither of these features is particularly important for > streaming video to computer monitors using a reliable data transport. > MPEG-2 definitely is patented, and will be until at least the 2018 > time-frame. I don't think that this buys much over MPEG-1 PRF, and it > definitely adds more patent issues. > > MPEG-4, H.264 have better codecs than MPEG-1, but these have a long > time till the patents expire, so are unsuitable for use royalty free. > > == Remaining Work == > > I am not a lawyer. In order to use MPEG-1 PRF, patent lawyers will > have to investigate the patent issue and publicly report on the > patent status. 
> Unless there is a report sitting around that can be
> published, this will likely be expensive.
>
> As well, the prior art review is not complete. The biggest missing
> piece is synthesis window for the audio layer.
>
> It would be useful if there is any large company that uses MPEG-1 who
> does not have a MPEG-2 or MPEG-4 license. One possible example of
> this might be a software only video CD player.
>
> I created a wikia page to put up information on MPEG-1 status:
> http://scratchpad.wikia.com/wiki/MPEG_patent_status
>
> == Satisfaction of requirements ==
>
> From 4.8.7.1 HTML 5 draft:
> 1. does not require per-unit or per-distributor licensing
> Probably. There does not seem to be anyone requesting this kind of
> licensing right now.
>
> 2. Must be compatible with the open source development model.
> Probably. There does not seem to be any identified patents for MPEG-1 PRF.
>
> 3. Is of sufficient quality as to be usable
> Yes. Much better than the next best option of Motion JPEG. Probably
> worse than Ogg Theora or H.264.
>
> 4. Is not an additional submarine patent risk for large companies.
> Probably. It has been widely implemented (in DVD players, in Apple
> Quicktime and Microsoft Media Player) Note that these example uses
> have either a license for MPEG-2 or MPEG-4 however.
>
> == Conclusion ==
>
> The MPEG-1 PRF subset defined here seems to fit all the requirements
> of a codec for video for HTML5. It seems to be patent free. A final
> conclusion will depend on whether or not patent lawyers can sign off
> on this proposal and if the quality of MPEG-1 PRF is deemed
> sufficient.
>
> == Disclaimers ==
>
> I am not a lawyer. These are my own views. I probably made
> mistakes. Please correct me where I am wrong.

Thank you for this detailed proposal. The only point I would make, and probably the reason why this proposal hasn't been implemented by browser vendors, is in the following:

> 3. Is of sufficient quality as to be usable
> Yes.
> Much better than the next best option of Motion JPEG. Probably
> worse than Ogg Theora or H.264.

MPEG-1 is nowhere near good enough at this point to be a serious contender. There have been suggestions that even Theora isn't good enough yet (for example, YouTube won't use Theora with the current state of encoders), and it _far_ outperforms MPEG-1.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
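
As a check on the patent-term arithmetic in the quoted MPEG-1 PRF proposal, here is a minimal sketch of the rule exactly as that message states it: a US patent filed before 1995 lasts the longer of 20 years from filing or 17 years from grant, and must be filed within a year of first publication. It ignores term extensions, continuations, and every other legal subtlety, so it illustrates the arithmetic only and is not legal guidance. The helper names are hypothetical; the only input taken from the thread is the committee draft date of December 6, 1991.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def us_pre_1995_expiry(filed: date, granted: date) -> date:
    """Rule as stated in the quoted proposal: the longer of
    20 years from filing or 17 years from grant."""
    return max(add_years(filed, 20), add_years(granted, 17))


# Date given in the thread for the public MPEG-1 committee draft.
committee_draft = date(1991, 12, 6)

# Latest filing date under the one-year-after-publication assumption.
latest_filing = add_years(committee_draft, 1)

# A patent granted within three years of filing expires by December 2012.
granted_quickly = add_years(latest_filing, 3)
print(us_pre_1995_expiry(latest_filing, granted_quickly))  # 2012-12-06

# A patent that took six years to be granted would run past that date.
granted_slowly = add_years(latest_filing, 6)
print(us_pre_1995_expiry(latest_filing, granted_slowly))  # 2015-12-06
```

The second case is the exception the proposal flags in its parenthetical: only patents that took less than three years to be granted are covered by the December 2012 estimate.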