When Apple discussed the new features of the forthcoming iPhone OS 3.0, SVP of iPhone Software Engineering Scott Forstall said that the iPhone would be capable of streaming video and audio directly over HTTP. Apple also advertised HTTP streaming as a feature of QuickTime X, the update of its media architecture coming in Snow Leopard. What it failed to explain, at least publicly, is how this streaming would be accomplished. Fortunately, Apple submitted its proposed protocol last month to the Internet Engineering Task Force (IETF) in the hopes that it will become a ubiquitous standard.

Apple identified what it considers a few issues with standard streaming, which generally uses the Real Time Streaming Protocol (RTSP) originally developed by Netscape and Real in the late '90s. The biggest issue with RTSP is that the protocol or its necessary ports may be blocked by routers or firewall settings, preventing a device from accessing the stream. As the standard protocol for the Web, though, HTTP is generally accessible. Furthermore, no special server is required other than a standard HTTP server, which is more widely supported in content distribution networks, and more expertise in optimizing HTTP delivery is generally available than for RTSP.

Enter HTTP Live Streaming. The basic mechanics involve using software on the server to break an MPEG-2 transport stream into small chunks saved as separate files, and an extension to the .m3u playlist specification (.m3u8) to tell the client where to get the files that make up the complete stream. The media player client merely downloads and plays the small chunks in the order specified in the playlist, and in the case of a live stream, periodically refreshes the playlist to see if there have been any new chunks added to the stream.
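To make the mechanics concrete, here is a minimal Python sketch of what a client might do with such a playlist. The sample playlist and its example.com URLs are invented for illustration, and the tag names (#EXTM3U, #EXTINF, #EXT-X-TARGETDURATION, #EXT-X-MEDIA-SEQUENCE, #EXT-X-ENDLIST) are written as we understand them from Apple's Internet-Draft, so treat the details as assumptions rather than a reference implementation.

```python
# Hypothetical playlist for illustration only; the URLs are made up.
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
http://media.example.com/segment0.ts
#EXTINF:10,
http://media.example.com/segment1.ts
#EXTINF:10,
http://media.example.com/segment2.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text):
    """Return (segments, ended); segments is a list of (duration, uri) pairs."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an extended M3U playlist")

    segments = []
    ended = False
    pending_duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<title>" describes the URI line that follows.
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line == "#EXT-X-ENDLIST":
            ended = True
        elif not line.startswith("#"):
            # Any non-tag line is the URI of the next chunk to play.
            segments.append((pending_duration, line))
            pending_duration = None
    return segments, ended


segments, ended = parse_media_playlist(SAMPLE_PLAYLIST)
for duration, uri in segments:
    print(duration, uri)
```

A real player would then fetch each URI in order over plain HTTP and hand the chunks to its decoder; that part is omitted here.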

Unlike real-time streaming, this approach necessarily carries a minimum latency equal to the duration of the slices the server cuts the stream into (Apple refers to 10 seconds as an example). As the server encodes the video and slices it into 10-second clips, for instance, it creates or updates a playlist for the stream with the URL of the next clip. The client begins by downloading one or more of the clips and playing them in order. As one clip plays, the client downloads the next clip listed, and it continues this way until it reaches a tag in the playlist that signals the end of the stream.
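The refresh-and-play cycle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Apple's client logic: fetch_playlist and play_segment are hypothetical callbacks standing in for the HTTP download and the decoder, the 10-second poll interval simply mirrors Apple's example, and the tag names are written as we understand them from the Internet-Draft.

```python
import time


def parse_live_playlist(text):
    """Return (first_sequence_number, [uri, ...], ended) for a media playlist."""
    first_seq, uris, ended = 0, [], False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            # Sequence number of the first clip listed in this playlist.
            first_seq = int(line.split(":", 1)[1])
        elif line == "#EXT-X-ENDLIST":
            ended = True
        elif line and not line.startswith("#"):
            uris.append(line)
    return first_seq, uris, ended


def follow_stream(fetch_playlist, play_segment, poll_seconds=10, sleep=time.sleep):
    """Play each clip once, re-fetching the playlist until it is marked ended."""
    next_seq = None
    while True:
        first_seq, uris, ended = parse_live_playlist(fetch_playlist())
        if next_seq is None:
            next_seq = first_seq
        for seq, uri in enumerate(uris, start=first_seq):
            if seq >= next_seq:  # skip clips already played
                play_segment(uri)
                next_seq = seq + 1
        if ended:
            return
        sleep(poll_seconds)


# Simulated live stream: two successive versions of the same playlist.
snapshots = iter([
    "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXT-X-MEDIA-SEQUENCE:0\n"
    "#EXTINF:10,\nhttp://media.example.com/clip0.ts\n"
    "#EXTINF:10,\nhttp://media.example.com/clip1.ts\n",
    "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXT-X-MEDIA-SEQUENCE:1\n"
    "#EXTINF:10,\nhttp://media.example.com/clip1.ts\n"
    "#EXTINF:10,\nhttp://media.example.com/clip2.ts\n#EXT-X-ENDLIST\n",
])
played = []
follow_stream(lambda: next(snapshots), played.append, sleep=lambda seconds: None)
print(played)
```

Because a clip only appears in the playlist once the server has finished writing it, the viewer is always at least one clip length behind the live source, which is where the minimum latency comes from.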

The protocol offers a way to specify alternate streams by pointing to separate playlists for each alternate stream. These generally would be of different quality and bandwidth requirements, so the client can request an appropriate stream for whatever network conditions allow. The client can also change to any of the alternate streams as needed, "such as when a mobile device enters or leaves a WiFi hotspot," according to Apple. So if your iPhone moves out of WiFi range and switches to 3G, the QuickTime player could request a lower bandwidth stream and begin downloading the alternate, smaller chunks instead.
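As a rough sketch of how a client might pick among alternate streams, the Python below parses a hypothetical top-level playlist and chooses the highest-bandwidth stream that fits a measured connection speed. The playlist contents and URLs are made up, the #EXT-X-STREAM-INF tag and its BANDWIDTH attribute are written as we understand them from the Internet-Draft, and the selection policy is our own assumption; the draft leaves that decision to the client.

```python
import re

# Hypothetical top-level playlist; bandwidth figures and URLs are invented.
VARIANT_PLAYLIST = """\
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=200000
http://media.example.com/low/playlist.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000
http://media.example.com/mid/playlist.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2000000
http://media.example.com/high/playlist.m3u8
"""


def parse_variants(text):
    """Return a sorted list of (bandwidth_bits_per_second, playlist_uri)."""
    variants = []
    pending_bandwidth = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            match = re.search(r"BANDWIDTH=(\d+)", line)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return sorted(variants)


def choose_variant(variants, measured_bps):
    """Pick the best stream that fits; fall back to the smallest one."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    return max(affordable) if affordable else min(variants)


variants = parse_variants(VARIANT_PLAYLIST)
print(choose_variant(variants, 5_000_000))  # e.g. on WiFi
print(choose_variant(variants, 300_000))    # e.g. after dropping to 3G
```

Switching streams then amounts to fetching the chosen playlist and continuing with its clips instead.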

Further, the protocol allows for the individual media clips to be encrypted so that broadcasters can limit access to paid subscribers, for instance. In this case, key files for decoding the encrypted clips are referenced in the playlist, and the client uses the key files to decrypt each one before playing. There is also a flag that broadcasters can set to disallow caching of individual media files as they are downloaded.
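The Python sketch below shows one way a client might work out which key file applies to which clip and whether caching is allowed. The playlist and its URLs are hypothetical, and the #EXT-X-KEY and #EXT-X-ALLOW-CACHE tags, including the AES-128 method name, are written as we understand them from the Internet-Draft, so they should be read as assumptions. Actual decryption is left out, since it would need a cryptography library beyond Python's standard library.

```python
import re

# Hypothetical encrypted playlist; key and media URLs are invented.
ENCRYPTED_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-ALLOW-CACHE:NO
#EXT-X-KEY:METHOD=AES-128,URI="https://keys.example.com/key1"
#EXTINF:10,
http://media.example.com/segment0.ts
#EXTINF:10,
http://media.example.com/segment1.ts
#EXT-X-KEY:METHOD=AES-128,URI="https://keys.example.com/key2"
#EXTINF:10,
http://media.example.com/segment2.ts
#EXT-X-ENDLIST
"""


def keys_for_segments(text):
    """Return (allow_cache, [(segment_uri, key_uri_or_None), ...])."""
    allow_cache = True   # assumed default when the tag is absent
    current_key = None   # None means the clip is not encrypted
    result = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-ALLOW-CACHE:"):
            allow_cache = line.split(":", 1)[1].upper() == "YES"
        elif line.startswith("#EXT-X-KEY:"):
            # A key tag applies to every clip after it until the next key tag.
            attrs = line.split(":", 1)[1]
            method = re.search(r"METHOD=([^,]+)", attrs)
            uri = re.search(r'URI="([^"]*)"', attrs)
            encrypted = method is not None and method.group(1) != "NONE"
            current_key = uri.group(1) if encrypted and uri else None
        elif line and not line.startswith("#"):
            result.append((line, current_key))
    return allow_cache, result


allow_cache, segment_keys = keys_for_segments(ENCRYPTED_PLAYLIST)
print("caching allowed:", allow_cache)
for segment_uri, key_uri in segment_keys:
    print(segment_uri, "->", key_uri)
```

A broadcaster limiting access to subscribers would presumably protect the key URLs themselves, for instance by serving them over HTTPS behind a login, while the encrypted clips can sit on any ordinary HTTP server.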

The only requirement is that the media must be formatted as an MPEG-2 transport stream, program stream, or audio elementary stream. Apple's current implementation uses (unsurprisingly) H.264 video with AAC audio, though audio-only streams can use AAC, MP3, or the MPEG-2 elementary stream. The version of QuickTime in iPhone OS 3.0 is compatible with these formats, as is the version of QuickTime that will ship with Snow Leopard. Apple also has a beta version of a stream segmenter, currently available only to Apple Developer Connection members, that slices a stream into individual files, creates .m3u8 playlist files, and handles encryption and key generation.

Since the entire method works with standard HTTP transport and essentially any off-the-shelf software or hardware encoder can make the necessary MPEG-2 stream, it opens up streaming to nearly anyone. What Apple doesn't say explicitly is that its protocol can negate the need for proprietary solutions such as Adobe's Flash or Microsoft's Silverlight to deliver remotely hosted or encrypted content. And since neither of those is likely to appear on the iPhone anytime soon, it's one of the few ways to stream live video reliably to Apple's mobile platform.

Currently the standard is an Internet-Draft, and we've yet to see any evidence that others are ready to jump on Apple's bandwagon. But given the issues Apple has identified with RTSP, the proprietary nature of other common solutions, and the inclusion of support for encrypted streams, it has potential to appeal to a wide swath of content providers.

Further Reading:

iPhone Reference Library: HTTP Live Streaming Overview

The beta of Apple's Stream Segmenter software, with documentation, can be downloaded from ADC if you have a valid ADC membership.
