When I think of video streaming, I typically think of Netflix, Amazon or YouTube — the major providers of video on the web today. When I think of “video on the web,” I generally think about pages with a background video, or another fun immersive feature that makes the page engaging. Generally (as a web developer), I assume that these pages use the video tag, not streaming.

However, if the top companies delivering video content on the web are streaming their videos – and most other pages are not – is there something that can be learned here? Could streaming video be faster, more efficient, and have a better customer experience? I believe the answer is yes, but there is a little legwork that must be done to ensure that your video is streaming to its optimal potential.

Finally, setting up video streaming is fairly straightforward, and no longer requires any specialised hardware – any HTTP server can serve streaming content.

What is Adaptive Bitrate Streaming?

When delivering a video with the video tag…

<video src="myvideo_1080.mp4">

… there is just one file — no matter the size of the screen, or the quality of the network. In the example above, we can assume that the video is 1080p, which may be larger than required for mobile handsets. The additional pixels will make the file larger, and therefore take longer to arrive on slower networks.

How can we deliver smaller videos? Perhaps we could write some JavaScript to deliver “myvideo_1080.mp4” to devices with large screens, and “myvideo_720.mp4” just to mobile devices. This would certainly be a step forward, but what if the mobile device was on fast Wi-Fi and could have handled the larger video? Or what if the device with the larger screen was on a slower network (or the network conditions changed during playback)?

This is where adaptive bitrate shines. Each video is created with several different streams, at different resolutions and quality (read: size of the video). Then each of these streams is divided into segments (or chunks), each typically two to five seconds long.

Now the player on the device has a number of different options for playing video, depending on the device and the network. For advanced players, if the network changes, and it appears that the video can no longer download fast enough, a switch can be made to a more appropriate bitrate video — in the middle of playback!

Also, if the stream has separate audio and video tracks, the player can choose the appropriate audio stream, based on the browser language (eg German for German speakers).

With all of these features, a well-built streaming video will deliver precisely the right quality for the device and the available network speed.

How does ABS work?

We will use the HTTP Live Streaming (HLS) format to understand how streaming works. HLS is currently the most popular format for building video streams (another popular format is MPEG-DASH).

When a video is delivered to the player on the device, the first file to arrive is the manifest file. This file is the “menu” of streams available for playback. The manifest lists all available streams, audio channels and subtitles that the player can utilise in video playback. For simplicity, let’s start with a simple manifest with just video files:

#EXTM3U

#EXT-X-STREAM-INF:BANDWIDTH=1566000,CODECS="avc1.4D401F,mp4a.40.2",RESOLUTION=1280x720

https://res.cloudinary.com/dougsillars/video/upload/c_limit,w_1280,h_720,vc_h264:main:3.1,br_5500k/v1543082006/Campus_mfkz6z.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=972000,CODECS="avc1.4D401F,mp4a.40.2",RESOLUTION=960x540

https://res.cloudinary.com/dougsillars/video/upload/c_limit,w_960,h_540,vc_h264:main:3.1,br_3500k/v1543082006/Campus_mfkz6z.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=1670000,CODECS="avc1.42C01E,mp4a.40.2",RESOLUTION=640x360

https://res.cloudinary.com/dougsillars/video/upload/c_limit,w_640,h_360,vc_h264:baseline:3.0,br_2m/v1543082006/Campus_mfkz6z.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=1021000,CODECS="avc1.42C01E,mp4a.40.2",RESOLUTION=480x270

https://res.cloudinary.com/dougsillars/video/upload/c_limit,w_480,h_270,vc_h264:baseline:3.0,br_800k/v1543082006/Campus_mfkz6z.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=341000,CODECS="avc1.42C01E,mp4a.40.2",RESOLUTION=320x180

https://res.cloudinary.com/dougsillars/video/upload/c_limit,w_320,h_240,vc_h264:baseline:3.0,br_192k/v1543082006/Campus_mfkz6z.m3u8

In this example, there are five video streams available. Each stream takes two lines: the first lists the stream attributes, and the second is a link to a child manifest file that lists the segments. The first stream in this list is a 1.57 Mbps video with a resolution of 1280×720.
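To make the structure of the manifest concrete, here is a minimal sketch of how a player might read it. This is not how any particular player is implemented; the `Variant` class and `parse_master_manifest` function are names made up for this illustration, and the sketch only handles the `BANDWIDTH` and `RESOLUTION` attributes shown above.

```python
import re
from dataclasses import dataclass

# Matches KEY=value pairs where the value is either a quoted string
# (which may contain commas, as CODECS does) or an unquoted token.
ATTRIBUTE_PATTERN = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


@dataclass
class Variant:
    """One stream from the master manifest (hypothetical helper type)."""
    bandwidth: int    # bits per second, from BANDWIDTH
    resolution: str   # e.g. "1280x720"; empty if not declared
    uri: str          # location of the child manifest


def parse_master_manifest(text: str) -> list[Variant]:
    """Return one Variant per #EXT-X-STREAM-INF entry, in manifest order."""
    variants = []
    pending = None  # attributes of the most recent #EXT-X-STREAM-INF tag
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith("#EXT-X-STREAM-INF:"):
            attribute_text = line.split(":", 1)[1]
            pending = {key: value.strip('"')
                       for key, value in ATTRIBUTE_PATTERN.findall(attribute_text)}
        elif not line.startswith("#") and pending is not None:
            # The first non-tag line after the tag is the stream's URI.
            variants.append(Variant(int(pending["BANDWIDTH"]),
                                    pending.get("RESOLUTION", ""),
                                    line))
            pending = None
    return variants
```

Calling `parse_master_manifest` on the manifest above would give five `Variant` objects, the first with `bandwidth` 1566000 and `resolution` "1280x720".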

The first stream

The first stream in the HLS manifest is extremely important. That’s because the player has to choose a stream to start the video playback, and typically, the first stream in the list is chosen. The video startup time and initial quality are thus determined by the stream at the top of the list.

The next request by the player is for the child manifest of this bitrate. This manifest file lists information about each segment of the stream:

#EXTM3U

#EXT-X-VERSION:4

#EXT-X-TARGETDURATION:8

#EXT-X-MEDIA-SEQUENCE:0

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1632216@0

Campus_mfkz6z.ts

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1134956@1632216

Campus_mfkz6z.ts

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1594804@2767172

Campus_mfkz6z.ts

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1193048@4361976

Campus_mfkz6z.ts

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1208464@5555024

Campus_mfkz6z.ts

#EXTINF:8.341667,

#EXT-X-BYTERANGE:1008432@6763488

Campus_mfkz6z.ts

#EXTINF:0.834167,

#EXT-X-BYTERANGE:95504@7771920

Campus_mfkz6z.ts

#EXT-X-ENDLIST

Each segment in this manifest is represented by three rows of data. #EXTINF gives the length of the segment in seconds. #EXT-X-BYTERANGE gives the <number of bytes in segment>@<starting byte of segment>, and the third line is the URL of the segment to be downloaded. Here every segment points at the same .ts file, so the byte range tells the player which part of that file to request.
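As a rough illustration of those three rows, the sketch below reads a child manifest and works out the average bitrate of each segment from its byte count and duration. The `Segment` class and `parse_media_playlist` function are hypothetical names, and the sketch assumes every segment carries a byte range, as in the example above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One segment entry from a child manifest (hypothetical helper type)."""
    duration: float  # seconds, from #EXTINF
    length: int      # bytes, from #EXT-X-BYTERANGE
    offset: int      # starting byte within the file
    uri: str

    @property
    def bitrate(self) -> float:
        """Average bits per second needed to download in real time."""
        return self.length * 8 / self.duration


def parse_media_playlist(text: str) -> list[Segment]:
    """Collect segments that have both a duration and a byte range."""
    segments = []
    duration = length = offset = None
    next_offset = 0  # used when a byte range omits its @offset
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXTINF:"):
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line.startswith("#EXT-X-BYTERANGE:"):
            value = line[len("#EXT-X-BYTERANGE:"):]
            length_text, _, offset_text = value.partition("@")
            length = int(length_text)
            offset = int(offset_text) if offset_text else next_offset
        elif line and not line.startswith("#"):
            if duration is not None and length is not None:
                segments.append(Segment(duration, length, offset, line))
                next_offset = offset + length
            duration = length = offset = None
    return segments
```

For the first segment above, 1,632,216 bytes over 8.341667 seconds works out to roughly 1.57 Mbps, which lines up with the BANDWIDTH value advertised for the first stream in the master manifest.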

We can monitor this process with WebPageTest:

1. Webpage HTML
2. hls.js, a JavaScript streaming video player
3. Master M3U8 manifest file
4. 404 (I don’t have a favicon on this site)
5. Child M3U8 file
6. ts file: the initial video segment
7. ts file: the initial video segment

Ongoing playback

Once the stream has started, the player can track how quickly the chunks of video are downloaded — estimating the bandwidth of the connection. If the bitrate is higher than the bandwidth, the video will not download fast enough to keep pace with playback, so the player will switch to a lower quality stream that will allow uninterrupted playback.

Expanding the WebPageTest waterfall above: request 8 is for another child m3u8 (for a lower bitrate stream), and the subsequent ts files are downloaded faster, allowing for the video buffer to fill up faster. This indicates that the player decided that a lower bitrate would allow better playback of the video.

Double take on bitrate changes

The player will continue to adjust the downloaded segments to optimise the quality of the video, while ensuring that the video fits underneath the network bandwidth cap. In the waterfall below, there are three child m3u8 files requested:

This allows the player to adjust the video when network conditions vary, or become less than ideal.
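The switching logic described in this section can be sketched in a few lines. This is a simplified model rather than the algorithm of hls.js or any other player: real players smooth their bandwidth estimates over several segments and also consider buffer length. The function names and the 0.8 safety factor are assumptions chosen for illustration.

```python
def estimate_throughput(bytes_downloaded: int, seconds: float) -> float:
    """Bits per second achieved while downloading the last segment."""
    return bytes_downloaded * 8 / seconds


def choose_bitrate(available: list[int], throughput: float,
                   safety_factor: float = 0.8) -> int:
    """Pick the highest bitrate that fits inside a fraction of the throughput.

    The safety factor (an assumed value) leaves headroom so that a small
    dip in network speed does not immediately cause a stall.
    """
    budget = throughput * safety_factor
    affordable = [bitrate for bitrate in available if bitrate <= budget]
    # If nothing fits, fall back to the lowest stream on offer.
    return max(affordable) if affordable else min(available)


# The five BANDWIDTH values from the example master manifest.
ladder = [1_566_000, 972_000, 1_670_000, 1_021_000, 341_000]

# Suppose the 1,632,216-byte first segment took 10 seconds to arrive.
measured = estimate_throughput(1_632_216, 10.0)
next_bitrate = choose_bitrate(ladder, measured)
```

In this made-up scenario the measured throughput is about 1.3 Mbps, so the sketch would step down from the 1.57 Mbps stream to the 1.02 Mbps one for the next segment.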

Streaming solves a number of tough problems

With this setup, the player and the server communicate back and forth to ensure that the best quality video is displayed to the end user, no matter the device or network quality. Compared with a single static video file (perhaps with JavaScript guessing the ideal file based on initial parameters), we are already delivering a better product to our customers. But we can probably do better!

Optimising video streams

There are so many variables that you can adjust when it comes to video streaming — the bitrates of each stream, the order of the streams, the video encoding, the audio profiles, etc. In the context of this short article, we will stick to optimising the video streams for three of the top metrics used to identify good quality streams:

1. Startup time
2. Stalls
3. Video quality

By focusing on these three metrics you can make a lot of headway towards optimising your videos.

Video startup

According to Conviva (a popular video analytics company), in Q3 2018, 18 percent of mobile and 31 percent of desktop video streams were never consumed. They either failed to start, or the user exited before the video could begin playing.

Why are so many videos failing to play? It turns out that the average video start time was 3.47s on mobile and 6.29s on desktop. The longer a customer has to wait for a video, the more likely they are to abandon the playback.

Setting the initial bitrate

To prevent video playback abandonment, the video needs to start as quickly as possible. For the video to start playing, it must be on the device, which means the faster the first segment (or two) of video can be delivered to the player, the faster the video will start. This can be easily done by changing the first stream in the manifest to a lower bitrate stream. In the example video above, I set the initial stream in the manifest to 370 Kbps, 972 Kbps and 1.57 Mbps in turn to see if the video startup time changed significantly. I tested each of these streams with WebPageTest, using a Nexus 5 with LTE network speed:

Each frame represents 0.5s. The 370 Kbps video starts over one second faster than the 972 Kbps, which is 200 to 300 milliseconds faster than the 1.57 Mbps stream.

The effect is more pronounced when testing at 3G:

In this screenshot, each frame is five seconds apart, indicating that the 972 Kbps (roughly 1 Mbps) stream starts five seconds faster than the 1.57 Mbps stream, and the 370 Kbps stream starts 10 seconds faster. In all three tests, the player downloaded the first segment and then fell back to the lowest possible stream. The 370 Kbps test (top row) was already on the lowest stream and simply kept streaming, but the other two took longer to download their larger first segments, adding seconds to the video startup time.

Of course, this means that the initial few seconds of video stream may be at a lower quality, so it’s important to test how a change to a lower bitrate affects engagement.

In examining 3,000 video streams with the HTTP Archive in January 2019, 58 percent start with the lowest bandwidth stream to ensure that the video will start up as quickly as possible. Thirty percent of videos have an initial bitrate set above 1.5 Mbps. Since the HTTP Archive tests mobile devices with a throughput of 1.5 Mbps, these sites will have a longer initial delay as the player will have to fall back to a lower quality video before playback can begin.

This is an important cutoff point for video, as US mobile networks (using marketing terms like “Smooth Stream” and “Binge On”) throttle video at 1.5 Mbps. Videos that begin with a stream above 1.5 Mbps will not play until the first segment is downloaded, and then the player degrades the video quality, leading to a significant startup delay.
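A back-of-the-envelope calculation shows why the 1.5 Mbps cutoff matters. The sketch below is an idealised model, not a measurement: it assumes the first segment is about 8.34 seconds long (as in the child manifest shown earlier), that its size equals the stream bitrate multiplied by its duration, and that the network delivers a steady 1.5 Mbps. It ignores latency and the manifest requests, and the function name is made up for this example.

```python
def first_segment_download_seconds(bitrate_bps: float,
                                   segment_seconds: float,
                                   throughput_bps: float) -> float:
    """Idealised time to fetch one segment: its size in bits over link speed."""
    segment_bits = bitrate_bps * segment_seconds
    return segment_bits / throughput_bps


SEGMENT_SECONDS = 8.34     # assumed segment length, from the example manifest
THROTTLE_BPS = 1_500_000   # the 1.5 Mbps cap discussed above

# The three initial bitrates tested in this article.
for bitrate in (370_000, 972_000, 1_566_000):
    wait = first_segment_download_seconds(bitrate, SEGMENT_SECONDS, THROTTLE_BPS)
    print(f"{bitrate / 1000:.0f} Kbps start: about {wait:.1f} s to fetch the first segment")
```

Under these assumptions, a segment encoded above the link speed takes longer to download than it does to play, which is exactly the situation that forces the player to drop to a lower stream.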

Best practice: Start with an initial bitrate as low as possible to ensure fast video startup.
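If your manifests are generated or post-processed by a script, moving the lowest bitrate stream to the top can be automated. The sketch below is one possible approach, not part of any HLS tool: `lowest_bitrate_first` is a hypothetical helper, and it assumes each #EXT-X-STREAM-INF tag is immediately followed by its URL on a single line. Any other tags are moved to the top of the output.

```python
import re

# BANDWIDTH=, but not AVERAGE-BANDWIDTH=.
BANDWIDTH_PATTERN = re.compile(r"(?<![A-Z-])BANDWIDTH=(\d+)")


def lowest_bitrate_first(manifest: str) -> str:
    """Return the master manifest with its lowest-bandwidth stream listed first."""
    lines = [line.strip() for line in manifest.splitlines() if line.strip()]
    header, entries = [], []
    index = 0
    while index < len(lines):
        line = lines[index]
        if line.startswith("#EXT-X-STREAM-INF:") and index + 1 < len(lines):
            # Keep the tag and its URL together as one entry.
            entries.append((line, lines[index + 1]))
            index += 2
        else:
            header.append(line)
            index += 1
    if entries:
        lowest = min(
            entries,
            key=lambda entry: int(BANDWIDTH_PATTERN.search(entry[0]).group(1)),
        )
        # Move only the lowest stream; the others keep their original order.
        entries.remove(lowest)
        entries.insert(0, lowest)
    output = header + [part for entry in entries for part in entry]
    return "\n".join(output) + "\n"
```

As noted above, it is worth testing how the lower initial quality affects engagement before rolling out a change like this.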

M3U8 list of streams

For longer videos, there may be tens or hundreds of segments listed in the m3u8 file. This list is highly repetitive, making these files an ideal case for gzip compression. Below are two examples of m3u8 lists of segments.

The first is not zipped, so the 104 KB file uses 104 KB on the wire.

The second page has three m3u8 files with lists of segments, but the files are compressed. Rather than over 750 KB on the wire, less than 30 KB is used, so the files arrive faster and the player can request the first segment of video sooner.

Best practice: Always compress text files (for example, with gzip) for faster transfer over the network.
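To get a feel for how well these files compress, the sketch below builds a synthetic list of segments and gzips it. The segment sizes are made up, and `build_playlist` is a hypothetical helper, so the exact ratio will differ from real manifests. In production, compression is normally switched on at the web server or CDN rather than done by hand.

```python
import gzip


def build_playlist(segment_count: int) -> bytes:
    """Generate a synthetic byte-range playlist with made-up segment sizes."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:4",
        "#EXT-X-TARGETDURATION:8",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    offset = 0
    for index in range(segment_count):
        # Vary the size a little so the file is not trivially repetitive.
        length = 1_000_000 + (index * 7919) % 600_000
        lines += [
            "#EXTINF:8.341667,",
            f"#EXT-X-BYTERANGE:{length}@{offset}",
            "Campus_mfkz6z.ts",
        ]
        offset += length
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines).encode("utf-8")


playlist = build_playlist(500)
compressed = gzip.compress(playlist)
print(f"{len(playlist)} bytes raw, {len(compressed)} bytes gzipped")
```

Because the tag names and the file name repeat on every entry, only the changing numbers cost much space after compression.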

Video playback

Now that the video has started playing, the adaptive bitrate player is not done. The next biggest cause of video abandonment is stalling. A stall occurs when there is no more video present on the device to play, and the player must wait for further segments to arrive over the network. How much of playback is lost to stalls is reported as the rebuffering ratio. The Conviva report indicates that this occurs on 1.1 percent of mobile plays, and 0.9 percent of PC plays. As this number is reduced, engagement and length of viewing increase. Research has shown that starting with a lower bitrate allows more segments to be downloaded quickly, and decreases the percentage of stalls.

Some video players aggressively push for higher bitrates, which may lead to more changes in stream. The more often a player changes the stream, the more likely it is that a video will stall. A future post will evaluate various players for their aggressiveness, and how this relates to stalls.

In a sample of 3,000 streaming videos from the HTTP Archive, 2.6 percent had just one stream, which removes any possibility of tuning the video to the customer’s device or network.

Video quality

As the video player monitors the available video segments and the network speed, it can adjust the download to a lower or higher quality bitrate to ensure continued video playback. When bitrates are chosen correctly, this should ‘just work’, but there are a few tricks to make it work better.

Appropriate bitrates

As noted above, US carriers throttle video to 1.5 Mbps, so it’s important to have at least one stream with a bitrate below 1.3 Mbps, which essentially maximises the quality for all mobile customers in the USA. Examining the data in the HTTP Archive, 1.8 percent of the videos studied have no streams that will play below 1.25 Mbps, and are unlikely to stream on a US mobile phone, while 276 sites have just one stream available to mobile customers in the States, which prevents the video from adapting as needed.
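A check like this is easy to run against your own manifest. The sketch below is a hypothetical helper, not the method used for the HTTP Archive analysis; it takes a list of stream bitrates and reports how many fall under a cap, using the 1.25 Mbps threshold mentioned above as an assumed default.

```python
def mobile_readiness(bitrates: list[int], cap_bps: int = 1_250_000):
    """Return the streams below the cap and a short verdict about them."""
    playable = sorted(bitrate for bitrate in bitrates if bitrate < cap_bps)
    if not playable:
        verdict = "no stream fits under the cap"
    elif len(playable) == 1:
        verdict = "one stream fits, so the player cannot adapt"
    else:
        verdict = "several streams fit, so the player can adapt"
    return playable, verdict


# The five BANDWIDTH values from the example master manifest.
example_ladder = [1_566_000, 972_000, 1_670_000, 1_021_000, 341_000]
playable, verdict = mobile_readiness(example_ladder)
```

For the example manifest in this article, three of the five streams sit below the cap, so a throttled player still has room to move up and down.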

Even stream distribution

In the chart below, I have graphed the available stream bitrates, and highlighted the space between streams 4 and 5.

The change in bitrates for streams 1 to 4 is fairly linear, as are the changes in 5 to 8, meaning that adjustments for the player are straightforward. However, the change from 4 to 5 is a much larger adjustment than any of the others, which makes it hard for the player to make ‘the leap’ to the higher bitrates. Additionally, the jump goes from 1 Mbps to 3.5 Mbps, skipping a range of network speeds that is common on mobile. This forces all mobile users into the lower quality range, even if they have a 3 Mbps connection.
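Gaps like this can be spotted without a chart. The sketch below flags any step between neighbouring streams where the bitrate more than doubles. The 2.0 threshold is an arbitrary assumption, `large_steps` is a made-up helper name, and the ladder is invented to resemble the shape described above (only the 1 Mbps and 3.5 Mbps values come from the article).

```python
def large_steps(bitrates: list[int], max_ratio: float = 2.0) -> list[tuple[int, int]]:
    """Return neighbouring bitrate pairs whose ratio exceeds max_ratio."""
    ladder = sorted(bitrates)
    return [
        (lower, higher)
        for lower, higher in zip(ladder, ladder[1:])
        if higher / lower > max_ratio
    ]


# Hypothetical eight-stream ladder with a gap between streams 4 and 5.
hypothetical_ladder = [
    400_000, 600_000, 800_000, 1_000_000,
    3_500_000, 4_000_000, 4_500_000, 5_000_000,
]
gaps = large_steps(hypothetical_ladder)
```

Here the only flagged step is the one from 1 Mbps to 3.5 Mbps, which is where an intermediate stream would help the most.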

Best practice for quality: Have a stream of around 1.3 Mbps for mobile customers in the USA, and keep stream changes small or evenly distributed.

Conclusion

Mobile streaming is the “responsive” way to deliver video on the web. Rather than one static MP4 file for all customers, the video can be adapted for the screen size and also for the available network conditions. With proper tuning of the video parameters, the ideal video can be delivered to all consumers, improving video startup times, reducing stalls, and optimising the amount of data transferred. We’ve walked through some of the basics of how streaming works, and some best practices that further enhance the quality of streaming video.