This is the second post in a series about the progress and achievements of our video delivery platform. It will focus on detailing the problems we solved on our live video streaming platform. Our first post was Part One: Our On-Demand Video Platform.

When we first started broadcasting live streaming events at The New York Times, Flash was still a thing. We used proprietary protocols and components from signal reception to delivery. At the end of 2015, we decided to remove Flash components from our video player and switch to HTTP Live Streaming (HLS) as our main protocol.

During the 2016 presidential election cycle, the newsroom expressed interest in doing more live events, including the coverage of live debates on the homepage and live blogs. With a mindset of making live events easier and more affordable for the company in the long run, the video technology department decided to invest more in the infrastructure and bring the signal reception and streaming packaging in-house.

We upgraded our on-premises video recording and streaming appliances to a multichannel GPU-accelerated server. With this physical all-in-one solution in place we had more flexibility to set up and broadcast live events, including streaming and serving our content to partners such as YouTube and Facebook.

Illustration by Jason Fujikuni

Our Live Infrastructure: How We Receive and Deliver our Streams

Most of our live content is produced by third parties on location and sent to us at broadcast quality. For example, if we want to stream a press conference from the White House, we make use of our subscription to the network television pool feed. Depending on the event, the feed may be a single camera angle or a more polished “line cut” switching between multiple angles.

The feed is delivered via a point of presence local to the event, which we can route to New York City. This can be expensive, since it is a dedicated circuit transmitting uncompressed HD-SDI signals, but it gives us the highest quality and most flexibility. This way, we avoid the delay and image degradation from extra compression/decompression passes and the potential IP packet loss of transmission over the public internet. The signal is then delivered to our main building via one of our dedicated video fiber lines.

Once in our building, the signal is sent to our live encoding server. Our current solution can simultaneously encode up to eight HD-SDI feeds and distribute to any format needed.

For the live streaming events that we broadcast on nytimes.com, we use our live encoding server to generate six different HTTP Live Streaming (HLS) outputs from the incoming feed. The outputs are composed of three-second H.264/MPEG-TS segments in a range of bitrates and resolutions, which allows our video player to adapt and select the best output based on a user’s current connection and device capabilities. We also configure the manifest (M3U8) file of each output in appending mode, where the manifest aggregates all video segments from the beginning to the end of the transmission. This is preferable to rolling mode, since it enables an extremely fast switch of the video asset from live to video on-demand (VoD) once the event is over.
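To illustrate, an appending-mode media playlist (signaled in HLS by `EXT-X-PLAYLIST-TYPE:EVENT`) only ever grows: segments are added at the end and never removed. The segment names and durations below are hypothetical:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-PLAYLIST-TYPE:EVENT
#EXT-X-TARGETDURATION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:3.000,
segment00000.ts
#EXTINF:3.000,
segment00001.ts
#EXTINF:3.000,
segment00002.ts
# ...new segments are appended here as they are encoded...
```

When the broadcast ends, appending a final `#EXT-X-ENDLIST` tag marks the playlist complete, and the very same asset becomes a VoD playback without any repackaging.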

From the same input used on the live streaming events, we also generate an Apple ProRes QuickTime output on our shared Storage Area Network (SAN), which our video editors use for chase editing during the live event or for cuts after.

The Problems We Faced Delivering and Managing Live Events

As shown in the picture below, after our Live Encoding Server generated each MPEG-TS segment, it was sent individually over the open internet to our CDN. Each segment was pushed using HTTP, and our NetStorage/CDN was responsible for hosting, caching and serving the assets.

Live Streaming Stack used for the 2016 Election

The transmissions of the election debates and related events went well, but the number of requests to the CDN proved to be a problem: some players were getting stuck buffering because the M3U8 manifests were taking too long to be updated with new segments. We investigated the cause, digging into the live encoding server logs, and noticed a pattern of errors when pushing segments over the open internet via HTTP PUT. Even after setting up automatic retries, the uploads remained unreliable: the live feed shared our office’s internet connection with operations and development, so it was competing for bandwidth with other network activities. We needed a more robust approach if we wanted to continue streaming live events.
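The automatic retries mentioned above amount to a small wrapper around the upload call. The sketch below is illustrative, not our production code; `upload_segment` stands in for whatever zero-argument callable actually performs the HTTP PUT:

```python
import random
import time


def put_with_retries(upload_segment, max_attempts=3, base_delay=0.5):
    """Call upload_segment(), retrying with exponential backoff on failure.

    upload_segment is any zero-argument callable that raises on a failed
    HTTP PUT. Returns True if some attempt succeeded, False otherwise.
    """
    for attempt in range(max_attempts):
        try:
            upload_segment()
            return True
        except OSError:
            if attempt == max_attempts - 1:
                return False
            # Exponential backoff with jitter so parallel uploads don't
            # retry in lockstep on the same congested link.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    return False
```

Retries like this mask transient errors, but as we found, they cannot compensate for a saturated shared uplink; past a point the retries themselves add load.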

Another major pain point was the need to have technical staff in-house during live events setting up the feeds and making sure we were live streaming to the right endpoint and saving the Apple ProRes QuickTime version on the right path. The process was mostly manual and very stressful. The truth was, we needed a simpler and more resilient way to create, start, and monitor live events. It had to be easy enough that someone with very little technical knowledge could intuitively manage our events.

How We Solved Our Delivery and Management Challenges

The delivery challenge we faced was due to the very nature of the HTTP protocol over the open internet. After discussing possible solutions with our networking team, we realized our best bet was to avoid the open internet altogether for delivery to our CDN. Instead, we decided to leverage our Direct Connect link with Amazon Web Services, solving our bandwidth and latency issues.

Direct Connect is essentially a dedicated network connection that allows for consistent performance between our building and Amazon Web Services. Unfortunately, we didn’t have it enabled for S3 (where we wanted to store our video segments), so we took the approach of proxying via EC2 (which was enabled for Direct Connect). Since the connection from our building to the EC2 proxy was dedicated, as was the path from EC2 to S3, we eliminated the packet loss we had seen pushing over the public internet. As a bonus, we open-sourced the service that we created and called it the s3-upload-proxy.
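In spirit, the proxy does little more than translate incoming HTTP PUTs into S3 uploads. The sketch below is a simplified illustration of that idea in Python, not the actual s3-upload-proxy (which is written in Go); the S3 client is injectable, and is assumed to expose a boto3-style `put_object` method:

```python
class S3PutProxy:
    """Minimal sketch of an upload proxy: accept a PUT, forward it to S3.

    s3_client is assumed to expose put_object(Bucket=..., Key=..., Body=...),
    boto3-style; in tests it can be any stub with that method.
    """

    def __init__(self, s3_client, bucket):
        self.s3_client = s3_client
        self.bucket = bucket

    def handle_put(self, path, body):
        # Map the request path (e.g. "/live/event1/segment00042.ts")
        # to an S3 object key, then upload the request body.
        key = path.lstrip("/")
        self.s3_client.put_object(Bucket=self.bucket, Key=key, Body=body)
        return 200
```

Running this behind an EC2 instance on Direct Connect means the encoder only ever talks to a host on a dedicated path, never to S3 over the public internet.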

After the segments were available on S3, we configured a caching layer in front of it to distribute the content around the world.

Final Live Streaming Stack integrated with Live Manager and Feeding Partners

Our Live Streaming Manager: A User-Friendly Tool to Manage Live Events

Our second challenge was the need for technical staff to be on-site during each of our live events. To avoid this, and to reduce the number of error-prone manual steps involved, we decided to implement a Live Streaming Manager: a web application responsible for talking to our live encoding server and coordinating with other components through REST API calls. Although the entire application hasn’t been open-sourced, we did make available our elemental-live-client, which allows for easy interactions with the encoding server we use.
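The coordination itself is ordinary REST plumbing. As a hedged illustration (the `/live-events` endpoint, payload fields, and service URL below are all hypothetical stand-ins, not the real encoder API), a manager might build a request to start an event like this:

```python
import json
import urllib.request


def build_start_event_request(base_url, event):
    """Build (but do not send) an HTTP request to create a live event.

    base_url and the /live-events endpoint are hypothetical stand-ins for
    whatever service actually coordinates the encoder.
    """
    return urllib.request.Request(
        url=f"{base_url}/live-events",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is a single `urllib.request.urlopen(req)` call; keeping request construction separate from transport makes the manager’s API interactions easy to test without a live encoder.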