Watching you watch: the tracking ecosystem of over-the-top TV streaming devices, Moghaddam et al., CCS’19

The results from this paper are all too predictable: channels on Over-The-Top (OTT) streaming devices are insecure and riddled with privacy leaks. The authors quantify the scale of the problem, and note that users have even fewer viable defence mechanisms than they do on web and mobile platforms. When you watch TV, the TV is watching you.

In this paper, we examine the advertising and tracking ecosystems of Over-The-Top ("OTT") streaming devices, which deliver Internet-based video content to traditional TVs/display devices. OTT devices refer to a family of services and devices that either directly connect to a TV (e.g., streaming sticks and boxes) or enable functionality within a TV (e.g. smart TVs) to facilitate the delivery of Internet-based video content.

The study focuses on Roku and Amazon Fire TV, which together account for between 59% and 65% of the global market. The top 1000 channels from each service are analysed using a custom-built crawling engine, and traffic is intercepted where possible using mitmproxy.

How they did it

For each service, a list of the top 1000 channels was compiled, as well as the top 100 channels across the most popular categories. Since there was no off-the-shelf crawling infrastructure for OTT devices, the authors then had to build their own.

The desktop machine acts as a WiFi access point and receives both the audio and video signals emitted from the device. This enables monitoring of the device outputs (audio, video, and network). The devices are controlled using their remote control functionality: web APIs for Roku, and adb for Amazon Fire TV.
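To make the remote-control step concrete, here is a minimal sketch of how a crawler might drive the two devices. It assumes Roku's External Control Protocol (HTTP on port 8060, with `keypress` and `launch` endpoints) and Android's `adb shell input keyevent` for Fire TV. The helper names, IP addresses, and channel ID are hypothetical, and this is not the authors' crawler.

```python
import urllib.request
from typing import List

ROKU_ECP_PORT = 8060  # Roku's External Control Protocol listens on this port


def roku_keypress_url(device_ip: str, key: str) -> str:
    """URL to POST to in order to simulate one remote-control key press."""
    return f"http://{device_ip}:{ROKU_ECP_PORT}/keypress/{key}"


def roku_launch_url(device_ip: str, channel_id: str) -> str:
    """URL to POST to in order to launch an installed channel by its ID."""
    return f"http://{device_ip}:{ROKU_ECP_PORT}/launch/{channel_id}"


def firetv_key_command(device_addr: str, keycode: str) -> List[str]:
    """adb argument list that sends an Android key event to one device."""
    return ["adb", "-s", device_addr, "shell", "input", "keyevent", keycode]


def send_roku_keypress(device_ip: str, key: str) -> int:
    """Send the key press to a real device and return the HTTP status code."""
    req = urllib.request.Request(
        roku_keypress_url(device_ip, key), data=b"", method="POST"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

For example, `firetv_key_command("192.168.1.60:5555", "KEYCODE_DPAD_CENTER")` yields an argument list that could be handed to `subprocess.run` to press the select button on a Fire TV reachable over the network.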

Where possible, the authors use mitmproxy to intercept traffic from the devices. Some channels use certificate pinning. On the Roku device, interception is disabled for those channels. On Amazon Fire TV, access to the device enabled the authors to bypass channel-level certificate pinning using a Frida toolkit script. Details are in appendix B of the paper.
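As an illustration of the interception side, mitmproxy addons are plain Python classes whose `request` hook is called once per intercepted request. The sketch below records the host, scheme, and URL of each request. The class name and record layout are hypothetical, and the authors' actual scripts are likely to differ.

```python
class RequestLogger:
    """Collects one record per intercepted HTTP(S) request."""

    def __init__(self):
        self.records = []

    def request(self, flow):
        # mitmproxy calls this hook with a flow object for every client request.
        req = flow.request
        self.records.append({
            "host": req.pretty_host,   # host as seen in the request
            "scheme": req.scheme,      # "http" means the request was cleartext
            "url": req.pretty_url,
        })


# mitmproxy picks up addon instances from this module-level list,
# e.g. when the file is loaded with `mitmdump -s <this file>`.
addons = [RequestLogger()]
```

Records with `scheme == "http"` are the cleartext requests, so a log like this is enough to count how many channels send at least one unencrypted request.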

The following table summarises the various crawls that were undertaken. The ‘MITM’ suffix indicates that mitmproxy was used. Both Roku TV and Amazon Fire TV have privacy options (‘Limit Ad Tracking’ and ‘Disable Interest-based Ads’ respectively), and a crawl was also done with these options enabled to see what difference they made.

What they found

Trackers are everywhere! On Roku TV, the most prevalent tracker is for Google’s doubleclick.net (975/1000 channels). On Amazon Fire TV it is amazon-adsystem.com (687/1000). Facebook is notably less present on TV than it is in mobile and web channels.

The amount of tracking varies by channel category, as the following plots reveal:

There are plenty of different device and user identifiers that can be used for tracking. The study checked for leaks of the following:

On Roku, we discovered that 4,452 of the 6,142 (nearly 73%) requests containing one of the two unique DIs (AD ID, Serial Number) are flagged as trackers. On Amazon, 3,427 of the 8,433 (41%) unique identifiers are sent in cleartext.
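Detecting a leak amounts to searching each request for a known identifier value. The following sketch assumes an identifier may appear verbatim, in a different case, URL-encoded, Base64-encoded, or as a SHA-1 or SHA-256 hash. That set of encodings is an assumption for illustration rather than the list used in the paper, and the identifier values in the usage note are made up.

```python
import base64
import hashlib
from urllib.parse import unquote


def identifier_variants(value: str) -> set:
    """Plausible on-the-wire forms of one identifier (assumed set of encodings)."""
    raw = value.encode("utf-8")
    return {
        value,
        value.lower(),
        value.upper(),
        base64.b64encode(raw).decode("ascii"),
        hashlib.sha1(raw).hexdigest(),
        hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(request_text: str, identifiers: dict) -> set:
    """Return the names of identifiers found in a request's URL, headers, or body."""
    haystack = unquote(request_text)  # undo percent-encoding before searching
    found = set()
    for name, value in identifiers.items():
        if any(variant in haystack for variant in identifier_variants(value)):
            found.add(name)
    return found
```

Calling `find_leaks(url, {"ad_id": "abc-123", "serial": "YH00AA000001"})` on each logged request returns the subset of identifier names that the request carries. Other hash functions can be added to `identifier_variants` in the same way.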

Nine of the top 100 channels on Roku, and 14 of the top 100 channels on Amazon Fire TV leak the title of each video watched to a tracking domain. The Roku channels leaked this information over unencrypted connections.

79% of Roku channels and 76% of Fire TV channels send at least one request in cleartext. That unsecured information is flowing to the following destinations:

What you can do about it

Turning on the in-device privacy options doesn’t really help much. When you ‘Limit Ad Tracking’ on Roku, the AD ID no longer leaks, but the number of contacted trackers stays the same, as does the number of Serial Number leaks. That is, channels are obeying the letter of the law (limit ad tracking means no AD ID) but not the spirit.

It’s a similar story on Fire TV:

Our data… reveals that even when the privacy option is enabled, there are a number of other identifiers that can be used to track users, bypassing the privacy protection built into these platforms.

Running with a Pi-hole helps, but still misses about 27% of AD ID leaks, and 45% of serial number leaks.
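A "missed" leak here is a leaking request whose destination the blocker does not block. A rough sketch of that calculation follows, assuming the blocklist is a set of domain names and that a host counts as blocked when it equals a listed domain or is a subdomain of one. This is a simplification, since Pi-hole's real matching depends on the lists and rules configured, and the function names and data are hypothetical.

```python
from urllib.parse import urlsplit


def is_blocked(host: str, blocklist: set) -> bool:
    """True if the host or any of its parent domains is on the blocklist."""
    labels = host.lower().rstrip(".").split(".")
    # For a.b.c, check a.b.c, then b.c, then c.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def missed_fraction(leak_urls: list, blocklist: set) -> float:
    """Share of leaking requests that would still get past the blocklist."""
    if not leak_urls:
        return 0.0
    missed = [
        url for url in leak_urls
        if not is_blocked(urlsplit(url).hostname or "", blocklist)
    ]
    return len(missed) / len(leak_urls)
```

Feeding it the URLs of all requests that leak a given identifier, together with the blocklist in use, gives the kind of per-identifier miss rate quoted above.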

The last word