webrtcH4cKS: ~ Does your video call have End-to-End Encryption? Probably not.

Time for another opinionated post. This time on… end-to-end encryption (e2ee). Zoom apparently claims to support e2ee but cannot deliver on that promise. Is WebRTC any better?

Zoom does not have End to End Encryption

Let’s get to the bottom of things fast: Boo Zoom!

I reviewed how Zoom implements its web client last year.

I’m not really surprised by their general lack of e2ee, given that their web client did not provide any encryption on top of TLS or WebRTC’s DataChannel. For reasons we will discuss below, this means they weren’t doing any obvious e2ee there.

Update (April 2nd): Zoom published a blog post saying they are using e2ee in the main use-case. Which sounds great, but how is that auditable, how are keys managed, and what prevents them from switching it off at any time?

Is WebRTC Any Better?

Now that we’re done with finger pointing, how does the situation look in WebRTC land?

WebRTC is encrypted. By default. You can’t turn it off. It’s clearly secure! Sadly, the situation is a bit more complex.

Encrypting Real Time Media

WebRTC uses DTLS-SRTP for encryption. In a nutshell, that means there is a (D)TLS handshake and the SRTP encryption keys are then derived from it. The handshake uses self-signed certificates whose fingerprints are signalled in the SDP. This is a fair bit better than no encryption or SDES.
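To make the "signalled in the SDP" part concrete, here is a minimal sketch that pulls the `a=fingerprint` attributes out of an SDP blob. The interface and function names are made up for illustration and are not part of any WebRTC API; the only thing assumed about the SDP is the usual `a=fingerprint:<hash-function> <hex bytes>` attribute shape.

```typescript
// Hypothetical helper types and names, for illustration only.
interface DtlsFingerprint {
  algorithm: string; // hash function, e.g. "sha-256"
  value: string;     // colon-separated hex bytes, normalised to upper case
}

// Collect every a=fingerprint attribute found in an SDP blob,
// whether it appears at session level or inside a media section.
function extractFingerprints(sdp: string): DtlsFingerprint[] {
  const result: DtlsFingerprint[] = [];
  for (const rawLine of sdp.split(/\r?\n/)) {
    const match = /^a=fingerprint:(\S+) (\S+)$/.exec(rawLine.trim());
    if (match) {
      result.push({
        algorithm: match[1].toLowerCase(),
        value: match[2].toUpperCase(),
      });
    }
  }
  return result;
}
```

Whoever can rewrite the SDP in transit can swap these values, which is why the fingerprint is only as trustworthy as the signaling channel that carried it.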

DTLS vs. SDES

The slides from the 2013 IETF meeting in Berlin discuss the topic of DTLS vs SDES in quite some detail and we also have a post on that decision if you want more history there.

There are two things to note here:

Attacking DTLS requires an active (man-in-the-middle) attack, not just passive eavesdropping. It is possible (using chrome://webrtc-internals or Firefox about:webrtc) to get hold of the remote DTLS fingerprint of a peer you’re connected to and compare it out of band, but that is quite hard for the average user. It is also possible to use end-to-end encryption for the signaling messages, which then establishes a binding between an identity and the fingerprint.

This even applies if your traffic is routed through a TURN server, which by design does not know the encryption keys negotiated via DTLS. The media is encrypted to the peer. Now in the multiparty case that peer is often an SFU. The same applies to Zoom. I looked at their native stuff a couple of years back and the payload of the UDP packets seemed pretty random, which suggests a similar level of encryption.

Selective Forwarding Units (SFU)

Now there is a thing about SFUs. This is the de facto architecture used to relay media in the cloud when you need to scale a video conference past a few users. In order to work, they need to do some fancy things with RTCP, the control protocol for media. Oscar Divorra described the details here and Gustavo and Sergio go into the details of layering here.

They also need access to a tiny bit of information about the frame, in particular whether it is a keyframe in order to make simulcast work. You can see some of this here.

This can be solved by a technique called “frame marking” which pulls that bit of information out into an unencrypted header extension. The same goes for server-side speaker detection when it comes to audio.
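As a rough sketch of what the SFU does with that bit of information, the snippet below decodes the first byte of a frame marking header extension and uses it to decide whether a simulcast switch is safe. The bit layout (S, E, I, D in the top four bits) follows my reading of the IETF frame marking draft and should be treated as an assumption; the type and function names, and the switching policy itself, are hypothetical.

```typescript
// Assumed layout of the first extension byte: S | E | I | D | (other bits ignored here).
interface FrameMarking {
  startOfFrame: boolean; // S bit: first packet of a frame
  endOfFrame: boolean;   // E bit: last packet of a frame
  independent: boolean;  // I bit: decodable without earlier frames (e.g. a keyframe)
  discardable: boolean;  // D bit: can be dropped without breaking later frames
}

// Decode the flag bits from the unencrypted header extension payload.
function parseFrameMarking(ext: Uint8Array): FrameMarking {
  if (ext.length < 1) {
    throw new RangeError("frame marking extension is empty");
  }
  const flags = ext[0];
  return {
    startOfFrame: (flags & 0x80) !== 0,
    endOfFrame: (flags & 0x40) !== 0,
    independent: (flags & 0x20) !== 0,
    discardable: (flags & 0x10) !== 0,
  };
}

// Hypothetical SFU policy: only switch a receiver to another simulcast
// layer on the first packet of an independently decodable frame.
function canSwitchLayerHere(marking: FrameMarking): boolean {
  return marking.startOfFrame && marking.independent;
}
```

The point is that none of this needs the media payload: the SFU makes its forwarding decision from a header byte alone.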

Note it is a different story for 1:1 calls or calls that employ a peer-to-peer mesh architecture. These do offer e2ee by default – noting the DTLS caveats above.

WebRTC Insertable Streams to the rescue

Unlike an MCU, an SFU does not need or want access to the unencrypted media. But it gets that access because there is no alternative yet. However, this is about to change with the Insertable Streams API that is being implemented by the Chrome WebRTC team right now.

It has been available in the native webrtc.org API for a while but Chrome bindings were missing. It is far from ready and needs considerably more testing. There were some pretty glaring bugs like not working in the other direction (fixed in less than 24 hours, which was much appreciated). The bar is rising here but there is still quite some work to do before it is ready.
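To illustrate the idea (not the actual browser API, which is still in flux), here is a sketch of the kind of per-frame transform an application could plug in: it scrambles the encoded payload while leaving the first few bytes in the clear so an SFU can still inspect them. The function name and the `clearBytes` parameter are my own inventions, and the XOR "cipher" is a toy that provides no real security; a real deployment would need an authenticated cipher and proper key management.

```typescript
// Toy transform: XOR is NOT encryption, it only shows which bytes get touched.
// frame      - the encoded frame as handed over by the encoder
// key        - shared secret known to the endpoints but not to the SFU
// clearBytes - number of leading bytes left readable (hypothetical parameter)
function xorFramePayload(
  frame: Uint8Array,
  key: Uint8Array,
  clearBytes: number
): Uint8Array {
  if (key.length === 0) {
    throw new RangeError("key must not be empty");
  }
  const out = new Uint8Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    out[i] =
      i < clearBytes
        ? frame[i] // leave the leading bytes untouched
        : frame[i] ^ key[(i - clearBytes) % key.length];
  }
  return out;
}
```

Because XOR is its own inverse, the receiver would run the same function with the same key and `clearBytes` to recover the frame. The SFU in the middle only ever sees the scrambled payload.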

So yes, Zoom does not have end-to-end encryption. Quite often, WebRTC doesn’t either – not yet at least. If you are using a WebRTC service, check its terms of service and privacy policy and make sure that you understand what they say about this. Hopefully we will see this change soon as WebRTC Insertable Streams matures.

Disclosure: I had a coffee with Eric Yuan, CEO of Zoom in early 2019 after he read (and hopefully enjoyed) the original post on how Zoom avoids WebRTC. He paid for the coffee and gave me nice swag even.

{“author”: “Philipp Hancke“}
