Photo by Rachit Tank via Unsplash

At a glance

During his Congressional hearing, Mark Zuckerberg claimed that Facebook doesn’t have access to WhatsApp chats thanks to end-to-end encryption.

Nevertheless, communication channels between the WhatsApp and Facebook iOS apps could be abused to leak data from the entire chat history.

The aim of this article is not to claim that Facebook snoops on WhatsApp chats, but to show that both WhatsApp and Facebook use end-to-end encryption as a disingenuous and misleading argument to reassure the public.

Context

In August 2016, WhatsApp announced in a blog post that it would begin sharing limited amounts of data with its parent company Facebook. At the time, end-to-end encryption was put forward as a strong privacy safeguard:

“We’re also updating these documents to make clear that we’ve rolled out end-to-end encryption. When you and the people you message are using the latest version of WhatsApp, your messages are encrypted by default, which means you’re the only people who can read them. Even as we coordinate more with Facebook in the months ahead, your encrypted messages stay private and no one else can read them. Not WhatsApp, not Facebook, nor anyone else.”

The language is clear: there’s nothing to fear! End-to-end encryption prevents Facebook from snooping on your chats. And that’s exactly how media outlets understood it at the time. WhatsApp’s Legal page was updated simultaneously, and features very similar language:

“Your messages are yours, and we can’t read them. We’ve built privacy, end-to-end encryption, and other security features into WhatsApp. We don’t store your messages once they’ve been delivered. When they are end-to-end encrypted, we and third parties can’t read them.”

Worth noting: the previous paragraph is titled “We joined Facebook in 2014”.

Let’s jump to Mark Zuckerberg’s Congressional hearing (WSJ’s transcript) a couple of days ago; it’s clear that this rhetoric has not changed and is shared by both WhatsApp and Facebook:

SCHATZ: Let me — let me try a couple of specific examples. If I’m email — if I’m mailing — emailing within WhatsApp, does that ever inform your advertisers? ZUCKERBERG: No, we don’t see any of the content in WhatsApp, it’s fully encrypted.

Later, responding to Young:

ZUCKERBERG: (…) That’s how WhatsApp works too, so that’s an app. It’s a very lightweight app. It doesn’t require us to know a lot of information about you, so we can offer that with full encryption, and therefore, we’re not looking — we don’t see the content.

The emphasis on “therefore” is mine, to underscore how strongly a causal link is implied between encryption and Facebook’s inability to access your chats.

But it’s just not true. Facebook could potentially access your WhatsApp chats. In fact, it could easily access your entire chat history and every single attachment. I’m not saying it does, and I have no evidence suggesting that it ever has. But as Android users have recently been finding out that their call history and SMS data had been collected by Facebook, I believe it is important to examine the means by which Facebook is already in a position to collect our WhatsApp data, from any iPhone running iOS 8 and above.

Porous Sandboxing

In its first iterations, the iOS file system was strictly sandboxed: apps could only access files in their own container, greatly increasing security and privacy. But this pro-privacy choice of the Jobs era came with significant caveats: you couldn’t, for instance, record audio in one app and edit it in another. Or work on a Pages document and then upload it to an FTP server with a file manager app. Some clunky workarounds existed, but it became increasingly clear that strict sandboxing was getting in the way of productivity.

Adoption of iOS in more professional settings may also have been poor because of these restrictions.

With iOS 8, Apple introduced extensions: tiny apps embedded in their parent app that can perform specific tasks, such as sharing a document or pushing content to Apple Watch. An app and its extensions are allowed to share files placed in a special container, dubbed a shared container. In addition, App Groups were introduced: a developer could now register all of their apps in the same App Group and set up a shared container through which apps of that group can share assets and documents. Apple’s developer documentation on App Groups covers shared containers in more detail.

At some point after acquiring WhatsApp, Facebook registered it as part of the same App Group as the Facebook Messenger and Facebook apps. We’re not sure exactly when this happened, but most probably around August 2016, after the data-sharing announcement. More importantly, Facebook and WhatsApp now had a privileged way to share information across traditional sandboxing boundaries, via a shared container named group.com.facebook.family.
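To make the access rule concrete, here is a minimal toy model of it in Python. It is only a sketch of the logic described above, not iOS code: on a real device the operating system enforces these boundaries through entitlements. The `App` class, the `can_access` function and the `app:`/`group:` container labels are hypothetical names invented for this illustration; only the group identifier comes from what we observed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class App:
    """A toy stand-in for an installed app and its App Group memberships."""
    name: str
    app_groups: frozenset = field(default_factory=frozenset)


def can_access(app: App, container: str) -> bool:
    """Return True if `app` may read and write `container`.

    A container label is either "app:<app name>" (a private container)
    or "group:<group id>" (a shared container). Both label formats are
    assumptions made for this sketch.
    """
    kind, _, ident = container.partition(":")
    if kind == "app":
        # Strict sandboxing: only the owning app gets in.
        return ident == app.name
    if kind == "group":
        # Shared container: every app registered in the group gets in.
        return ident in app.app_groups
    return False


FAMILY = "group.com.facebook.family"  # group name reported in this article

whatsapp = App("WhatsApp", frozenset({FAMILY}))
messenger = App("Messenger", frozenset({FAMILY}))
other = App("SomeOtherApp")  # hypothetical app from an unrelated developer

print(can_access(messenger, "app:WhatsApp"))     # False
print(can_access(messenger, "group:" + FAMILY))  # True
print(can_access(whatsapp, "group:" + FAMILY))   # True
print(can_access(other, "group:" + FAMILY))      # False
```

The point of the model is the second and third lines of output: Messenger cannot open WhatsApp’s private container, but both apps can read and write the same shared one.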

We know this because in order for iMazing to back up and restore apps selectively, we had to understand which shared container belonged to which apps and package those containers too. When we figured out how to do that, we decided to expose those shared containers in iMazing’s backup file browser:

Facebook’s “family” shared container, accessible by Messenger

Facebook’s “family” shared container, accessible by WhatsApp

Aren’t WhatsApp chats encrypted anyway?

It’s complicated. Messages are encrypted when you send them, yes. But the database that stores your chats on your iPhone does not benefit from an extra layer of encryption. It is protected by standard iOS data protection, which decrypts files on the fly when needed. Here’s said database, extracted from my iPhone’s backup with iMazing:

ChatStorage.sqlite stores all messages and metadata displayed in WhatsApp

No extra encryption. Timestamps, text, from and to names, phone numbers, paths to attachments; it’s all there, enough to rebuild your entire chat history.
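Anyone can check this on their own data. The sketch below uses Python’s standard sqlite3 module to list the tables in a SQLite file and count their rows. It assumes you have already extracted a copy of the database from a backup onto your computer; the function name `summarize_sqlite` is made up for this example, and the code deliberately makes no assumption about WhatsApp’s table or column names.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(path: str) -> dict:
    """Map each table name in the SQLite file at `path` to its row count."""
    # Open the file read-only so the inspected copy is never modified.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            )
        ]
        summary = {}
        for table in tables:
            # Table names come from the file itself; quote them defensively.
            quoted = '"' + table.replace('"', '""') + '"'
            count = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            summary[table] = count
        return summary
    finally:
        conn.close()
```

Calling `summarize_sqlite("ChatStorage.sqlite")` on an extracted copy should succeed without any key or passphrase, which is exactly the point: once the file is out of the device, it is an ordinary SQLite database.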

And the kicker: it would take a good iOS developer just a few days to put in place code in both the Facebook and WhatsApp apps that could discreetly copy this database from one app to the other, via their shared container.

Again, I am not claiming that this is happening, nor that it ever happened. But the tools are there.

And when Mark Zuckerberg declares before the U.S. Congress that “It’s all encrypted”, he’s ignorant at best. I’d put my money on the good old misleading half-truth, perpetuated since WhatsApp’s 2016 change to its data sharing policies.

Hope?

Facebook is under such scrutiny that if it were to do something as extreme as collecting WhatsApp chats, it would be quickly caught by security researchers.

But would it really? Everyone missed the fact that they were collecting SMS data from Android users for over a year. And in order to research how iOS apps handle user data, one needs a jailbroken device — what happens if no jailbreak is available anymore? Do we just trust Facebook and stop looking?

[Edit: After consulting with our team, I decided to remove a section about Crossy Road data ending up in Facebook’s shared container. Some doubt remains about the validity of the data: tests I ran in late 2017 on a previous device could conceivably have caused this and carried over to a newer device through a backup restore.]

Timeline