This all started with the title's simple question. I was aware that the desktop app communicated with the phone after the initial handshake (the part where you have to scan a QR code so that it knows it's you and not someone else trying to read your conversations), but I wanted to know exactly what information was being sent around.

Disclaimer: I wanted to send this to Facebook first, but to report something you need a Facebook account, which I don’t have. Then I thought… it’s not worth reporting; they must know about it and have deemed it worth ignoring.

I initially started looking into the XHR calls being sent and received, but there was not much there, so I lost interest and forgot about it. Now, a couple of months later, getting bored of all the free time after quitting my job and with my girlfriend travelling on the other side of the planet, the interest sort of came back. How exactly are they passing the data, and what data is easily readable and modifiable from the desktop site?

NOTE: If you have the right privacy settings, this will not allow someone else to see your picture or last seen. That's why I say "freely accessible data" and not something like "gain access to private data".

There's an amazing white paper from the creators themselves on the security used to encrypt and decrypt every step of the process. The terminology goes way over my head, but it's there if you want to read through it.

You said something about the battery

The battery was what initially caught my attention, because when you are low on battery you get a warning on your desktop/web client telling you the device is nearing its bedtime.

The message that knew too much

When I found the source of the information (which is not encrypted, since it's WhatsApp communicating with itself, I guess), I found quite a lot more than I expected. The desktop client knows the make and model of the device, whether it's plugged in or not, the operating system version it's running, the build, and a few more things, like tokens and your id, which apparently follows the prefix+phone@c.us format.
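To make that id format concrete, here is a minimal Python sketch that builds and splits an id of the prefix+phone@c.us shape. The function names and the sample number are hypothetical, and the c.us suffix is only what showed up in this traffic; none of this comes from WhatsApp's own code.

```python
from typing import Tuple


def build_id(prefix: str, phone: str, suffix: str = "c.us") -> str:
    """Join a country prefix and a national number into prefix+phone@suffix."""
    return f"{prefix}{phone}@{suffix}"


def split_id(user_id: str) -> Tuple[str, str]:
    """Split an id into its numeric part and the suffix after the @."""
    number, _, suffix = user_id.partition("@")
    return number, suffix


# Hypothetical number: 34 is a country prefix, the rest is made up.
example = build_id("34", "600000000")
print(example)
print(split_id(example))
```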

There isn't much more for me to do with this information so I started looking into the other plain text messages being sent and received.

How can we have fun

First I decided to play a little with modifying data that represented something I understood, such as a phone number or a status, and seeing how it affected my desktop app. While you can obviously modify the server response, what’s the fun in that? The first thing I achieved was fetching someone’s profile picture without adding them to my address book. If we intercept an outgoing call for a profilePicThumb and change the number, we get a response from the server that contains the URL of that user’s profile image.

In this case the user is a Spanish phone number with the 34 prefix. Note that I can’t guarantee everybody has the same suffix after the @. The first sections of the outbound message seem to be ignored: one of them is an auto-incrementing message count and the other is sometimes a timestamp, sometimes… I don’t know what the 554 is.

But, everything was supposed to be encrypted!

Well, peer to peer messages are encrypted, group messages are encrypted, but there are still some plain text queries from the device that are human readable.

An interesting request made via WebSocket is the resumePic call, which sends an array of ids whose pictures you’d like to get. The problem is that you can add anyone to your phonebook, and WhatsApp doesn’t even check that they’re in your phonebook, because you could be the one receiving the message, or be part of a group with people you haven’t added but whose image you still want to see.

While this may initially seem harmless, the more I think about it the more concerned I am. I am able to get information about people I definitely haven’t added to my phonebook. If I only get the id back, I’m guessing that number has no linked account. However, the tag is a timestamp for the last seen time we see below the contact’s name in the chat screen.

I can see this being used by spammers to harvest people's numbers and, let me be paranoid, to get some juicy information. Programmatically monitoring the status, last seen time, and picture of a large number of users could yield profits for someone who knows what to do with that information.

Is there a limit on how many numbers you can ask the server about in a single request? Let’s try something stupid. I’m feeling generous, considering nobody bloody reads this anyway.

Open the browser console, create an array with a hundred numbers, and export it to a JSON string to play with in Burp a little more. The resulting array can be plugged into the previous call syntax.
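The same step can be done outside the browser. Below is a rough Python sketch, not what I actually ran in the console: the function name, the seed, and the choice of random 9-digit numbers under the 34 prefix are all assumptions for illustration.

```python
import json
import random
from typing import List, Optional


def make_ids(count: int, prefix: str = "34", digits: int = 9,
             seed: Optional[int] = None) -> List[str]:
    """Return `count` ids of the form prefix+phone@c.us with random phones."""
    rng = random.Random(seed)
    # Drawing from [10^(digits-1), 10^digits) keeps every phone at `digits` digits.
    low, high = 10 ** (digits - 1), 10 ** digits
    return [f"{prefix}{rng.randrange(low, high)}@c.us" for _ in range(count)]


ids = make_ids(100, seed=1)
payload = json.dumps(ids)  # JSON string to paste into the intercepted call
print(len(ids))
print(payload[:60])
```

The list may contain duplicates, since the numbers are drawn independently; for a quick experiment that hardly matters.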

I was a bit scared that the engineers of WhatsApp might have prevented this and would ban me for abusing their service. Luckily for me, they seem to have some throttle system in place, since the app returned only 26 results, none of them invalid. Further tests with different numbers reached up to 35 results. All of them came with a direct link to the profile image and the last seen time, and all of them are people I have no clue about. There are plenty of dog pictures though, which is nice.

Wait! Why 26, then 35? I didn't know. I knew the difference of 9 was because I hadn’t made sure the numbers had enough digits, but that still didn’t explain why 35 and not 91.
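A small check along these lines would have caught the short numbers before sending them. The 9-digit length is an assumption that fits Spanish national numbers, and the function name and sample values are made up for illustration.

```python
def short_numbers(phones, expected_digits=9):
    """Return the phones whose digit count differs from expected_digits."""
    return [p for p in phones if len(str(p)) != expected_digits]


# Integers drop leading zeros, and small random values are simply too short.
sample = [612345678, 12345678, 999, 700000001]
print(short_numbers(sample))  # prints [12345678, 999]
```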

So I ran the call again, with one thousand numbers