I’ve been writing a lot about privacy in social networks, and sometimes the immediacy gets lost during the more theoretical debates. Recently though I’ve been investigating a massive privacy breach on Facebook’s application platform which serves as a sobering case study. Even to me, the extent of unauthorised data flow I found and the cold economic motivations keeping it going were surprising. Facebook’s application platform remains a disaster from a privacy standpoint, dampening one of the more compelling features of the network.

It started with an email from Henrik, an LBT reader who was concerned when he noticed his Facebook profile picture, as well as those of his friends, being included in advertisements within the site. These ads showed up next to a third-party Facebook application called “What Eukaryotic Organelle Are You?”, one of the many inane user-generated quizzes which are popular on the site. I followed the tip and tried it myself, and sure enough saw an ad telling me to log in and see which of my friends had secret crushes on me, alongside photos and names of a few Facebook friends of mine. I went on to find dozens of such ads on similar quizzes.

Examining the page’s source, I found that these ads were being served in an iframe within the third-party application’s canvas, with quite suspicious URLs like:

http://sochr.com/i.php

&name=[Joseph Bonneau]&nx=[My User ID]&age=[My DOB]&gender=[My Gender]&pic=[My Photo URL]

&fname0=[Friend #1 Name]&fname1=[Friend #2 Name]&fname2=[Friend #3 Name]&fname3=[Friend #4 Name]

&fpic0=[Friend #1 Photo URL]&fpic1=[Friend #2 Photo URL]&fpic2=[Friend #3 Photo URL]&fpic3=[Friend #4 Photo URL]

&fb_session_params=[All of the quiz application’s session parameters]

Three data leaks are happening here. The first (the name, nx, age, gender and pic parameters) is bad but to be expected: my personal data is being sent to the ad server. The second (the fname and fpic parameters) is worse: the names and photos of four of my friends are sent along to the ad server. Note the data flow here: the third-party quiz application queries Facebook for my friends’ data, then includes it in the URL of the requested ads so that the ad server gets the data too. Data is passed in the URL because this is the only way to communicate with content in an iframe. When I refreshed the page, a different set of friends’ information was passed, so the ad server can slowly build up a database of user information.
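To make the three categories concrete, here is a minimal Python sketch that sorts the query parameters of such an ad URL by whose data they expose. The parameter names are the ones seen in the URL above; the host, the values and the function name classify_leaks are made up for illustration, and the sketch assumes the parameters arrive as an ordinary query string.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Parameter names as observed in the ad URL; everything else here is illustrative.
OWN_PARAMS = {"name", "nx", "age", "gender", "pic"}
FRIEND_PARAM = re.compile(r"^(fname|fpic)\d+$")
SESSION_PARAM = "fb_session_params"


def classify_leaks(url):
    """Group the query parameters of an ad URL by whose data they expose."""
    leaks = {"own": {}, "friends": {}, "session": {}, "other": {}}
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key in OWN_PARAMS:
            leaks["own"][key] = value
        elif FRIEND_PARAM.match(key):
            leaks["friends"][key] = value
        elif key == SESSION_PARAM:
            leaks["session"][key] = value
        else:
            leaks["other"][key] = value
    return leaks


# Hypothetical URL with placeholder values (not captured traffic).
example_url = (
    "http://ads.example/i.php?name=Alice&nx=1001&age=1980-01-01&gender=f"
    "&pic=http%3A%2F%2Fimg.example%2Falice.jpg"
    "&fname0=Bob&fname1=Carol&fpic0=http%3A%2F%2Fimg.example%2Fbob.jpg"
    "&fb_session_params=opaque-token"
)
leaks = classify_leaks(example_url)
```

Running something like this over the iframe URLs seen across several page refreshes would show the set of friend parameters changing each time, which is how a database of friends could accumulate on the ad server.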

The third bit (the fb_session_params parameter) is the amazing part: all of my session parameters were sent to the ad server as well. Because I authorised the application, these parameters are a capability to query Facebook for my data and my friends’ data. Sure enough, the ad server then went on to query Facebook’s databases on my behalf! From its iframe, the ad server’s JavaScript sends queries to Facebook with the application’s session parameters. The results are then sent back to the ad server. You can watch all of this happen in real time using a packet sniffer, and it’s quite amazing. There’s a great writeup from a week ago by another security researcher investigating these matters, with some example queries the ad server’s JavaScript is making, requesting things like the set of all friends who live in the same city, are single, and share interests with me.
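Why the session parameters matter so much can be shown with a toy model. The Python sketch below is not Facebook’s API; ToyPlatform and its methods are invented purely to illustrate the point that a session token works as a bearer capability: the platform checks that the token is valid, but has no way of knowing who is presenting it.

```python
import secrets


class ToyPlatform:
    """Toy stand-in for a social platform; hypothetical, not Facebook's real API."""

    def __init__(self, friends):
        self._friends = friends   # user -> list of friend names
        self._sessions = {}       # token -> (user, app the user authorised)

    def authorise(self, user, app):
        """The user authorises an app; the app receives a session token."""
        token = secrets.token_hex(16)
        self._sessions[token] = (user, app)
        return token

    def friends_of(self, token):
        """Answer a query for whoever presents a valid token."""
        # Only the token is checked; the caller's identity is never verified.
        user, _app = self._sessions[token]
        return list(self._friends[user])


platform = ToyPlatform({"alice": ["bob", "carol"]})

# The quiz application obtains a token when Alice authorises it.
quiz_token = platform.authorise("alice", "quiz-app")
quiz_view = platform.friends_of(quiz_token)

# The quiz application copies the token into the ad iframe URL,
# so the ad server can now make exactly the same query.
leaked_token = quiz_token
ad_server_view = platform.friends_of(leaked_token)
```

In this model the ad server sees precisely what the quiz application sees, even though Alice never authorised it.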

This is all in direct violation of Facebook’s Terms of Service and Platform Guidelines, which clearly prohibit using user data for anything but the application it was given to, as well as transferring session parameters to a third party. Yet this violation is occurring on an epic scale. This quiz was created by an application called QuizMonster which allows users to create their own quizzes. It’s incredibly popular, with almost 1 million users having created a quiz, meaning tens of millions have possibly taken a quiz and been subject to these ads. Many other applications no doubt used these ad servers as well.

The ads are mostly served from two domains, SocialHour and SocialReach, which both have websites claiming to be leading “social monetization platforms.” (SocialReach is currently down, but is still cached by Google.) They seem quite dodgy though. Their domains are registered through anonymous DNS registrars, and the ads themselves lead to a scam: after taking a quiz, users must enter their mobile number, and are later hit with a surprise $20-per-month subscription fee.

This hints at the root of the problem: it’s tough to make money even as a popular Facebook application, and so developers have been forced to resort to these sleazy ads because they pay well. SocialReach even promises on its web site to pay the highest CPM in the industry. Facebook similarly has an economic incentive to look the other way. They clearly state in their terms of service that they may fail to enforce some of the terms, so users have no recourse if Facebook is loose with the rules to keep the platform attractive. Most damningly, Facebook recently verified applications showing ads like these.

This scheme may be a violation of Principle 7 of the UK Data Protection Act, as Facebook is transferring data to a data processor on its behalf who is not upholding Facebook’s privacy policy, but it’s unclear where the liability lies. Unless users are complaining en masse, Facebook has little reason to police the platform, as they have crafted their terms and conditions to disclaim all liability.

Speaking of complaints: after we communicated, Henrik filed a complaint with Facebook through their standard interface over three weeks ago, clearly stating that his data was being sent to places he hadn’t authorised. He received no response, but last night Facebook acted, most likely with threats of legal action, and both SocialReach and SocialHour have stopped showing ads on Facebook. Evidently Facebook received multiple complaints, probably more about the deceptive mobile phone subscription than the data theft, but it was enough to get Facebook to move.

While it’s a positive sign that Facebook eventually did step in, this had been going on for at least a month, and the data of millions of users has probably already passed through these ad servers. The long-term problem is that the underlying security model is completely broken. Facebook applications get access to all data of users who sign up, though users sign up for dozens of one-time-use applications like these quizzes without thinking twice. There are hundreds of applications springing up every day, and Facebook’s model of implementing no technical sandboxing and policing applications only when things go wrong is completely unscalable.

This isn’t a new problem: running untrusted applications with limited privileges has been a research topic in operating systems, browsers, and mobile phones for years. Mobile phones may be the closest parallel. The iPhone’s application dynamics are quite similar to Facebook’s, but its applications have been (mostly) successfully run at reduced privileges without issue. A nice research paper last year pointed out that it would be easy to isolate social networking applications from ever seeing user data, as most of them (like quizzes) don’t even need it.
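One way such isolation might look, sketched in Python: the untrusted application returns markup containing placeholders, and only the platform substitutes the private values when rendering for the user. This is my own illustration under assumed names (quiz_app_render, platform_fill and the {{...}} placeholder syntax are all hypothetical), not necessarily the design proposed in that paper.

```python
import html
import re

# Placeholders the platform is willing to fill in; the syntax is made up for this sketch.
PLACEHOLDER = re.compile(r"\{\{(viewer_name|friend_count)\}\}")


def quiz_app_render(result):
    """Untrusted application code: sees the quiz result, never the profile."""
    return "<p>{{viewer_name}}, you are a " + html.escape(result) + "!</p>"


def platform_fill(markup, profile):
    """Trusted platform code: substitutes private fields after the app has run."""
    return PLACEHOLDER.sub(
        lambda match: html.escape(str(profile[match.group(1)])), markup
    )


profile = {"viewer_name": "Alice", "friend_count": 2}
page = platform_fill(quiz_app_render("mitochondrion"), profile)
```

The quiz can still greet the user by name, yet the application never holds the name and so has nothing to hand to an ad server.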

Given the huge body of applications already out there, Facebook is probably stuck with its current model of user consent and legal policing with little real security. For all we complain about abstract notions of privacy, this technical shortcoming will probably be the biggest privacy headache for the foreseeable future.

Thanks to Henrik, Richard Clayton, and Ben Edelman for help with this report.