About

Facebook is a great service. I have a profile, and so does nearly everyone I know under the age of 60.

However, Facebook hasn't always managed its users' data well. In the beginning, it restricted the visibility of a user's personal information to just their friends and their "network" (college or school). Over the past couple of years, the default privacy settings for a Facebook user's personal information have become more and more permissive. Facebook has also changed how your personal information is classified several times, sometimes in ways that confused its users. This has largely been part of Facebook's effort to correlate, publish, and monetize its social graph: a massive database of entities and links that covers everything from where you live to the movies you like and the people you trust.

This blog post by Kurt Opsahl at the EFF gives a brief timeline of Facebook's Terms of Service changes through April of 2010. It's a great overview, but I was a little disappointed it wasn't an actual timeline: hence my initial inspiration for this infographic.

Let me be clear about something: I like Facebook. It's helped me reconnect with dozens of people with whom I'd lost touch, and I admire the work their team does. I hope your takeaway from this infographic isn't "I'm deleting my account"; rather, I hope it's "I'm checking my privacy settings right now, and changing them to a level with which I'm comfortable".

Data

The data for this chart was derived from my interpretation of the Facebook Terms of Service over the years, along with my personal memories of the default privacy settings for different classes of personal data. The population sizes are statistics from Google, the Facebook Data Team, and wild guesses based on what seemed reasonable to me.

I welcome data corrections, so please leave a comment below if you have better numbers to share.

Types of Personal Data

Facebook's classification system for personal data has changed significantly over the years. I tried to capture what I thought were broad topics that have remained relatively consistent, but they might need some explanation.

Likes : A person, band, movie, web page, or any other entity represented in Facebook's social graph that has a "like" button. "Likes" started with status updates, but have now grown to encompass pretty much everything. In Facebook Newspeak, they're a "Connection".

Name, Picture, Gender, Birthday, Contact Info : Self-explanatory.

Extended Profile Data : Your family members, city, place of birth, religious views, favorite authors, schools attended -- anything that is an entity you can list a relationship to in your profile.

Friends : The people you've friended.

Networks : The personal networks you've set up on Facebook (e.g. colleges & universities or companies).

Wall posts & Photos : Self-explanatory.

Audiences

Audience sizes are based on averages, interpolations of those averages across time, and guesses from my personal experience where that data was unavailable.

One thing you may notice is that by 2009, the term "Network" for the inner circle is replaced by "FoF", or "Friends of Friends". Facebook introduced this in 2008 to cater to users whose networks were too large to be manageable. My guess is that this effectively shrank the potential number of people who could see this particular kind of data. I ballparked an estimate for the average size of this extended friend network by taking the average number of friends a user had in 2009 (130) and assuming there was on average a 2/3rds overlap with each of their friends, yielding an average of 8450 people.
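As a rough sketch, that back-of-the-envelope estimate can be written in a few lines of Python. The formula here is an assumption, since the post doesn't spell one out: each of a user's friends is taken to contribute friends × (1 − overlap) people who aren't already counted. The function name and parameters are hypothetical. Under this simple model, 8,450 corresponds to an overlap of one half, while a two-thirds overlap gives roughly 5,633, so the original calculation may have used a slightly different model.

```python
def estimate_fof_audience(avg_friends: int, overlap: float) -> float:
    """Estimate the size of a friends-of-friends audience.

    Assumed model (not taken from the post): every friend has
    `avg_friends` friends of their own, and a fraction `overlap`
    of those are already counted elsewhere.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    return avg_friends * avg_friends * (1.0 - overlap)


if __name__ == "__main__":
    # 130 is the 2009 average friend count quoted above.
    for overlap in (0.5, 2 / 3):
        size = round(estimate_fof_audience(130, overlap))
        print(f"overlap={overlap:.2f}: about {size} people")
```

Either way, the estimate lands in the thousands, which is the order of magnitude that matters on a logarithmic chart.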

Implementation

The audience scale is logarithmic, so that we can compare audience sizes of 100 and 1 billion. I also did a big no-no and mapped the audience size to the length of the slice, not its area. I don't feel too terrible about this, because the area comparison is already distorted by the log scale. Plus, frankly, mapping to length just looks better.
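To illustrate why the log scale is needed, here is a minimal Python sketch of mapping an audience size to a slice length. It is not the code behind the infographic: the function name, the 300-pixel maximum length, and the choice of base 10 are all assumptions made for the example.

```python
import math


def slice_length(audience: float,
                 max_audience: float = 1e9,
                 max_length: float = 300.0) -> float:
    """Map an audience size to a slice length on a log10 scale.

    `max_length` (in pixels) and `max_audience` are hypothetical
    values chosen for illustration only.
    """
    if audience < 1 or max_audience <= 1:
        raise ValueError("audience must be >= 1 and max_audience > 1")
    return max_length * math.log10(audience) / math.log10(max_audience)


if __name__ == "__main__":
    for audience in (100, 8450, 1e9):
        log_px = slice_length(audience)
        linear_px = 300.0 * audience / 1e9  # same mapping without the log
        print(f"{audience:>12,.0f}: log {log_px:6.1f}px, linear {linear_px:.5f}px")
```

With a linear mapping, an audience of 100 would get a tiny fraction of a pixel next to a billion-person slice; on the log scale it gets two ninths of the full length, so both ends of the range stay visible.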

I built this sketch using Processing.js. You're welcome to download the source. Sorry, no Internet Explorer.

About me

My name's Matt McKeon. I'm a developer with the Visual Communication Lab at IBM Research's Center for Social Software. The views expressed here are my own, and do not reflect those of IBM. You can find me on Twitter and (hah) Facebook.