But that's really only the beginning. They've amassed the data, which is the hard part. As The Believer's Max Fenton put it, "They pulled off the big heist of making a database of all the humans that isn't explicitly evil. It's a babel, and it's great." But most of the data that Facebook has isn't as easy to parse as counting the +1s and comments on posts.

It's much more difficult to parse the real language that people use to communicate with their friends to determine what's going on in their lives and what that means they want to purchase. But people are flocking to the fields of natural language processing and sentiment analysis, among others, and Facebook will benefit from advances in those and related AI fields.

User-Centered

Perhaps the most highly publicized problem with Facebook's business came when General Motors pulled its advertising from the service, saying Facebook's ads didn't work. Today, AdAge revealed why that happened. GM wanted to take over pages as many companies do on media sites. Facebook wanted to keep the user experience it had built. GM walked and Facebook's users were protected from intrusive advertising. If you're an advertiser, maybe that gives you pause, but if you're a user: HEY, Zuck's looking out for you!

More broadly, we know when Facebook has erred with its users (remember Beacon?). But we don't know how many times they've avoided making mistakes that would have annoyed users. Can you imagine how many pitches they've gotten to increase revenue by placing more invasive ads? GM certainly wasn't the first and definitely won't be the last.

Privacy

Facebook is not exactly known as a sanctuary of privacy. Their business, after all, is selling you ads based on what you say and do. Insofar as you live on Facebook, you live a perfectly observed and recorded life. So I was shocked when I took a look at a recent paper by two scholars in the North Carolina Law Review arguing that Facebook should actually be seen as a model for privacy. Hear this out, though. It's interesting.

University of North Carolina legal scholars Andrew Chin and Anne Klinefelter look at the problem of "reidentification" and how Facebook appears to have solved it. The basic problem is that we call data anonymized once someone's name has been removed, but in reality, that doesn't make it anonymous. Many different studies have shown that by combining outside data with the output of a database, it's not that difficult to reidentify someone. For example:

Latanya Sweeney, then a graduate student at MIT, merged presumably anonymized Massachusetts state worker hospital records with voter registration records and was able to identify rather quickly the health records of then-Governor William Weld. Sweeney later published a broader study finding that 87% of the 1990 U.S. Census population could be identified using only gender, zip code, and full date of birth, and others reproduced this work in the 2000 Census with 63% success in identifying individuals.
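To make the mechanics concrete, here is a minimal sketch of that kind of linkage in Python. Everything in it is invented for illustration: the records, the placeholder names, and the `reidentify` helper are assumptions, not Sweeney's actual data or method. It only shows the general idea of joining a name-stripped dataset to a public one on gender, zip code, and date of birth.

```python
from collections import defaultdict

# Hypothetical "anonymized" hospital records: names are gone,
# but gender, zip code, and date of birth remain.
hospital_records = [
    {"gender": "M", "zip": "02138", "dob": "1950-01-15", "diagnosis": "condition A"},
    {"gender": "F", "zip": "02139", "dob": "1962-07-04", "diagnosis": "condition B"},
    {"gender": "M", "zip": "02139", "dob": "1971-03-22", "diagnosis": "condition C"},
]

# Hypothetical public voter roll with the same three fields plus a name.
voter_roll = [
    {"name": "Voter One", "gender": "M", "zip": "02138", "dob": "1950-01-15"},
    {"name": "Voter Two", "gender": "F", "zip": "02139", "dob": "1962-07-04"},
    {"name": "Voter Three", "gender": "F", "zip": "02139", "dob": "1962-07-04"},
    {"name": "Voter Four", "gender": "M", "zip": "02140", "dob": "1980-11-30"},
]

# The fields shared by both datasets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("gender", "zip", "dob")


def key(record):
    """Build the join key from the shared fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(records, roll):
    """Map a name to a diagnosis wherever exactly one voter shares the key."""
    names_by_key = defaultdict(list)
    for voter in roll:
        names_by_key[key(voter)].append(voter["name"])

    matches = {}
    for record in records:
        names = names_by_key.get(key(record), [])
        if len(names) == 1:  # a unique match means the record is no longer anonymous
            matches[names[0]] = record["diagnosis"]
    return matches


matches = reidentify(hospital_records, voter_roll)
print(matches)
```

In this toy data, only the first hospital record is pinned to a single person. The second matches two voters, so it stays ambiguous, and the third matches no one. The more people who are unique on those three fields, the more records fall into the first group.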

This is a big problem. Big enough that University of Colorado privacy and legal scholar Paul Ohm declared anonymization the core problem with our current world's "broken promises of privacy."