Facebook's tendency to leak private information and photos has gotten the company in hot water, but the controversies may be missing a larger point. The service is all about sharing personal tastes and interests in a public forum, and the collective public musings can tell you a lot about the service's users.

How much? A new study paired personality profiles with data mining of people's "likes" on Facebook and found that the likes collectively tell us some remarkably specific things about political views, personality traits, happiness, drug use, and so on. On its own, the study doesn't tell us anything shocking, but it provides some amusement value when the authors dive into their numbers and find out which items were specifically correlated with which traits. That is how we find out that fans of curly fries probably outscored Sephora users on their SATs.

The work comes out of the myPersonality project, which has created a Facebook app that gives users a basic test of their psychological traits. If the users permit it, the researchers also get access to their Facebook profile and history of using the service to formally "like" something. At the time that the analysis in this study was done, the authors had data on nearly 60,000 users.

The study was remarkably simple: take a list of all the users' likes and start doing regressions to see if they could be correlated with a variety of specific traits, from basic demographic information like age and sexual orientation to personality traits and drug use. The end result was a score that reflected a simple test of accuracy: given two random members of the group that are on opposite sides of a score (say, gay and straight), how well could the algorithm do at predicting both correctly?
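The accuracy test described above can be sketched in a few lines of Python. This is only an illustration of the pairwise idea, not the authors' code: the function name `pairwise_accuracy`, the convention of counting ties as half, and the scores themselves are assumptions made up for the example.

```python
from itertools import product


def pairwise_accuracy(scores, labels):
    """Fraction of (group 1, group 0) pairs in which the group-1 user
    received the higher score. Ties count as half (an assumption)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one user in each group")

    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy, invented scores for six hypothetical users (not data from the study).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

result = pairwise_accuracy(scores, labels)
print(f"pairwise accuracy: {result:.2f}")
```

With these invented numbers, eight of the nine possible pairs are ordered correctly. A score of 0.5 would mean the predictions are no better than a coin flip, and 1.0 would mean every pair is ordered correctly.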

For the gay or straight question, remarkably well: 88 percent of the time, it would get them right (it did worse with lesbians). That's about the same as its prediction of political persuasion (Democrat vs. Republican) and religious persuasion (they only compared Christianity and Islam). The statistics got gender right 93 percent of the time, and they picked Caucasians and African Americans with 95 percent accuracy. From there, things dropped a bit; cigarette and alcohol use were down to around 70 percent accuracy, while the study got drug use and "being in a relationship" correct only about two-thirds of the time.

People use the "like" button with very different frequencies, so the authors tracked whether their predictions got better when a user was a bit twitchier with the mouse. For every value they checked, accuracy went up with the number of likes available to analyze, although the accuracy for predicting age tailed off at about 300 likes.

The authors also tested whether likes could be used to predict some personality traits that had been scored by their tests. There's some variability if a person takes the same test twice—their scores won't typically have a perfect 1.0 (full correlation) match, but rather a correlation in the 0.6 to 0.8 range for most tests. So they compared their like-based predictions to the inter-test variation. For the most part, the likes didn't do especially well, although they did manage to do much better than random chance. Intelligence and extroversion each have a between-test correlation of about 0.75; the like-based scores only correlated with the tests a bit more than half that. The one exception was openness; its inter-test correlation is 0.55, and the like-based predictions managed to capture about 80 percent of that.
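To make that comparison concrete, here is a rough back-of-the-envelope calculation in Python. It uses only the rounded figures quoted above, so the results are implied approximations rather than values taken from the paper, and the helper name `implied_correlation` is hypothetical.

```python
def implied_correlation(retest_r, share):
    """Correlation implied when a prediction captures `share` of the
    test-retest correlation `retest_r` (both are rounded article figures)."""
    if not 0 < retest_r <= 1:
        raise ValueError("retest_r must be in (0, 1]")
    return retest_r * share


# Openness: retest correlation of 0.55, with likes capturing about 80 percent.
openness_r = implied_correlation(0.55, 0.80)

# Intelligence and extroversion: retest correlation of about 0.75, with likes
# capturing "a bit more than half", so one half gives a lower bound.
intelligence_floor = implied_correlation(0.75, 0.50)

print(f"openness, implied correlation: about {openness_r:.2f}")
print(f"intelligence/extroversion, implied correlation: above {intelligence_floor:.3f}")
```

In other words, the openness prediction stands out because the test itself is less repeatable, not because the likes predict it far more strongly in absolute terms.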

The authors make some valid points about how we tend to focus on the leaking of specific information, like social security numbers and embarrassing pictures. But the more interactions we have in public forums like Facebook, the more details of our lives tend to leak. In some countries, details like sexual orientation or religion can put people's lives at risk.

That sober warning aside, the details of some of the correlations they found are simply hilarious. When they looked at the best predictors of high intelligence, they came up with science (naturally), thunderstorms (oddly), The Colbert Report (OK, I can kind of see it), and curly fries (huh?). On the low end of the intelligence scale, you wind up with Sephora-using, Harley-riding, Lady Antebellum fans who "like being a mom."

Self-identified gay users were unlikely to spend the time liking obviously gay-friendly groups like the No H8 Campaign or Gay Marriage, so the authors had to sharpen their predictions using things such as Wicked The Musical, Britney Spears, and Desperate Housewives. Straight users were picked out based on liking the Wu-Tang Clan, Shaq, and (bizarrely) “Being Confused After Waking Up From Naps.”

Some individual likes also said a lot about a person. For example, the authors found that Hello Kitty fans tended to score high on openness but lower on things like conscientiousness, agreeableness, and emotional stability. They also tended to be Democrats, for what it's worth.

PNAS, 2013. DOI: 10.1073/pnas.1218772110.