ABSTRACT

When social networking sites give users granular control over their privacy settings, the result is that some content across the site is public and some is not. How might this content--or characteristics of users who post publicly versus to a limited audience--be different? If these differences exist, research studies of public content could potentially be introducing systematic bias. Via Mechanical Turk, we asked 1,815 Facebook users to share recent posts. Using qualitative coding and quantitative measures, we characterize and categorize the nature of the content. Using machine learning techniques, we analyze patterns of choices for privacy settings. Contrary to expectations, we find that content type is not a significant predictor of privacy setting; however, some demographics such as gender and age are predictive. Additionally, with consent of participants, we provide a dataset of nearly 9,000 public and non-public Facebook posts.