We spend a lot of time thinking about what to post on Facebook. Should you argue that political point your high school friend made? Do your friends really want to see yet another photo of your cat (or baby)? Most of us have, at one time or another, started writing something and then, probably wisely, changed our minds.

Unfortunately, the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren't entirely private.

Facebook calls these unposted thoughts "self-censorship," and insights into how it collects these nonposts can be found in a recent paper written by two Facebookers. Sauvik Das, a Ph.D. student at Carnegie Mellon and summer software engineer intern at Facebook, and Adam Kramer, a Facebook data scientist, have put online an article presenting their study of the self-censorship behavior collected from 5 million English-speaking Facebook users. It reveals a lot about how Facebook monitors our unshared thoughts and what it thinks about them.

The study examined aborted status updates, posts on other people's timelines, and comments on others' posts. To collect the text you type, Facebook sends code to your browser. That code automatically analyzes what you type into any text box and reports metadata back to Facebook.
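To make "reports metadata" concrete, here is a minimal sketch of what a metadata-only report could look like. It is not Facebook's code. The names `ComposeSession` and `self_censorship_report`, the five-character minimum, and the ten-minute window are all assumptions chosen for illustration. The point is only that a script can flag an abandoned draft without ever transmitting the draft's text.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, chosen for illustration only.
MIN_CHARS = 5          # ignore stray keystrokes shorter than this
WINDOW_SECONDS = 600   # how long to wait for a post before flagging


@dataclass
class ComposeSession:
    """What a page script could observe about one text box (assumed shape)."""
    max_chars_typed: int         # longest the draft ever got
    started_at: float            # seconds, when typing began
    posted_at: Optional[float]   # None if the draft was never submitted


def self_censorship_report(session: ComposeSession) -> dict:
    """Build a metadata-only record; the draft text is never included."""
    typed_enough = session.max_chars_typed >= MIN_CHARS
    posted_in_time = (
        session.posted_at is not None
        and session.posted_at - session.started_at <= WINDOW_SECONDS
    )
    # Only a yes/no flag is reported, not the characters that were typed.
    return {"self_censored": typed_enough and not posted_in_time}
```

In this sketch, a 40-character draft that is never posted is flagged, while the same draft posted 30 seconds later is not. Nothing stops a script of this kind from adding the text itself to the report, which is why the distinction between metadata and content rests on policy rather than on technology.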

Storing text as you type isn't uncommon on other websites. For example, if you use Gmail, your draft messages are automatically saved as you type them. Even if you close the browser without saving, you can usually find a (nearly) complete copy of the email you were typing in your Drafts folder. Facebook is using essentially the same technology here. The difference is that Google is saving your messages to help you. Facebook users don't expect their unposted thoughts to be collected, nor do they benefit from it.

It is not clear to the average reader how this data collection is covered by Facebook's privacy policy. In Facebook’s Data Use Policy, under a section called "Information we receive and how it is used," it’s made clear that the company collects information you choose to share or when you "view or otherwise interact with things." But nothing suggests that it collects content you explicitly don’t share. Typing and deleting text in a box could be considered a type of interaction, but I suspect very few of us would expect that data to be saved. When I reached out to Facebook, a representative told me that the company believes this self-censorship is a type of interaction covered by the policy.

In their article, Das and Kramer say they send back to Facebook only information that indicates whether you self-censored, not what you typed. The Facebook rep I spoke with agreed that the company isn’t collecting the text of self-censored posts. But it’s certainly technologically possible, and it’s clear that Facebook is interested in the content of your self-censored posts. Das and Kramer’s article closes with the following: "we have arrived at a better understanding of how and where self-censorship manifests on social media; next, we will need to better understand what and why." This implies that Facebook wants to know what you are typing in order to understand it. The same code Facebook uses to check for self-censorship can tell the company what you typed, so the technology to collect the data it wants already exists.

It is easy to connect this to all the recent news about NSA surveillance. On the surface, it's similar enough. An organization is collecting metadata (that is, everything but the content of a communication) and analyzing it to understand people's behavior. However, there are some important differences. While it may be uncomfortable that the NSA has access to our private communications, the agency is monitoring things we have actually put online. Facebook, on the other hand, is analyzing thoughts that we have intentionally chosen not to share.

This may be closer to the recent revelation that the FBI can turn on a computer's webcam without activating the indicator light to monitor criminals. People surveilled through their computers’ cameras aren’t choosing to share video of themselves, just as people who self-censor on Facebook aren’t choosing to share their thoughts. The difference is that the FBI needs a warrant but Facebook can proceed without permission from anyone.

Why does Facebook care anyway? Das and Kramer argue that self-censorship can be bad because it withholds valuable information. If someone chooses not to post, they claim, "[Facebook] loses value from the lack of content generation." After all, Facebook shows you ads based on what you post. Furthermore, they argue that it’s not fair if someone decides not to post because he doesn't want to spam his hundreds of friends, since a few people could be interested in the message. "Consider, for example, the college student who wants to promote a social event for a special interest group, but does not for fear of spamming his other friends—some of who may, in fact, appreciate his efforts," they write.

This paternalistic view isn’t abstract. Facebook studies this because the more its engineers understand about self-censorship, the more precisely they can fine-tune their system to minimize self-censorship’s prevalence. This goal—designing Facebook to decrease self-censorship—is explicit in the paper.