For months now, social media companies have been grappling with how to minimize or eradicate hate speech on their platforms. YouTube has been working to make sure advertisers' content doesn't show up on hateful videos. Instagram is using AI to delete unsavory comments. And earlier this week, ProPublica reported on the internal training materials Facebook gives to the content managers who moderate comments and postings on the platform on how to calculate what is and isn’t hate speech.

According to the report, the rules use a deliberate, if strange, logic in determining how to protect certain classes of people from hate speech while not protecting others. ProPublica points to a specific example from the training materials: under Facebook's rules, "white men" is a protected class whereas "black children" is not.

How the Rules Work

According to Facebook’s rules, there are protected categories---like sex, gender identity, race, and religious affiliation---and non-protected categories---like social class, occupation, appearance, and age. If speech refers to the former, it’s hate speech; if it refers to the latter, it’s not. So, “we should murder all the Muslims” is hate speech. “We should murder all the poor people” is not.

This binary designation might make some uncomfortable, but it’s when protected and unprotected classes get linked together in a sentence---a compound category---that Facebook’s policies become extra strange. Facebook’s logic dictates the following:

Protected category + Protected category = Protected category

Protected category + Unprotected category = Unprotected

To illustrate this, Facebook’s training materials provide three examples---“white men”, “female drivers”, and “black children”---and state that only the first of these is protected from hate speech. Why? Because “white” + “male” = protected class + protected class, and thus the resulting class of people is protected. Counterintuitively, because “black” (a protected class) modifies “children” (not protected), “black children” is unprotected.
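The combination rule can be sketched as a toy function. Everything here---the category lists, the function name---is an illustrative assumption drawn from the examples above, not Facebook's actual implementation.

```python
# Toy model of the compound-category rule described in the training materials.
# These category sets are illustrative assumptions, not Facebook's real lists.
PROTECTED = {"white", "black", "male", "female", "muslim"}
UNPROTECTED = {"driver", "child", "poor", "teacher"}  # listed for context

def is_protected(*categories):
    """A compound group is protected only if every modifier is itself protected."""
    return all(c in PROTECTED for c in categories)

print(is_protected("white", "male"))     # "white men" -> True
print(is_protected("female", "driver"))  # "female drivers" -> False
print(is_protected("black", "child"))    # "black children" -> False
```

Under this sketch, a single unprotected modifier is enough to strip protection from the whole group, which is exactly the counterintuitive behavior the training materials describe.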

Math + Language = Murky

In math, this kind of logical rule-setting is called symbolic logic, and it has well-defined rules. The discipline was created in the nineteenth century by mathematician George Boole, and has since become essential to the development of everything from computer processors to linguistics. But you don’t need a PhD in logic or the philosophy of language to recognize when its basic rules are being violated. “Where did @facebook’s engineers take their math classes? Members of subset C of set A are still members of A,” tweets Chanda Prescod-Weinstein, an astrophysicist at the University of Washington.

Philosophers of language think a lot about how modifying a category alters the logic of a sentence. Sometimes when you have a word for a category---like white people---and you replace it with a subset of that same category---like white murderers---the inference doesn’t follow. Sometimes it does. For instance, take the phrase “All birds have feathers” and replace it with “All white birds have feathers.” The second sentence still makes logical sense; it’s a good inference. But take “Some bird likes nectar” and replace it with “Some white bird likes nectar,” and the claim may no longer be true---maybe only green birds like nectar. It’s a bad inference.

Facebook’s rules appear to assume that whenever a protected category is modified by an unprotected one, the inference is bad. So even though “black people” is a protected class, under Facebook’s rules it doesn’t follow that “black children” is one too, even though the average person looking at that example would say that black children are a subset of black people.

The fact is, there isn’t a way to know systematically whether replacing a category with a subcategory will lead to a good or bad inference. “You have to plug in the different examples,” says Matt Teichman, a philosopher of language at the University of Chicago. “You have to just look at the complexity of what’s happening to see for sure.”