For many, encountering hate on the Internet has become a routine part of the online experience. According to the Pew Research Center, 41% of American adults have experienced online harassment, and 66% have witnessed it.1 For those on the receiving end of online vitriol and bigotry, there is no mistaking what is happening: these are words that wound, and recipients often define them as hate speech. But defining what constitutes online hate speech raises many questions. With only words on a screen, and no context about the speaker or the speaker’s actions, can we create generally applicable rules and definitions that capture hate speech while excluding speech that may sound similar but is not hateful, such as news articles, song lyrics, or satire? Or is hate speech simply something you recognize when you see it?

What if we could use rules, tests, and parameters to isolate hate speech? Can we identify and analyze elements like speaker intent, context, identity, tone, audience, or any number of indicators that transform words into meanings and change an innocuous statement into a verbal assault?

Combating the proliferation of online hate speech and understanding its mechanics is a complex undertaking. We believe, however, that it can be done. And one way we are working to do so is by teaching machines to recognize hate.

The Online Hate Index (OHI), a joint initiative of ADL’s Center for Technology and Society and UC Berkeley’s D-Lab, uses machine learning to transform human understanding of hate speech into a scalable tool that can be deployed across internet content to discover the scope and spread of online hate speech. Through a constantly evolving machine learning process, grounded in a protocol developed by a team of human coders for what does and does not constitute hate speech, the tool will uncover trends and patterns in hate speech across different online platforms, allowing us to push for the changes necessary to make online communities safe and inclusive spaces. We have completed the first phase of this work and are eager to advance the project. Critically, the tool can identify individual instances of hateful and abusive speech, helping solve a problem that has been inadequately addressed by relying on platform users to report abuse and violations of terms of service agreements.

Proactive moderation of hate speech and abuse in online communities can effect substantial changes in online environments.2 A notable example is Reddit, the massively popular web forum comprising roughly one million user-generated community boards called “subreddits.”3 Subreddits cover a wide range of topics, from the unusual to the unsavory. While this breadth has made the website inviting to a plethora of groups, organizations, and communities, it has also made Reddit home to those intent on spreading racism, misogyny, anti-Semitism, homophobia, and other forms of hate. Between June and August 2015, Reddit shut down a number of its more noxious, hate-fueled subreddits. Researchers studying the response found that users who had frequented the banned subreddits engaged in fewer instances of hate speech as they spent time elsewhere on the site, and that hateful rhetoric across the entire website diminished as a result of banning the small number of spaces dedicated to and encouraging of discriminatory and hateful speech.4 Reddit took further action in October 2017, imposing new restrictions on violent content that quickly led to bans of several pages dedicated to hateful speech or ideologies, including National Socialism, Nazi, and Far_Right.5 Reddit has also served as a training ground for our machine learning model, which has combed through thousands of user comments in order, with the help of human coders, to learn to identify hate speech.
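The training process described above, in which human coders label comments and a model learns from those labels, follows the general shape of a supervised text-classification pipeline. The sketch below is purely illustrative: the library choice (scikit-learn), the features (TF-IDF n-grams), the classifier (logistic regression), and the placeholder example comments are all assumptions for demonstration, not the OHI’s actual implementation or data.

```python
# Illustrative sketch of a supervised text-classification pipeline,
# NOT the OHI's actual model, features, or labeling protocol.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical placeholder data standing in for coder-labeled comments
# (1 = coded as hate speech, 0 = not hate speech).
comments = [
    "you people don't belong here, go back where you came from",
    "great write-up, thanks for sharing the sources",
    "everyone from that group should be driven out of this community",
    "the game last night was amazing",
]
labels = [1, 0, 1, 0]

# Convert comments to TF-IDF word and bigram features, then fit a
# logistic regression classifier on the human-assigned labels.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(comments, labels)

# Score a new, unlabeled comment: probability it would be coded as hate speech.
prob_hate = model.predict_proba(["thanks, this was really helpful"])[0][1]
print(round(prob_hate, 2))
```

In practice, the labeled corpus would contain thousands of comments, and the learned probabilities could then be applied at scale to unlabeled content across a platform.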

Online communities have been described as the modern public square, a space for opinions to be expressed and voices to be heard. In reality, though, not everyone has equal access to this public square, and not everyone has the privilege to speak without fear.6 Hateful and abusive online speech forces out other voices, excluding the marginalized and underrepresented from public discourse.7



By combining social science and machine learning, the OHI holds the promise of bringing more humanity to the internet. By helping us understand speech on the internet, the OHI will not only make online communities safer and more inclusive; it will make them more protective of speech and more welcoming to a wide array of voices.



In this document, we will outline the conceptualization and operationalization of online hate speech and the building of the machine learning model used to understand it. We will also discuss the techniques necessary to make the model as accurate as possible, along with some initial results, which indicate the features of speech that most strongly predict whether a reader will consider an online comment hate speech. Finally, we will discuss the way forward, and how we see the OHI scaling up and functioning in the broader online world.