Having already ruled out 3 libraries, I put my hopes on the 4th and final one: profanity-filter .

profanity-filter

profanity-filter uses Machine Learning! Sweet!

Turns out, it’s really slow. Here’s a benchmark I ran in December 2018 comparing (1) profanity-filter , (2) my library profanity-check , and (3) profanity (the one with the list of 32 words):

A human could probably do this faster than profanity-filter can

I needed to be able to perform many predictions in real time, and profanity-filter was not even close to being fast enough. But hey, maybe this is a classic tradeoff of accuracy for speed, right?

Nope.

At least profanity-filter is not dead last this time

None of the libraries I’d found on PyPI met my needs, so I built my own.

Building profanity-check, Part 1: Data

I knew that I wanted profanity-check to base its classifications on data to avoid being subjective (read: to be able to say I used Machine Learning). I put together a combined dataset from two publicly-available sources:

the “Twitter” dataset from t-davidson/hate-speech-and-offensive-language, which contains tweets scraped from Twitter.

the “Wikipedia” dataset from this Kaggle competition published by Alphabet’s Conversation AI team, which contains comments from Wikipedia’s talk page edits.

Each of these datasets contains text samples hand-labeled by humans through crowdsourcing sites like Figure Eight.

Here’s what my dataset ended up looking like:

Combined = Tweets + Wikipedia

The Twitter dataset has a column named class that’s 0 if the tweet contains hate speech, 1 if it contains offensive language, and 2 if it contains neither. I classified any tweet with a class of 2 as “Not Offensive” and all other tweets as “Offensive.” The Wikipedia dataset has several binary columns (e.g. toxic or threat ) that represent whether or not that text contains that type of toxicity. I classified any text that contained any of the types of toxicity as “Offensive” and all other texts as “Not Offensive.”

Building profanity-check, Part 2: Training

Now armed with a cleaned, combined dataset (which you can download here), I was ready to train the model!

I’m skipping over how I cleaned the dataset because, honestly, it’s pretty boring— if you’re interested in learning more about preprocessing text datasets check out this or this.

Are you also surprised the code is so short? Apparently scikit-learn does everything.

Two major steps are happening here: (1) vectorization and (2) training.

Vectorization: Bag of Words

I used scikit-learn 's CountVectorizer class, which basically turns any text string into a vector by counting how many times each given word appears. This is known as a Bag of Words (BOW) representation. For example, if the only words in the English language were the , cat , sat , and hat , a possible vectorization of the sentence the cat sat in the hat might be:

“the cat sat in the hat” -> [2, 1, 1, 1, 1]

The ??? represents any unknown word, which for this sentence is in . Any sentence can be represented in this way as counts of the , cat , sat , hat , and ??? !

A handy reference table for the next time you need to vectorize “cat cat cat cat cat”

Of course, there are far more words in the English language, so in the code above I use the fit_transform() method, which does 2 things:

Fit: learns a vocabulary by looking at all words that appear in the dataset.

learns a vocabulary by looking at all words that appear in the dataset. Transform: turns each text string in the dataset into its vector form.

Training: Linear SVM

The model I decided to use was a Linear Support Vector Machine (SVM), which is implemented by scikit-learn 's LinearSVC class. This and this are good introductions if you don’t know what SVMs are.

The CalibratedClassifierCV in the code above exists as a wrapper to give me the predict_proba() method, which returns a probability for each class instead of just a classification. You can pretty much just ignore it if that last sentence made no sense to you, though.

Here’s one (simplified) way you could think about why the Linear SVM works: during the training process, the model learns which words are “bad” and how “bad” they are because those words appear more often in offensive texts. It’s as if the training process is picking out the “bad” words for me, which is much better than using a wordlist I write myself!

A Linear SVM combines the best aspects of the other profanity detection libraries I found: it’s fast enough to run in real-time yet robust enough to handle many different kinds of profanity.

Caveats

That being said, profanity-check is far from perfect. Let me be clear: take predictions from profanity-check with a grain of salt because it makes mistakes. For example, its not good at picking up less common variants of profanities like “f4ck you” or “you b1tch” because they don’t appear often enough in the training data. You’ll never be able to detect all profanity (people will come up with new ways to evade filters), but profanity-check does a good job at finding most.

profanity-check

profanity-check is open source and available on PyPI! To use it, simply

$ pip install profanity-check