On Friday, I published a moderately unkind comment about Reddit; I said that while I love Reddit warts and all, its content is 80 to 90 percent dross. Mostly, that referred to the atheism subreddit (or r/atheism, in Reddit-speak). I visit the place five or six days a week, and there is always a lot of junk to wade through: from rickety, poorly worded arguments to reposts of news stories that first appeared days or weeks earlier. And, of course, there's the deluge of low-quality memes.

While the dreck is annoying on some level, I fully accept that it’s part of the give-and-take of online communities without gatekeepers. Anyone can join, and you never know if the next link or comment you read was left by someone with decent thinking/writing skills, or by a dim 13-year-old crudely rebelling against mom and dad saying grace over dinner. Like this specimen, say.

Both have a place at the Reddit table — but it’s a damn huge table and the dumb noise is cacophonous.

That’s all by way of introducing you to an interesting brand-new data set about Reddit, provided by Idibon, a business outfit that “helps companies understand their language data.” Idibon’s staff have backgrounds in fields like computational linguistics, psycholinguistics, and natural language processing. For their latest project, they applied those skills to hundreds of the most popular subreddits, to assess the overall bigotry and toxicity of the active Redditors associated with each one.

By that metric, r/atheism is among the worst, Idibon claims — in third place, right after r/theredpill (a men’s-rights subreddit) and r/opieandanthony (dedicated to the radio shock jocks).

On the other end of the scale, no doubt confounding expectations, is r/libertarian. It is the best-scoring Reddit community for absence of bigotry: the score Idibon computed from comments its annotators labeled bigoted was actually negative (a raw count can’t go below zero, but a vote-weighted score can), meaning that “despite having bigoted comments present,” those comments “were rejected by the community as a whole.”

But how could the analysts reliably draw these conclusions?

Comments were pulled via the Reddit API from the top 250 subreddits by number of subscribers, in addition to any subreddit mentioned in [this] AskReddit thread with over 150 upvotes. Comments were pulled from articles on the front page of each subreddit; 1,000 comments were randomly chosen from each subreddit for analysis, and any subreddit that had fewer than 1,000 comments was excluded from the analysis.
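Stripped of the API plumbing, the sampling rule they describe is simple. Here is a toy sketch of it; the function name and data layout are my own illustration, not anything from Idibon’s actual code:

```python
import random

def sample_for_analysis(comments_by_subreddit, sample_size=1000, seed=0):
    """Apply the stated rule: exclude any subreddit with fewer than
    sample_size comments, then randomly draw sample_size comments
    from each remaining subreddit."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    samples = {}
    for subreddit, comments in comments_by_subreddit.items():
        if len(comments) < sample_size:
            continue  # too few comments: excluded from the analysis
        samples[subreddit] = rng.sample(comments, sample_size)
    return samples
```

The fixed per-subreddit sample size is what makes the cross-subreddit comparison fair: every community is judged on the same number of comments, regardless of how busy it is.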

The company then assessed the data like this:

Toxic comments are ones that would make someone who disagrees with the viewpoint of the commenter feel uncomfortable and less likely to want to participate in that Reddit community. To be more specific, we defined a comment as Toxic if it met either of the following criteria:

1. Ad hominem attack: a comment that directly attacks another Redditor (e.g. “your mother was a hamster and your father smelt of elderberries”) or otherwise shows contempt/disagrees in a completely non-constructive manner (e.g. “GASP are they trying CENSOR your FREE SPEECH??? I weep for you /s”)

2. Overt bigotry: the use of bigoted (racist/sexist/homophobic etc.) language, whether targeting any particular individual or more generally, which would make members of the referenced group feel highly uncomfortable

However, the problem with only measuring Toxic comments is it biases against subreddits that simply tend to be more polarizing and evoke more emotional responses generally. In order to account for this, we also measured Supportiveness in comments — defined as language that is directly addressing another Redditor in a supportive (e.g. “We’re rooting for you!”) or appreciative (e.g. “Thanks for the awesome post!”) manner. By measuring both Toxicity and Supportiveness we are able to get a holistic view of community health that can be used to more fairly compare and contrast subreddit communities.
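In other words, each sampled comment ends up with one of three labels, and a community is scored on both its toxic rate and its supportive rate. A minimal sketch of that bookkeeping follows; the function and the label names are my own illustration, since the excerpt above doesn’t publish Idibon’s actual aggregation code:

```python
from collections import Counter

def community_health(labels):
    """Given per-comment labels ('toxic', 'supportive', or 'neutral')
    for one subreddit's sample, return the fraction of toxic comments
    and the fraction of supportive comments."""
    counts = Counter(labels)
    total = len(labels)
    toxic_rate = counts["toxic"] / total
    supportive_rate = counts["supportive"] / total
    return toxic_rate, supportive_rate
```

Reporting the two rates side by side is what gives the “holistic view” the quote describes: a polarizing subreddit can still look healthy if its supportive comments outweigh its venomous ones.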

For further background on Idibon’s methodology and results, go here.

I’m inclined to cushion r/atheism’s bad score somewhat with the observation that minorities such as atheists have a legitimate need to (sometimes) verbally slam their fist into the wall and blow off steam. We like to think that we’re amongst ourselves, and that our peers will understand, but of course that’s not necessarily how it works on the wide-open Internet. What we say is read and noted by outsiders too. Fairly or not, our expressions will shape their opinions of us as a group.

Maybe Idibon’s data will serve as a broad reminder that passionate discussions are good — and screaming invective, not so much.

=-=-=-=-=-=-=-=-=-=-=-=-=-=

P.S.: At Camels With Hammers, Dan Fincke made a related point last month.

I have cleaned up my Facebook feed drastically in the last couple years. I no longer hear much from anti-theists like the one in October who said “I’d like to smack that idiot in the mouth with a brick” when said “idiot” had done nothing more than reply to a cute owl video with the remark “God has given us so many wonderful creatures to enjoy”. That was the day I ended my friendship with the guy who liked to celebrate his violent feelings towards religious people. I don’t believe in being friends with or making excuses for bigots and bullies. I don’t care whether atheists or any other group is marginalized; I don’t just turn a blind eye when they use dehumanizing and abusive rhetoric about other people. I don’t let people get away with calling themselves skeptics while promoting a black and white picture of the world where their own tribe can do no wrong.



