Just one percent of Reddit's 36,000 subreddits is responsible for 74 percent of intergroup conflicts on the platform. That's according to a new study of Reddit “raids”—organized online spats between subreddits—from a group of Stanford University professors and scholars.

The authors sought to shed light on a longstanding question in computer science, which asks how online communities interact with each other, especially in a negative manner. Reddit data provides “a way to look at antisocial behavior in online environments [among] explicit user-defined communities,” said postdoctoral researcher Srijan Kumar, who wrote the paper alongside PhD candidate William Hamilton, associate professor Jure Leskovec, and professor Dan Jurafsky.

Reddit raids tend to cluster around communities that are controversial and popular, including r/the_donald and r/Drama, found the peer-reviewed study, which will be presented at The Web Conference in Lyon, France in April.

Using publicly available data, the team looked at 137,113 interactions over a period of 40 months between 2014 and 2017. The specific interactions analyzed were “cross-links,” or any time one subreddit linked to a post from another.
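To make the idea concrete, here is a minimal sketch (not the authors' code) of how a cross-link could be detected: scan a post's body for Reddit permalink URLs pointing at a different subreddit. The helper name, regex, and example text are illustrative assumptions.

```python
import re

# Matches Reddit post permalinks like /r/SomeSub/comments/abc123 and
# captures the target subreddit name. Hypothetical pattern for illustration.
CROSSLINK_RE = re.compile(
    r"(?:https?://(?:www\.)?reddit\.com)?/r/(\w+)/comments/\w+", re.IGNORECASE
)

def find_crosslinks(source_subreddit, post_body):
    """Return the subreddits this post cross-links to.

    Links back into the same community don't count; only
    inter-community links are cross-links in the study's sense.
    """
    targets = CROSSLINK_RE.findall(post_body)
    return [t for t in targets if t.lower() != source_subreddit.lower()]

print(find_crosslinks(
    "conspiracy",
    "Come look: https://www.reddit.com/r/Documentaries/comments/abc123",
))  # → ['Documentaries']
```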

For instance, the following post was shared on r/Conspiracy in 2015, and directed users to target a post about a 9/11 film in r/Documentaries.

“Come look at all the brainwashed idiots in r/Documentaries,” said the post’s title.

“Seriously, none of those people are willing to even CONSIDER that our own country orchestrated the 9/11 attacks,” it continued. “They are all 100% certain the ‘turrists’ were behind it all, and all of the smart people who argue it are getting downvoted to the depths of hell. Damn shame. Wish people would do their research. Here's the link.”

In most cases, active community members (nicknamed “elder members” by Kumar) will initiate the attacks. Curiously, however, it’s the less active members who carry them out, like worker bees. The attacks generally look the same. Users from Group X will move to Group Y and harass the defenders in the targeted thread. Attackers almost always interact with other attackers, creating an echo chamber of thoughts and opinions.

The team used a psychological dictionary of “anger” words to measure the outcomes of these conflicts—which types of responses led to successful versus unsuccessful outcomes. For the purposes of the study, a successful outcome was one in which aggressive behavior from Group X didn’t cause a decrease in Group Y’s subreddit activity. What they found was that engaging with attackers, or fighting back, more often resulted in a better outcome, perhaps by diminishing the echo chamber.
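Lexicon-based scoring of this kind can be sketched in a few lines: count what fraction of a comment's words appear in an anger word list. The study used a full psychological dictionary; the tiny word set below is an illustrative stand-in, not the actual lexicon.

```python
# Toy anger lexicon — an illustrative assumption, not the study's dictionary.
ANGER_WORDS = {"idiots", "hate", "stupid", "damn", "angry"}

def anger_score(comment):
    """Fraction of words in the comment that are 'anger' words."""
    words = comment.lower().split()
    if not words:
        return 0.0
    # Strip trailing punctuation before looking words up in the lexicon.
    return sum(w.strip(".,!?'\"") in ANGER_WORDS for w in words) / len(words)

print(anger_score("Come look at all the brainwashed idiots"))  # 1 of 7 words
```

A real pipeline would aggregate such scores over every comment in the attacked thread to compare responses that did or didn't repel the raid.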

According to Kumar, the subreddits most responsible for provoking conflict are: r/subredditdrama, r/circlebroke, r/shitliberalssay, r/drama, r/conspiracy, r/bestofoutrageculture, r/hearthstone, r/shitamericanssay, r/mensrights, r/outoftheloop, r/badhistory, r/hailcorporate, r/copypasta, r/circlebroke2, r/dotamasterrace, r/nintendoswitch, r/atheism, r/negareddit, r/nostupidquestions, r/explainlikeimfive, r/the_donald, and r/anarcho_capitalism.

Some of these subreddits are known culprits of hate speech and behavior that violates Reddit’s community guidelines. The men’s rights subreddit, for example, was categorized in 2012 by the Southern Poverty Law Center, a civil rights nonprofit, as one of several online groups “dedicated to savaging feminists in particular and women.”

Conservative subreddit r/the_donald—one of the “darker corners” of Reddit, Kumar observed—has garnered a reputation for spreading conspiracy theories, shitposting, and inciting GamerGate tactics under the auspices of Donald Trump fandom.

Others, like r/nintendoswitch and r/hearthstone, are surprising to see on this list. It’s possible these communities just naturally foster heated debate among different subreddits. The study’s authors didn’t speculate about why these communities were so controversial.

The study’s authors created a deep learning model that can predict whether a cross-link is likely to instigate conflict. The LSTM (long short-term memory) model takes three categories of data as input: which user posts the link, which community the link is posted to, and what is written in the post itself. In addition to textual data, the model also incorporates social factors about users themselves—“How often a user participates in different Reddit communities, and how much they comment in communities X, Y, and Z,” Kumar said—which the authors believe makes their model a more sophisticated predictor of how a cross-link will mobilize people.
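The shape of those inputs can be sketched schematically: an LSTM encodes the post's text into a vector, which is then concatenated with user and community feature vectors before a classifier makes the prediction. The sketch below shows only that concatenation step, with made-up dimensions and random stand-in values; it is an assumption about the architecture's data flow, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the LSTM's encoding of the cross-link post's text.
text_embedding = rng.normal(size=16)
# Stand-in social features: e.g. how active the user is across communities.
user_features = np.array([0.8, 0.1, 0.05])
# Stand-in profile of the community the link is posted to.
community_features = rng.normal(size=8)

# The combined vector feeds a final classifier that predicts conflict.
model_input = np.concatenate([text_embedding, user_features, community_features])
print(model_input.shape)  # → (27,)
```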

Next week, Kumar and co-author Hamilton told me, the team will be at Reddit’s San Francisco office to present their findings. It’s important to note, however, that the study only examines behavior on a thread level, and not a community-wide level.

Reddit did not respond to my questions about the company being open to outside ideas, including whether it has any intention to utilize the Stanford researchers’ predictive model. In fact, a post from Reddit CEO Steve Huffman last week, in which he wrote that r/the_donald should be left to “fall apart from their own dysfunction,” suggests the company’s top leader prefers a hands-off approach to moderating Reddit’s communities.

Next, the authors plan to investigate how the organized downvoting of posts, called brigading (data that isn’t publicly available), contributes to intergroup conflict, among other questions.