Riot Games founders and League of Legends creators Brandon Beck and Marc Merrill have encountered bad behavior in massively multiplayer online games since the days of Ultima Online and EverQuest. In all that time, the typical moderator response to the all-too-common racial epithets, homophobic remarks, and bullying that borders on psychological abuse in MMOs has been to simply ban the players and move on. League of Legends could have afforded to go the same route, bleeding off a few bad apples from its 12 million daily players and 32 million monthly active players (as of late 2012) without affecting the bottom line.

But Beck and Merrill decided that simply banning toxic players wasn’t an acceptable solution for their game. Riot Games began experimenting with more constructive modes of player management through a formal player behavior initiative that actually conducts controlled experiments on its player base to see what helps reduce bad behavior. The results of that initiative have been shared at a lecture at the Massachusetts Institute of Technology and on panels at the Penny Arcade Expo East and the Game Developers Conference.

Prior to the launch of the formal initiative, Riot introduced “the Tribunal” to League of Legends in May 2011. The Tribunal is a community-based court system where the defendants are players who have accumulated a large number of reports from other players. League players can log in to the Tribunal and see the cases created against those players, viewing evidence in the form of sample chat logs and commentary from the players who filed the reports.

Cases in the Tribunal were evaluated independently by both player juries and staff from Riot Player Support. In over a year’s worth of cases, Riot found that the community verdict agreed with the decision of the staff moderators 80 percent of the time. The other 20 percent of the time, the players were more lenient than Riot’s staff would have been (players were never harsher than the staffers).

Riot’s takeaway from the Tribunal experiment was that League players were not only unwilling to put up with toxic behavior in their community but also willing to be active participants in addressing the problem. This success inspired Riot to assemble a team of staffers to make up its formal player behavior initiative, launched just over a year ago.

Jeffrey Lin holds a PhD in Cognitive Neuroscience from the University of Washington. Before joining Riot’s player behavior team and eventually becoming lead designer of Social Systems, Lin worked at Valve Software with experimental psychologist Mike Ambinder, conducting research on games like Left 4 Dead 2 and DOTA 2. The other founding members of the player behavior team were Renjie Li, who holds a PhD in Brain and Cognitive Sciences from the University of Rochester, and Davin Pavlas, who holds a PhD in Human Factors Psychology from the University of Central Florida.

All three doctors are hardcore gamers, a prerequisite for the core team members. “A big part of Riot Games in general is we want to be the most player-focused game company in the world,” Riot producer T. Carl Kwoh told Ars Technica. “Part of that player focus is really understanding that experience and living and breathing that experience.”

Before their experiments could go forward, the team had to establish a baseline for what constituted bad behavior in player chat rooms. This meant hand-coding thousands of chat logs and designating each line as positive, neutral, or negative. “Going through that exercise once has provided us with good data that we can rely on as far as intuition goes,” Kwoh said. The player behavior team can now categorize the nature of chat logs quickly.
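For a sense of what that line-level coding looks like in practice, here is a minimal Python sketch built on tiny, hypothetical word lists. Riot has not published its categories or vocabulary, so everything below is illustrative:

```python
# A minimal sketch of line-level chat coding. The word lists are
# hypothetical placeholders, not Riot's actual data.
POSITIVE_WORDS = {"gg", "wp", "nice", "thanks"}   # illustrative only
NEGATIVE_WORDS = {"noob", "report", "uninstall"}  # illustrative only

def code_line(line):
    """Label a single chat line as positive, negative, or neutral."""
    words = set(line.lower().split())
    if words & NEGATIVE_WORDS:
        return "negative"
    if words & POSITIVE_WORDS:
        return "positive"
    return "neutral"

log = ["gg wp everyone", "report mid, total noob", "heading bot lane"]
print([code_line(line) for line in log])
# ['positive', 'negative', 'neutral']
```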

With this hand-coding system in place, the Riot team conducted a little experiment with its player base: it turned off cross-team chat, the feature that lets players broadcast a message to everyone on other teams, by default. Players could still turn the feature on, but they had to actively go into the settings to do so.

Riot then compared the quality of chat logs from the week before and the week after the switch and noted a swing of more than 30 percent from negatively coded messages to positive ones. And it wasn’t just because fewer people were chatting across teams, either; overall usage of the feature stayed about the same, even after the switch.
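The published figure lends itself to a simple back-of-the-envelope check. The sketch below uses made-up message counts (the real analysis coded a week of chat on each side of the change) to show how a relative drop in negative chat might be computed:

```python
# Made-up counts of coded messages; only the arithmetic is the point.
def negative_share(counts):
    """Share of coded messages that were negative."""
    return counts["negative"] / sum(counts.values())

before = {"positive": 5_000, "neutral": 9_000, "negative": 6_000}
after  = {"positive": 7_000, "neutral": 9_000, "negative": 4_000}

drop = 1 - negative_share(after) / negative_share(before)
print(f"relative drop in negative share: {drop:.0%}")  # 33%
```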

Riot also conducted a player-centric analysis of its wealth of chat log data to try to generate an automatic profile that could separate good players from bad players. “One of the cooler things we did is we took our whole player base and categorized the players who are known for toxic behaviors, all the players who are known for positive behaviors, and we can cross-correlate all the words that both populations use,” Lin said. “Any words in common we filter out of the dictionaries. What you’re left with is a dictionary for all the words the bad players use that good players don’t use.”
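The technique Lin describes boils down to a set difference over each population’s vocabulary. Here’s a minimal sketch, with two-line placeholder corpora standing in for Riot’s full chat logs:

```python
from collections import Counter

# Placeholder corpora; Riot's analysis ran over its entire player base.
toxic_player_logs = ["you are trash uninstall", "trash team report them"]
positive_player_logs = ["nice gank alex", "good job team well played"]

def word_counts(logs):
    """Count word occurrences across a list of chat lines."""
    return Counter(word for line in logs for word in line.lower().split())

toxic_words = word_counts(toxic_player_logs)
positive_words = word_counts(positive_player_logs)

# Filter out vocabulary the two populations share; what remains is a
# dictionary of words that distinguish each group.
toxic_dictionary = set(toxic_words) - set(positive_words)
positive_dictionary = set(positive_words) - set(toxic_words)
print(sorted(toxic_dictionary))
```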

The dictionary of common words for bad players falls along depressingly predictable lines, with heavy weighting towards racial and homophobic slurs. The dictionary for good players, however, turned up an inspiring data point. “The first 500 or so words were real life names,” Lin said. “The best experiences are when you trust the other person because they’re a real life friend.”

This kind of chat analysis also makes it possible to predict which players are likely to run into problems in the game even before they show truly bad behavior or generate complaints. “It turns out that if you use the dictionaries, you can predict if a player will show bad behavior with up to 80 percent accuracy from just one game’s chat log,” Lin said. Riot says it doesn’t intend to use this kind of predictive modeling to take pre-emptive action against players who may be heading toward undesirable behavior; doing so would contradict the spirit of the company’s efforts.
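A dictionary like that lends itself to a simple scoring pass over a single game’s chat log. The sketch below uses placeholder words and an arbitrary cutoff, since Riot hasn’t published its actual model:

```python
# Hypothetical toxic dictionary, like the one built above.
toxic_dictionary = {"trash", "uninstall", "report"}  # illustrative only

def toxic_score(chat_log):
    """Fraction of a player's words that hit the toxic dictionary."""
    words = [w for line in chat_log for w in line.lower().split()]
    hits = sum(w in toxic_dictionary for w in words)
    return hits / len(words) if words else 0.0

one_game = ["mid is trash", "report him after", "gg i guess"]
flagged = toxic_score(one_game) > 0.05  # arbitrary illustrative cutoff
print(flagged)  # True: a signal to watch, not grounds for punishment
```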

“The core philosophy of the player behavior team is to work with the players and collaborate with the players. We want to build systems and tools to allow the players to hold each other accountable or to provide more social norms to online society,” Lin said.

The team’s next experiment was to add “Reform Cards” to the Tribunal system. Previously, the Tribunal could deliver bans of various lengths, but banned players wouldn’t know why they were punished. Reform Cards show punished players the same chat logs and other information presented during their Tribunal case, along with statistics on how strongly the player judges agreed on the recommended punishment.