Rise of Wikipedia and the coming of the bots

Wikipedia launched in 2001 from the ashes of expert-penned Nupedia. When Nupedia floundered, founders Jimmy Wales and Larry Sanger pivoted to a crowdsourced encyclopedia. Within four years, the English Wikipedia had more than 750,000 articles. No longer an obscure internet experiment, it had gone mainstream.

The increased attention brought a flood of new users with all of the attendant headaches: self-promotion, amateurish additions, and outright vandalism. Wikipedia’s shortcomings, both as an information source and as a self-organizing community, were becoming apparent. In the fall of 2006, Jimmy Wales gave a keynote speech calling on Wikipedians to focus on article quality over article quantity. The site apparently responded: over the next several months, the rate of new article creation slowed, while the culling of unworthy articles increased. Wikipedia was discovering how to manage itself.

Around the same time, it faced what was probably its first sustained campaign of malicious edits. Someone began blanking pages and replacing them with an image of Squidward Tentacles, the SpongeBob SquarePants character. Using open proxies, multiple user accounts, and possibly a bot, the Squidward Vandal bedeviled Wikipedians, at one point bragging via email, "I am a computer programmer and I know all the codes in the world." He or she also claimed to be a new editor who’d gone rogue after being accused of sabotage.

"When I look at these tools, I really think that they saved Wikipedia from a sad defeat at the hands of random people."

In response, four Wikipedians built what would become known as AntiVandalBot. As the name suggests, it was a first attempt at automated vandalism protection: using a relatively simple set of rules, it monitored recent edits and intervened accordingly. Obvious vandalism could be removed automatically, while borderline cases went to another program, VandalProof, for human intervention. Crude by today’s standards, AntiVandalBot nonetheless saved editors time and attention.
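A rough sketch of that two-tier flow, with a placeholder rule score and invented thresholds standing in for AntiVandalBot's actual logic, might look something like this:

```python
# Illustrative triage flow: watch recent edits, auto-revert obvious vandalism,
# and queue borderline cases for human review. Scoring and thresholds are
# placeholders, not AntiVandalBot's real rules.
AUTO_REVERT = 8      # score above this: revert without asking
NEEDS_REVIEW = 4     # score above this: hand off to a human review tool

def looks_like_vandalism(edit):
    """Placeholder rule-based score; a real bot checks many signals."""
    return 10 if edit["new_text"].strip() == "" else 0   # e.g. page blanking

def triage(recent_edits):
    for edit in recent_edits:
        score = looks_like_vandalism(edit)
        if score >= AUTO_REVERT:
            print(f"auto-reverting edit {edit['id']}")
        elif score >= NEEDS_REVIEW:
            print(f"queueing edit {edit['id']} for human review")

triage([{"id": 101, "new_text": ""}, {"id": 102, "new_text": "Added a source."}])
```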

It may even have saved the site. One study examined the probability that a typical Wikipedia visit between 2003 and 2006 showed a damaged article. While the chance was always minuscule, measured in thousandths, it had increased exponentially over just three years. Without the evolving anti-vandalism tools, that trend could have continued, and editors would simply have been overwhelmed by defacers. "When I look at these tools, I really think that they saved Wikipedia from a sad defeat at the hands of random people," says Aaron Halfaker, a PhD candidate and researcher working for the Wikimedia Foundation. By June 2006, anti-vandalism bots were widespread. (The Squidward Vandal was ultimately bested not by bots but by sleuthing editors; similar vandalism has periodically reappeared.)

In 2007 Jacobi Carter, then a high school student, looked at MartinBot, at that point the latest evolution of AntiVandalBot. He saw too many false positives (benign edits being reverted as vandalism) and too much real damage slipping through. He decided he could improve on it, coding a bot that scored edits against rules about profanity, grammatical correctness, personal attacks, and so on, along with other signals: vandals often removed a lot of information or blanked pages completely, while long-time editors were rarely vandals. By combining all of these rules, Carter’s program, Cluebot, became very effective. In its first two months of service it corrected over 21,000 instances of vandalism, and it ran almost continuously for the next three years.
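A minimal sketch of that kind of rule-based scoring could combine those signals like so. The word list, weights, and threshold below are made up for illustration; they are not Cluebot's real values.

```python
# Illustrative heuristic edit scorer in the spirit of the original Cluebot.
# All rule names, weights, and thresholds are invented for this sketch.
import re

PROFANITY = {"darn", "heck"}          # stand-in word list
SCORE_THRESHOLD = 5                   # revert if score exceeds this

def score_edit(old_text, new_text, editor_edit_count):
    score = 0
    words = re.findall(r"[a-z']+", new_text.lower())
    # Rule: profanity added to the article
    score += 3 * sum(1 for w in words if w in PROFANITY)
    # Rule: large removal of content or full page blanking
    if len(new_text) < 0.2 * len(old_text):
        score += 4
    # Rule: long runs of repeated characters ("aaaaaaa") suggest vandalism
    if re.search(r"(.)\1{9,}", new_text):
        score += 2
    # Rule: long-time editors are rarely vandals, so discount their edits
    if editor_edit_count > 500:
        score -= 5
    return score

def is_vandalism(old_text, new_text, editor_edit_count):
    return score_edit(old_text, new_text, editor_edit_count) > SCORE_THRESHOLD
```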

By late 2010, though, Carter was ready to work on the next generation, appropriately titled Cluebot NG. Basic heuristics had served the original bot well, testifying to the predictability of most Wikipedia vandals. But the rules caught only the most obvious vandalism, and there was plenty of room for improvement. Carter and his friend (and friendly rival) Chris Breneman began working on a totally revamped Cluebot.

The original bot relied on simple heuristics; Cluebot NG would instead employ machine learning. That meant that instead of supplying basic rules and letting the software execute them, Carter and Breneman would provide a long list of edits classified as either constructive or vandalism, the same approach often used in spam filtering and intrusion detection. The key to successful machine learning is a large collection of data. Luckily, an anti-vandalism competition had already provided a dataset of about 60,000 human-categorized edits. From that, Cluebot could begin learning, finding patterns and correlations within the data.
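A generic supervised-learning setup of that sort looks roughly like the sketch below, using scikit-learn and a hypothetical CSV of labeled edits (the file name and column names are assumptions; this is not Cluebot NG's actual pipeline).

```python
# Sketch of the supervised-learning workflow: a labeled corpus of edits
# (constructive vs. vandalism) is split into training and test portions,
# a classifier is fit on the former and scored on the latter.
import csv
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

texts, labels = [], []
with open("labeled_edits.csv", newline="") as f:      # hypothetical dataset file
    for row in csv.DictReader(f):
        texts.append(row["added_text"])               # text introduced by the edit
        labels.append(1 if row["label"] == "vandalism" else 0)

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2)

vectorizer = TfidfVectorizer(lowercase=True)
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```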

To enable that learning, Breneman used an artificial neural network, a computational model that mimics the workings of organic brains. But, says Breneman, "You can't just throw a set of English words at a neural network and expect it to figure out a pattern." Preprocessing is required: coding examples into numbers the program can understand. That’s also an opportunity for another kind of processing, called Bayesian classification, which in this case compares the edited words to those in the database. If "science," for example, tends to appear in constructive edits, an unclassified edit containing "science" is more likely to be constructive too. That’s a simple example; Cluebot uses a number of Bayesian classifications, all of which are fed into the neural network. There are about 300 total inputs leading to a single output: the probability that a given edit is vandalism. Cluebot applies a final pass of filters, checking, for example, whether the page has already been reverted or the user is on a whitelist, before taking any action.
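A toy version of that pipeline might look like the following, with made-up word counts, three stand-in features instead of roughly 300, and a single sigmoid unit standing in for the trained network.

```python
# Sketch of the preprocessing idea: a naive Bayesian word score (how
# vandalism-like the added words look, given counts from labeled edits)
# becomes one numeric input among many fed to a small neural network.
# Counts, feature names, weights, and network shape are invented.
import math

# Hypothetical word counts gathered from the labeled training edits.
word_counts = {
    "science": {"constructive": 40, "vandalism": 2},
    "stupid":  {"constructive": 1,  "vandalism": 30},
}
totals = {"constructive": 50_000, "vandalism": 10_000}

def bayes_word_score(words):
    """Log-odds that the added words came from a vandalism edit."""
    log_odds = math.log(totals["vandalism"] / totals["constructive"])
    for w in words:
        c = word_counts.get(w, {"constructive": 1, "vandalism": 1})  # smoothing
        log_odds += math.log((c["vandalism"] / totals["vandalism"]) /
                             (c["constructive"] / totals["constructive"]))
    return log_odds

def features(edit):
    # The real bot has on the order of 300 such inputs; three stand in here.
    return [
        bayes_word_score(edit["added_words"]),
        edit["chars_removed"] / max(edit["page_length"], 1),
        1.0 if edit["editor_is_anonymous"] else 0.0,
    ]

def network_probability(x, weights, bias):
    """A single sigmoid unit standing in for the trained neural network."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 / (1 + math.exp(-z))

edit = {"added_words": ["stupid"], "chars_removed": 900,
        "page_length": 1000, "editor_is_anonymous": True}
p = network_probability(features(edit), weights=[0.8, 1.5, 0.7], bias=-1.0)
print(f"probability of vandalism: {p:.2f}")
```

In the real bot the weights come from training on the labeled dataset rather than being set by hand, and the single output probability is what the final filter pass acts on.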

It patrols 24/7, can execute more than 9,000 edits per minute, and never sleeps or lets its attention flag

Compared with previous incarnations, Cluebot NG is effective, controllable, and reasonably adaptive. One worry for Wikipedians is a high rate of false positives: good-faith edits wrongly flagged as vandalism. Being unfairly chastised for vandalism, as the Squidward Vandal claimed to have been, could turn off new editors before they have a chance to understand Wikipedia’s myriad and Byzantine rules. Cluebot allows its administrator to set the rate of false positives, though that rate can never effectively be zero. "Yes, it does get false positives," says Breneman, "but it's much better than any previous bot."
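One plausible way to expose such a knob, sketched here under the assumption that the bot can score a held-out set of known-good edits to calibrate against (the numbers are invented), is to derive the decision cutoff from a target false-positive rate:

```python
# Sketch: pick the score cutoff so that at most a chosen fraction of
# known-good edits would be flagged. Validation scores are made up.
def threshold_for_fp_rate(good_edit_scores, target_fp_rate=0.001):
    """Return the score cutoff that flags at most target_fp_rate of good edits."""
    ranked = sorted(good_edit_scores)
    # Index of the score below which (1 - target_fp_rate) of good edits fall.
    cut = int(len(ranked) * (1 - target_fp_rate))
    cut = min(cut, len(ranked) - 1)
    return ranked[cut]

good_scores = [0.01, 0.02, 0.05, 0.10, 0.30, 0.60]   # made-up validation scores
print(threshold_for_fp_rate(good_scores, target_fp_rate=0.2))
```

Lowering the target rate pushes the cutoff higher, so the bot reverts less but misses more vandalism, which is the trade-off Breneman describes.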

It patrols 24/7, never sleeping or letting its attention flag. It can execute more than 9,000 edits per minute, though it never has to approach that limit. Since 2010 it’s run almost constantly, rolling back thousands of bad edits a day; in early 2013 it topped 2 million edits. One study showed that when Cluebot NG was not operating, the time to revert vandalism nearly doubled. Malicious edits were still found, by humans, but it took almost twice as long.

That’s what Cluebot does: like all bots, it makes work more efficient. But one Slashdotter questioned whether bots fit the fundamental ethos of Wikipedia as a community-edited project. After arguing that vandalism is a subjective judgment not reducible to mathematical formulae, beakerMeep wrote, "Editing bots are wrong for Wikipedia, and if they allow it they are letting go of their vision of community participation in favor of the visions (or delusions) of grand technological solutions." Yet from a practical point of view, it’s hard to imagine today’s Wikipedia surviving without bots.

Of course, there are some vandals that only a human can catch.