How pranks, hoaxes and manipulation undermine the reliability of Wikipedia

By Andreas Kolbe

On Reddit last week, an anonymous user said, “It’s time for the truth to come out.” The post, made in the AdviceAnimals subreddit and garnering over 2,700 upvotes, linked to the following memegenerator image:

Me and my friend used to make fun of an Arabic classmate called Azid. We edited the Wikipedia page for Chicken Korma so that his name would appear as an alternate name for the dish or an optional ingredient. Four years on, it has been cited by many cooking sites and publications.

It turned out that it wasn’t quite four years ago that the edit was made, but otherwise, the poster’s claims were found to be correct. A Wikipedian checking the history of the Korma article in the world’s foremost reference source traced the first insertion of the term Azid to this edit made on May 8, 2012. The change attracted no attention from other volunteer editors whatsoever, and there was no further activity in the article until over a month later.

The rise of Azid

Over time, editors apparently innocent of any involvement in the joke ensured that the spurious term Azid made it into the lead sentence of the article, where it was listed as a synonym for Korma. The edit that moved the term into the lead section was made in August 2013 by a ten-year veteran of Wikipedia, an editor who has made close to 20,000 contributions to the site. In this edit, the Wikipedian added etymological detail about the word “Korma” to the article, citing no less an authority than the Oxford English Dictionary, as well as a reference for the term Azid: a post on an amateur cookery blog named namitaskitchen.com, which had copied the vandalised paragraph from Wikipedia.

This is Wikipedia in a nutshell: genuine research mixed with completely unreliable information in such a way that looking at any Wikipedia article the reader never knows what is correct and what is made up. It’s the fabled wisdom of the crowds!

The process whereby spurious information added to Wikipedia is blindly copied by other publications that are then added to the Wikipedia article as sources, cementing the spurious information in place (after all, it now has a footnote!), actually has a name: it’s called citogenesis. The term is based on an amusing – or chilling, if you care at all about education and the value of accurate knowledge – xkcd cartoon of that name.

The Reddit poster further remarked, in a discussion comment on Reddit,

For the curious, just google phrases such as ‘creamy azid’, ‘roasted azid’, ‘korma (azid)’ – etc.

Even if the Wikipedia page is corrected, the alteration will live on, cited by many cooking sites, blogs and possibly a rushed recipe book or two.

Indeed, there are now any number of websites that link the term Azid to Korma dishes, both as an alternative name and as an ingredient. And that will no doubt remain so for some time to come. (Namitaskitchen.com has since removed the erroneous information, but it can still be viewed in the Internet archive’s version of the page as it stood in 2013.)

In typical Wikipedia style, the Korma article contained a mention of the prank for a few hours, but this has now been deleted, as has the term Azid itself (although some commenters on Reddit and Wikipedia inevitably wondered whether Azid was now actually a correct alternative term for Korma, given how widespread the word had become thanks to Wikipedia).

Who invented the hair straightener?

Just how long misinformation originating from a corrupted Wikipedia article can persist online is well illustrated by another, similar case that occurred a few years ago. By sheer coincidence, this case also concerned a matter primarily of interest to people who do not fall into Wikipedia’s dominant demographic, i.e. young white males of European descent.

The sequence of edits that led to the rise of this particular bit of misinformation is itself instructive about the culture of Wikipedia.

On 14 August 2006, an editor who did not use a Wikipedia account, and whose posts are identified by a Washington DC IP address, edited the Wikipedia article on the hair iron, changing the name of the inventor of the hair straightener from Madam C. J. Walker to Erica Feldman. Next, the same IP address added “the poopface” after the Erica Feldman name. Then another IP (probably a classmate who’d just been told about this on Facebook!) turned the name into “Yo Mama” and added some more scatological humour, followed by “HI MARTA!!!!!!!”. There followed some image vandalism.

Then along came the first vandalism reverter. She or he took out “HI MARTA!!!!!!!”, but left the article with the Erica Feldman name and “the poopface” in place. Five days passed, during which questionable edits were made to other parts of the article. Then, another IP editor finally removed “the poopface”, and the fate of Madam C. J. Walker was sealed. One of history’s most notable African American businesswomen had been written out of history by a Wikipedian prankster – because, all obvious traces of vandalism having been removed, the article now read like an authoritative text:

“The first hair straightener was invented by Erica Feldman using chemicals of scalp preparation and lotions to straighten the hair. Unfortunately, using this invention soon led to damaged, scorched hair from all the chemicals that were added …”

It’s a perfect illustration of Kozierok’s First Law:

“The apparent accuracy of a Wikipedia article is inversely proportional to the depth of the reader’s knowledge of the topic.”

At any rate, from then on, the internet believed that Erica Feldman invented the hair straightener. At one point, the information was even quoted in a book on hair care (the title is no longer listed in Google Books).

A year later, the kids had another go, and changed Erica Feldman into a Mr Gutgold. Again no one noticed or cared. And thus Mr Gutgold too came to be widely credited on the Internet with having invented the hair straightener. While this particular hoax was discovered and fixed on Wikipedia some five years ago, to this day, popular internet sites such as ehow.com credit Erica Feldman or Mr Gutgold (in this case, both of them!) with this particular invention.

Virtual Unreality

A book published last month, Virtual Unreality – Just because the internet told you, how do you know it’s true?, by New York University Professor of Journalism Charles Seife, describes in more depth and with a wealth of examples how easy it is for spurious information on the internet to capture people’s imagination and propagate much like a viral infection. Using epidemiology’s R0 index as a simile (the higher the R0, the faster a disease will spread), Seife writes,

Digital information has an unbelievably high R0. […] Once it escapes into the wild, it’s all but impossible to stop its spread. This is wonderful, so long as the information is correct and useful. But if it’s wrong, if it alters our brains for the worse, if it makes us make mistakes and think incorrect things, it’s a scourge.

Bad information is a disease that affects all of us – a disease that has become unbelievably potent thanks to the digital revolution.

A recent case, covered in May 2014 in The New Yorker, provides a good illustration:

In July of 2008, Dylan Breves, then a seventeen-year-old student from New York City, made a mundane edit to a Wikipedia entry on the coati. The coati, a member of the raccoon family, is “also known as … a Brazilian aardvark,” Breves wrote. He did not cite a source for this nickname, and with good reason: he had invented it. He and his brother had spotted several coatis while on a trip to the Iguaçu Falls, in Brazil, where they had mistaken them for actual aardvarks.

“I don’t necessarily like being wrong about things,” Breves told me. “So, sort of as a joke, I slipped in the ‘also known as the Brazilian aardvark’ and then forgot about it for awhile.”

Adding a private gag to a public Wikipedia page is the kind of minor vandalism that regularly takes place on the crowdsourced Web site. When Breves made the change, he assumed that someone would catch the lack of citation and flag his edit for removal.

Over time, though, something strange happened: the nickname caught on. About a year later, Breves searched online for the phrase “Brazilian aardvark.” Not only was his edit still on Wikipedia, but his search brought up hundreds of other Web sites about coatis. References to the so-called “Brazilian aardvark” have since appeared in the Independent, the Daily Mail, and even in a book published by the University of Chicago. Breves’s role in all this seems clear: a Google search for “Brazilian aardvark” will return no mentions before Breves made the edit, in July, 2008. The claim that the coati is known as a Brazilian aardvark still remains on its Wikipedia entry, only now it cites a 2010 article in the Telegraph as evidence.

After publication of the piece in The New Yorker, a Wikipedian removed the “Brazilian aardvark” moniker from the article. Following the familiar pattern, this removal was followed by an edit war about whether the Wikipedia prank should be mentioned in the article (at the time of writing, it is not).

How Wikipedia helps the spread of knowledge

Like the vast majority of these cases, the creation of an alternative history of the hair straightener never made it into the press. But there is a steadily growing corpus of documented cases of judges, doctors, politicians, writers and journalists being embarrassed by having been found to rely on Wikipedia. And bearing in mind that most of Wikipedia’s hoaxes and pranks lie unsuspected and undetected, buried in its vast bulk of crowdsourced content, this kind of public shaming is virtually the only way some small proportion of these errors are stopped from endlessly propagating further.

When composer Maurice Jarre died in 2009, the world’s press were fooled into repeating a made-up and somewhat cheesy quote which a sociology student by the name of Shane Fitzgerald had added to Jarre’s Wikipedia article. As the Associated Press reported later:

The sociology major’s made-up quote – which he added to the Wikipedia page of Maurice Jarre hours after the French composer’s death March 28 – flew straight on to dozens of U.S. blogs and newspaper Web sites in Britain, Australia and India.

They used the fabricated material, Fitzgerald said, even though administrators at the free online encyclopedia quickly caught the quote’s lack of attribution and removed it, but not quickly enough to keep some journalists from cutting and pasting it first.

A full month went by and nobody noticed the editorial fraud. So Fitzgerald told several media outlets in an e-mail and the corrections began.

“I was really shocked at the results from the experiment,” Fitzgerald, 22, said Monday in an interview a week after one newspaper at fault, The Guardian of Britain, became the first to admit its obituarist lifted material straight from Wikipedia.

“I am 100 percent convinced that if I hadn’t come forward, that quote would have gone down in history as something Maurice Jarre said, instead of something I made up,” he said. “It would have become another example where, once anything is printed enough times in the media without challenge, it becomes fact.”

In another case reported in October 2012, the Asian Football Confederation was forced to apologise to the United Arab Emirates football team after one of their writers had referred to the team as the “Sand Monkeys” – a racist slur that had been added to the Wikipedia page on the team as its purported nickname.

In December 2012, Lord Justice Leveson suffered major embarrassment when it was found that his high-profile report examining the culture, practice and ethics of the UK press named an unknown Californian student as one of the founders of the Independent newspaper. The source of the error was, of course, Wikipedia, where the misinformation had stood undiscovered for over a year.

The Glucojasinogen case illustrates that even lazy medical writers are not immune to the Wikipedia bug.

Yet despite copious evidence to the contrary, there is still no shortage of tech writers repeating the old adage that a 2005 Nature study proved that Wikipedia is as reliable as Britannica. The Nature piece in question was no rigorous scientific study, but a piece of journalism, and it focused on a very specific subset of quite specialised science articles. As the world found in early 2013, when the Bicholim conflict hoax attracted global amusement, not even arcane topics are free from interference by Wikipedia hoaxers.

There is no man behind the curtain

As mentioned in last week’s blog post, the Wikimedia Foundation’s annual revenue from donations now exceeds $50 million. Little to none of this money is spent on measuring or improving the quality of the volunteer-generated content of Wikipedia. Most of the money goes to fund additional jobs for software engineers and programmers, with many of the hires being Wikipedia volunteers apparently selected not for their software engineering expertise, but for their loyalty to the site’s leadership.

Going by spending priorities, Wikipedia looks like a social media company rather than the worldwide educational project it purports to be, with software engineering dominating expenditure. The governing mentality around Wikipedia content is much the same; when pointed to errors in Wikipedia, many contributors will shrug their shoulders and say, “So Fix It! Anyone can edit Wikipedia. If you see an error, just put it right!” The focus is on the social aspect of participation, rather than the quality of the product.

The very set-up of Wikipedia discourages and avoids responsibility, among contributors as much as in the Wikimedia Foundation. The vast majority of contributors hide all or some of their activity behind pseudonymous accounts. At heart, it all seems a bit like a game, a viewpoint that is betrayed in references to IRL (“in real life”) in Wikipedians’ discourse. Wikipedia, you see, for all its impact on the real world, is not part of real life to many of them.

Yet Wikipedia’s real-world impact is apparent from governments’ interest in manipulating the site. A year and a half ago, Wikipediocracy reported at length on efforts by the authorities in Kazakhstan to become actively involved in the development of the Kazakh language version of Wikipedia – efforts that were successful, and garnered the Kazakh Wikipedia accolades rather than condemnation from figurehead Jimmy Wales.

In 2013, the Croatian minister of education took the unprecedented step of warning the country’s pupils and students to avoid relying on the Croatian Wikipedia, as much of its content, especially on the country’s history, had been falsified by a clique of right-wing extremists.

And in the wake of the recent launch of several Twitter bots listing Wikipedia edits by various government IP addresses around the world, the Telegraph reported last week that a Russian government IP address had been found to have edited Wikipedia content related to the crash of Malaysia Airlines flight MH17, while RT (formerly Russia Today) in turn reported that someone using the IP address of the US House of Representatives had edited several Wikipedia articles on Russia. Wikipedia conveniently serves as a reputation-laundered propaganda machine.

Of course, only Wikipedia amateurs will edit through their naked IP address. Wikipedia pros use pseudonymous accounts named Rocketman12 or something like that in order to effectively hide and protect their identities, an effort that Wikipedia administrators will actively support them in. It’s part of the culture of the place.

For example, when it transpired a few years ago that a senior Wikipedia administrator and leading Wikipedia light known only as “Essjay” had misrepresented his qualifications to The New Yorker as well as to the entire Wikipedia community, passing himself off as a tenured professor of religion rather than the 24-year-old college drop-out he was, Jimmy Wales, the sole remaining co-founder of Wikipedia, could see no problem with such deception. He was happy for his Wikia for-profit company to hire Essjay and vigorously defended his protégé, saying in a video interview that is well worth viewing [see Editor’s note below],

“Even to this day I defend it. This is a young man who made a mistake. In the grand scheme of things what he did is pretty minor. Having a pseudonym, sort of fleshing it out with some traits, that’s really no big deal. I mean that’s part of online life.”

Truth does not seem to matter much in Wikipedia. It is leadership comments such as this, along with the way that Wikipedia serves as a key vector for the spread of misinformation and propaganda, that place the site squarely in the field of internet culture, rather than the field of genuine scholarship, education and enlightenment it flies on its marketing flag.

The Harvard Guide to Using Sources includes a stark warning about Wikipedia, along with documentation of yet another hoax that has become a permanent part of the internet’s knowledge of the world. Wikipedia is the world’s foremost reference source today, and it is free. It is also probably the single most unreliable foremost reference source humanity has ever had. I guess that is progress.

Editor’s Note (July 25, 2014): Shortly after publication of this blog post, the “Truth in Numbers” video containing the quoted Jimmy Wales interview was pulled from YouTube. At the time of writing, the video is still available at The Huffington Post for internet users in the US. For internet users elsewhere, the best we can offer is the Russian version of the film (with Russian voiceover added), currently available on YouTube, Vimeo and the Russian DailyTV platform. The relevant time code is around 15:00.

Image credits: memegenerator.net, Wikimedia Commons


