The biggest story in UK Internet circles over the past couple of days has been the censorship of Wikipedia by the UK’s Internet Watch Foundation (IWF). For those of you not entirely sure what it’s all about – after a tipoff the IWF blacklisted the Wikipedia article (NSFW) on the Scorpions’ 1976 album “Virgin Killer”, thanking the IWF for bringing it to his attention). The album’s cover features a picture of a naked prepubescent girl (with a fake lens crack obscuring her genitalia); it is not only freely available all over the Internet (including on Amazon) but has never been classified as child pornography, nor has anyone been prosecuted for it.

The IWF’s blacklist is used by many major ISPs without question, and so they added wikipedia.org to a list of sites routed through their transparent proxy servers, normally used to deal with traffic aimed at child pornography sites in Russia and other poorly-regulated areas of the world. The transparent proxy works in a roundabout way. With my provider, O2, rather than blocking off the entire site, it scanned all requests to wikipedia.org; any that weren’t to http://en.wikipedia.org/wiki/Virgin_Killer it OK’ed, but for any request to the offending page it returned a fake 404 message.

This was ridiculously dumb, as it did not block the image directly, but only the container webpage – because the filter was over-precise, in practice the image was still accessible through a variety of other methods, though you had to hack a little. Not only was it dumb, it was also oppressive – by blocking the text and discussion around the image rather than the image itself, they censored all discussion of its legality, the controversy around it, and accounts of the band’s reaction to it and why it was eventually pulled and replaced. Simply, this is censorship of the most despotically stupid kind.
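The two-stage filtering described above can be sketched roughly as follows. This is only an illustration of the behaviour I observed, not Cleanfeed’s actual code: the function names and the two sets are hypothetical, and the real blacklist is secret.

```python
from urllib.parse import urlsplit

# Hypothetical data for illustration; the real IWF list is not public.
PROXIED_HOSTS = {"wikipedia.org"}
BLOCKED_URLS = {"http://en.wikipedia.org/wiki/Virgin_Killer"}


def is_proxied(host: str) -> bool:
    """Stage 1: is traffic for this host diverted through the proxy?"""
    return any(host == h or host.endswith("." + h) for h in PROXIED_HOSTS)


def handle_request(url: str) -> str:
    """Return 'direct', 'proxy-pass' or 'fake-404' for a requested URL."""
    host = urlsplit(url).hostname or ""
    if not is_proxied(host):
        return "direct"        # never touches the proxy
    if url in BLOCKED_URLS:    # Stage 2: exact match on the full URL
        return "fake-404"
    return "proxy-pass"        # inspected, then let through
```

Because stage 2 is an exact string match, any other address for the same content is passed. For example, `handle_request("http://en.wikipedia.org/w/index.php?title=Virgin_Killer")` returns `"proxy-pass"` in this sketch, which is the over-precision problem in miniature.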

The transparent proxy also had an unfortunate technical consequence – it limited 95% of UK access to wikipedia.org to just a few IPs. With anonymous edits this meant it was impossible to tell who was a vandal and who was not, so well-meaning anonymous editors were summarily blocked, and prevented from creating accounts that could allow them to edit via a login. This had a terrible impact on the many contributions UK-based editors make to Wikipedia (disclosure: I am a Wikipedia administrator, but am a volunteer only and hold no post with the Wikimedia Foundation), and is something that usually only affects countries with strict Internet controls such as Singapore or Qatar.
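To see why a handful of shared IPs causes collateral damage, here is a toy model of IP-based blocking. The addresses and user names are made up, and a single proxy address stands in for the “few IPs”; it is a sketch of the effect, not of Wikipedia’s actual blocking code.

```python
# Hypothetical addresses, taken from documentation-only ranges.
PROXY_IP = "192.0.2.10"
USERS = {
    "vandal": "198.51.100.1",
    "good_editor_a": "198.51.100.2",
    "good_editor_b": "198.51.100.3",
}


def visible_ip(user_ip: str, via_proxy: bool) -> str:
    """The source address the website actually sees."""
    return PROXY_IP if via_proxy else user_ip


def locked_out(users: dict, vandal: str, via_proxy: bool) -> list:
    """Names of everyone who loses access when the vandal's visible IP is blocked."""
    blocked_ip = visible_ip(users[vandal], via_proxy)
    return sorted(
        name for name, ip in users.items()
        if visible_ip(ip, via_proxy) == blocked_ip
    )
```

With direct connections, `locked_out(USERS, "vandal", via_proxy=False)` only contains the vandal. With everyone funnelled through the proxy, the same block takes out all three users.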

Thankfully, after a considerable backlash, the block was lifted this evening, and from about 1930 I’ve been able to access the page (as can much of the UK by now, no doubt). In the meantime, thousands of people wondering what the fuss is about have surely looked at and downloaded the “Virgin Killer” artwork, thereby defeating the IWF’s original intention.

The IWF is a curious beast. It’s nominally an independent charity, but in practice acts with the blessing of the Home Office as an unaccountable pseudo-government censor for the Internet, against “potentially illegal online content”. This ranges from child pornography to incitement to racial hatred, though they generally concentrate on the former. Given its enormous impact, the office is surprisingly tiny. From The Guardian:

Normally the IWF, which is paid for by the EU and through a levy on the internet industry, works quietly away in its Cambridge offices. A team of four police-trained “analysts” plough through 35,000 URLs sent to them each year that are under suspicion of being obscene. That works out to an average of 700 per week, or 140 per working day, or 35 per working day per analyst – giving each an average workload for a seven-hour day of 5 URLs per hour. Typically about one-third of the URLs are deemed illegal.

So that’s four people (whose qualifications are not fully disclosed, nor how they were selected) who are responsible for what 95% of the UK online population cannot see, after a typical review of just 12 minutes. Their decisions are implemented through BT’s Cleanfeed system, used with almost blind obedience by ISPs. The blacklist cannot be legitimately seen or reviewed by anyone outside of the IWF (although ironically, Cleanfeed’s architecture may open a backdoor to the blacklist) and site administrators are not notified, so unless you find your own site there by accident (as in the Wikipedia case) you’ll never know. And the appeals process is conducted via the IWF, without recourse to an independent authority or means of oversight. The Wikipedia appeal, incidentally, was the first to be brought in the IWF’s history.
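The Guardian’s arithmetic, and the 12-minute figure that follows from it, can be checked in a few lines. The one assumption is a 50-week working year, which the quote doesn’t state but which is what reproduces its 700-per-week figure.

```python
URLS_PER_YEAR = 35_000
ANALYSTS = 4
WORKING_WEEKS = 50   # assumption: not stated in the quote, but yields 700/week
DAYS_PER_WEEK = 5
HOURS_PER_DAY = 7

per_week = URLS_PER_YEAR / WORKING_WEEKS             # whole team, per week
per_day = per_week / DAYS_PER_WEEK                   # whole team, per working day
per_analyst_day = per_day / ANALYSTS                 # one analyst, per day
per_analyst_hour = per_analyst_day / HOURS_PER_DAY   # one analyst, per hour
minutes_per_url = 60 / per_analyst_hour              # average review time

print(per_week, per_day, per_analyst_day, per_analyst_hour, minutes_per_url)
```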

The IWF/Cleanfeed system of judge, jury & executioner is obviously broken, both technologically and socially. And in an ideal world a system such as the IWF would not exist, but it’s clear that it’s either that or full-on government regulation. Which to be honest, probably wouldn’t look that different from the current system anyway. So in the ugly world we live in, how can we make the current sociotechnical system of censorship less broken and as minimal as possible?

For starters, we have no way of knowing whether what we are trying to see is actually being censored. The blank “404 error page” for blocked sites breaks HTTP status code conventions – although there is no HTTP code for censorship. Some sites such as Google are good enough to mention when their search results have been censored thanks to the IWF or copyright takedowns under the DMCA, in co-operation with the Chilling Effects project. Why was there such reluctance to provide a notice saying something to the effect of “We have been advised by the IWF that this page contains illegal material and has been blocked”? Was it because telling people that what they’re reading is subject to censorship would have looked bad? Surely, however, that’s not as much of a PR disaster as this?
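An honest block page is not hard to produce. The sketch below builds a raw HTTP response carrying the notice suggested above. The function name is hypothetical, and 403 Forbidden is my assumption for the nearest existing status; a dedicated code for this, 451, was only standardised (in RFC 7725) years after this post was written.

```python
def block_notice(status: int = 403, reason: str = "Forbidden") -> bytes:
    """Build a raw HTTP/1.1 response that admits the page was blocked."""
    body = (
        "We have been advised by the IWF that this page contains "
        "illegal material and has been blocked.\n"
    ).encode("utf-8")
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body
```

Calling `block_notice(451, "Unavailable For Legal Reasons")` gives the same notice under the later status code. Either way, the reader is told what happened instead of being shown a page that pretends the article doesn’t exist.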

As well as this, Cleanfeed broke Wikipedia by munging IPs and forcing people through a few proxies – not only damaging Wikipedia but any other site that uses source IP checks as part of its efforts to prevent session or identity hijacking. It also failed to filter alternative URLs, or to combat one of Wikipedia’s most pervasive legacies – the free licensing which means articles can also be reproduced on mirrors such as wapedia or Answers.com. In short, it was horribly ineffective while breaking a lot of conventions and testing the goodwill of one of the largest and most open-minded online projects around.
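The session problem can be illustrated with a toy site that ties each login to the source IP it first saw. Everything here is hypothetical, including the assumption that the proxy pool hands out addresses round-robin per request; real proxies may balance traffic differently.

```python
import itertools

# Hypothetical proxy addresses from a documentation-only range.
PROXY_POOL = ["192.0.2.10", "192.0.2.11"]


class Site:
    """A site that binds each session to the IP it logged in from."""

    def __init__(self):
        self.sessions = {}

    def login(self, session_id: str, source_ip: str) -> None:
        self.sessions[session_id] = source_ip

    def request(self, session_id: str, source_ip: str) -> bool:
        """Accept the request only if it comes from the login IP."""
        return self.sessions.get(session_id) == source_ip


def simulate(n_requests: int) -> list:
    """Log in once, then send n requests through the rotating proxy pool."""
    site = Site()
    ips = itertools.cycle(PROXY_POOL)
    site.login("abc", next(ips))
    return [site.request("abc", next(ips)) for _ in range(n_requests)]
```

In this model a perfectly legitimate user gets rejected whenever a request leaves via a different proxy from the one they logged in through, which looks to the site exactly like a hijacked session.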

On the more social side, the IWF’s approach is broken as well. The blacklist is secret – for at least one good reason – it prevents paedophiles from easily finding a whole new list of sites to bookmark. But this secrecy creates problems of accountability. It’s a tricky problem – I for one see little gain in opening the list to all, but there’s nothing stopping them publicly releasing one-way hashes of the URLs, say, so researchers and webmasters can check to see if any given site is on it.
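A minimal sketch of the hashed-list idea, assuming SHA-256 as the one-way hash and whitespace trimming as the only normalisation; both choices are mine, and the URLs are placeholders.

```python
import hashlib


def url_digest(url: str) -> str:
    """One-way hash of a URL (hex SHA-256 of the trimmed string)."""
    return hashlib.sha256(url.strip().encode("utf-8")).hexdigest()


def publish(secret_urls: list) -> set:
    """What the list owner would release: digests only, no URLs."""
    return {url_digest(u) for u in secret_urls}


def is_listed(url: str, published: set) -> bool:
    """What a webmaster would run against the published digests."""
    return url_digest(url) in published


# Placeholder entries standing in for the secret list.
published = publish(["http://example.org/blocked-page"])
print(is_listed("http://example.org/blocked-page", published))
print(is_listed("http://example.org/another-page", published))
```

A webmaster can confirm whether a URL they already know is listed, but the published set can’t be read as a directory of sites. The catch is that matching is exact, so publisher and checker would need to agree on how URLs are written.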

The blacklist should definitely be open to an oversight committee independent of the IWF – specialist police officers, civil servants and lawyers specialising in cases such as this – and these same people, rather than the IWF, should handle appeals to be taken off the list as well. An independent board of technologists, meanwhile, should be tasked with overseeing implementation, testing its robustness independently, and making sure that blocking and site redirection are properly dealt with according to established RFCs, and not by fake 404 messages.

As for the people who make the decisions, four people in an office with some police training doesn’t sound like enough given the impact of the censorship they are inflicting. A review is needed of how many staff there are, what level of training they get, and whether they need to be supplemented by, say, senior police officers. A clearer mandate on the material they should be looking for and blocking is required – the “potentially illegal” remit is too fuzzy and the censorship of text surrounding images utterly misguided.

This fuzzy remit should be especially borne in mind given their other work on “incitement to racial hatred”, and from January 2009, what has been termed “extreme” pornography. If the IWF are going to be this clueless about content on Wikipedia then I have no confidence in their ability to deal with these new (and untested) laws when they come into force.

Finally, a note: the IWF are not evil. Child pornography isn’t just illegal but morally wrong, and their intentions are noble, even if we know what kind of road those intentions can pave. Preventing the spread of child pornography online, particularly when the perpetrators and distributors are beyond usual remits, is neither an easy nor a thankful task. But letting them act unilaterally can lead to damaging consequences, as we’ve just seen. Good intentions must be backed up with independent cynical controls, or they are no use at all.

Further reading (updated 10/12): The Open Rights Group has a good post summarising the affair, with some questions of their own, and there are a couple of good posts over at Septicisle – which also points out that the IWF is the one that has pushed for the “Girls Aloud” case to be prosecuted, the first obscenity case covering fictional text in over two decades, which opens another can of worms.