Ben Goldacre, The Guardian, Saturday 8 May 2010

Data matters. We use it to understand what has already happened in the world, and we use it to make decisions about what to do next. But in among the graphics and electoral cock-ups lies a terrible truth: a small army of amateur enthusiasts are doing a better job of collecting and disseminating basic political data than the state has managed.

Chris Taggart blogs at CountCulture and was baffled to discover that there is no central or open record of the results from local elections in the UK. If you go to the Electoral Commission’s website, they pass the buck to the BBC, where you can find seat numbers for each area, but no record of how many votes were cast for each candidate. Plymouth University holds an unofficial database of these results, and they pay people to type every single one of them in, painstakingly and by hand. After all that they charge for access, which is perfectly understandable. So for democracy, open analysis, and public record, it might as well not exist.

“Want to look back at how people voted in your local council elections over the past 10 years?” asks Chris: “Tough. Want to compare turnout between different areas, and different periods? No can do. Want an easy way to see how close the election was last time, and how much your vote might make a difference? Forget it.”

Like so many data problems, all that’s needed is a tiny tweak: all this information is known to someone, somewhere, and it’s all been typed in, several times over, in several places (local websites, newspapers, and so on). Chris is pushing a simple solution that is common throughout IT: a standard set of invisible tags on all local authority results webpages, so that the electoral results data can be consistently read and understood by computers, and collated for analysis by anyone who wants it. It costs nothing, it’s already compulsory for public consultation data, and Chris is making genuine headway, pushing his simple idea to solve a huge problem, not because it’s his job in some dismal quango, but for a laugh.
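
To make the idea of invisible tags concrete, here is a minimal sketch in Python of how a computer might read a results page once it carries them. Everything specific in it is an assumption made up for illustration: the attribute names (`data-election-candidate` and friends), the sample page, the candidates and the helper functions are hypothetical, and none of it is the actual standard Chris is proposing.

```python
from html.parser import HTMLParser

# Hypothetical results page. The data-election-* attribute names are invented
# for this sketch; they are not taken from any real standard.
SAMPLE_PAGE = """
<table>
  <tr data-election-candidate="A. Example" data-election-party="Party One"
      data-election-votes="1520"><td>A. Example</td><td>1,520</td></tr>
  <tr data-election-candidate="B. Sample" data-election-party="Party Two"
      data-election-votes="1340"><td>B. Sample</td><td>1,340</td></tr>
  <tr data-election-candidate="C. Placeholder" data-election-party="Party Three"
      data-election-votes="410"><td>C. Placeholder</td><td>410</td></tr>
</table>
"""


class ResultsParser(HTMLParser):
    """Collects one record per element that carries the hypothetical tags."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only elements marked up with a candidate name count as results.
        if "data-election-candidate" in attributes:
            self.results.append({
                "candidate": attributes["data-election-candidate"],
                "party": attributes.get("data-election-party", ""),
                "votes": int(attributes["data-election-votes"]),
            })


def parse_results(html):
    """Return the list of tagged results found in a page of HTML."""
    parser = ResultsParser()
    parser.feed(html)
    return parser.results


def majority(results):
    """Votes separating first place from second: how close the election was."""
    ordered = sorted((r["votes"] for r in results), reverse=True)
    return ordered[0] - ordered[1]


results = parse_results(SAMPLE_PAGE)
print(len(results), "candidates, majority of", majority(results))
```

The point is that the human reader sees the same table as before, while the tags let the same few lines of code work on any council’s page that follows the convention, which is what makes collation across areas and years cheap.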

Until the StraightChoice was set up by idealistic nerds, nobody kept a record of the election materials which are distributed to the public across the country. Anyone could send them in, by simply sending an image, and Julian Todd now has an archive which political librarians would cry for, and it betrays many crimes.

There are the inevitable dodgy graphs, with parties using playfully distorted axes, and even using European and local council election figures where it suited them (a Conservative leaflet in Holborn and St Pancras demotes the Lib Dems from their actual second place to third, and so on). The people behind StraightChoice want a system where copies of every leaflet are formally sent to the website of the Electoral Commission, as with copyright libraries, and regulations which are enforced to forbid graphs which mislead tactical voters.

But beside the evidence of sneakiness, these volunteer projects are also generating data that provides a valuable insight into how politics works, on a par with the kinds of stuff you’d find on UKDA, the UK Data Archive for academics. StraightChoice, for example, has found a huge variation in activity, from a single leaflet in one safe Liverpool seat to 51 in the nearby marginal Liverpool Wavertree.

And what about policies? Francis Irving is one of the founders of mySociety, a charity set up to facilitate public engagement with democracy through nerdy solutions. They built TheyWorkForYou, which tells you more about parliamentary activity than Hansard, using the same dataset, but using it properly. “Wouldn’t it be nice,” he asks, “to have structured data on what the candidates think on a series of local and national issues?”

Neither academics, nor parties, nor the media have achieved this: but 6,000 activists around the country have worked on an incredibly complicated crowdsourcing operation built around DemocracyClub, again set up by two volunteers, Seb Bacon and Tim Green. With the help of mySociety, they populated the YourNextMP database of candidates, itself the baby of another volunteer, Edmund von der Burg. This data is now freely available, right down to its rawest form, a resource for any political theorist or technically capable adolescent.

Data is the fabric of our lives, and everywhere around us: but to be analysed, so it can generate new knowledge and understanding, it must be corralled into one place. In an ideal world, these empty frameworks would be built by national institutions: until they wake up, we have our nerds.