Note: I’ll be on TVO’s The Agenda with Steve Paikin tonight talking about Government 2.0.

Why does open data matter? Rather than talk in abstract terms, let me share a well-documented but little-known story about how open data helped expose one of the biggest tax frauds in Canada’s history.

It begins in early 2007, when a colleague was asked by a client to do an analysis of the charitable sector in Toronto. Considering it a simple consulting project, my colleague called the Canada Revenue Agency (CRA) and asked for all the 2005 T3010s for Toronto – the Annual Information Returns in which charities disclose to the CRA their charitable receipts and other information. After waiting several weeks and answering a few questions, the CRA passed along the requested information.

After spending time cleaning up the data, my colleague eventually had a working Excel spreadsheet and began to analyze the charitable sector in the Greater Toronto Area. One afternoon, on a lark, they decided to sort the charities by size of tax-receipted charitable donations.

At this point it is important to understand something about scale. The United Way of Greater Toronto is one of the biggest charities in North America; indeed, its most recent annual charitable donation drive was the biggest on the continent. In 2008 – the year the financial crisis started – the United Way of Greater Toronto raised $107.5 million.

So it was with some surprise that after sorting the charities by 2005 donation amounts my colleague discovered that the United Way was not first on the list. It wasn’t even second.

It was third.

This was an enormous surprise. Somewhere in Toronto, without anyone being aware of it, two charities had raised more money than the United Way (which in 2005 raised $96.1M). The larger one, the International Charity Association Network (ICAN), raised $248M in 2005. The other, the Choson Kallah Fund of Toronto, had receipts of $120M (up from $6M in 2003).

Indeed, four out of the top 15 charities on the list, including the Millennium Charitable Foundation and Banyan Tree, were unknown to my colleague, someone who had been active in the Toronto charitable community for over a decade.

All told, my colleague estimated that these illegally operating charities alone sheltered roughly half a billion dollars in 2005. Indeed, newspapers later confirmed that in 2007, fraudulent donations were closer to a billion dollars a year, with some 3.2 billion dollars illegally sheltered, a sum that accounts for 12% of all charitable giving in Canada.

Think about this. One billion dollars. A year. That is almost 0.6% of the Federal Government’s annual budget.

My colleague was eager to make sure that the CRA was taking action on these organizations, but it didn’t look that way. The tax frauds were still identified by the CRA as qualified charities and were still soliciting donors with the endorsement of government. My colleague knew that a call to the CRA’s fraud tip line was unlikely to prompt swift action. The Toronto Star had been doing its own investigations into other instances of charity fraud and had been frustrated by the CRA’s slow response.

My colleague took a different route. They gave the information to the leadership of the charitable sector, and those organizations, as a group, took it to the leadership at the CRA. From late 2007 right through 2009 the CRA charities division – now under new leadership – has systematically shut down charity tax shelters and is continuing to do so. One by one, the International Charity Association Network, Banyan Tree Foundation, Choson Kallah Fund, the Millennium Charitable Foundation and others identified by my colleague have lost their charitable status. A reported $3.2 billion in tax receipts claimed by 100,000 Canadian tax filers has so far been disallowed or is being questioned. A class action suit launched by thousands of donors against the organizers and law firm of Banyan Tree Foundation was recently certified. It’s a first. Perhaps the CRA was already investigating these cases. It must build its cases carefully because, if it ends up in court and fails to present its case successfully, it could help legalize a tax loophole. It may just have been moving cautiously. But perhaps it did not know.

This means that, at best, government data – information that should be made more accessible and open in an unfettered and machine-readable format – helped reveal one of the largest tax evasion scandals in the country’s history. But if the CRA was already investigating, scrutiny of this data by the public served a different purpose – helping to bring these issues out into the open and forcing the CRA to take public action (suspending these organizations’ right to solicit more donations) sooner rather than later. Essentially, from before 2005 through 2007, dozens of charities were operating illegally. Had the data about their charitable receipts been available for the public’s routine review, someone in the public might have taken notice and raised a fuss earlier. Perhaps even a website tracking donations might have been launched. This would have exposed those charities that had abnormally large donations with few programs to explain them. Moreover, it might have given some of the 100,000 Canadians now being audited a tool for evaluating the charities they were giving money to.
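To make the idea of routine review concrete, here is a minimal sketch of the check my colleague ran by hand in a spreadsheet: rank charities by receipts and look at how fast they grew. It uses only the figures quoted in this post. The function names are my own invention for illustration and do not correspond to any CRA tool or data format.

```python
# Tax-receipted donations in millions of dollars, as quoted in this post.
# This is illustrative only; the real T3010 data covers far more charities.
RECEIPTS_2005 = {
    "United Way of Greater Toronto": 96.1,
    "Choson Kallah Fund of Toronto": 120.0,
    "International Charity Association Network": 248.0,
}


def rank_by_receipts(receipts):
    """Return (name, amount) pairs sorted from largest to smallest."""
    return sorted(receipts.items(), key=lambda item: item[1], reverse=True)


def growth_multiple(earlier, later):
    """Return how many times larger `later` is than `earlier`."""
    return later / earlier


ranking = rank_by_receipts(RECEIPTS_2005)
for position, (name, amount) in enumerate(ranking, start=1):
    print(f"{position}. {name}: ${amount}M")

# Choson Kallah Fund: $6M in 2003 versus $120M in 2005.
choson_growth = growth_multiple(6.0, 120.0)
print(f"Choson Kallah Fund grew {choson_growth:.0f}x in two years")
```

Nothing here is sophisticated, and that is the point: once the data is downloadable, a sort and a division are enough to surface the anomaly.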

In the computer world there is something called Linus’ Law, which states: “given enough eyeballs, all bugs (problems) are shallow.” The same could be said about many public policy or corruption issues. For many data sets, citizens should not have to make a request. Nor should we have to answer questions about why we want the data. It should be downloadable in its entirety, not trapped behind some unhelpful search engine. When data is made readily available in machine-readable formats, more eyes can look at it. This means that someone on the ground, in the community (like, say, Toronto), who knows the sector is more likely to spot something a public servant in another city might not see because they don’t have the right context or bandwidth. And if that public servant is not allowed to talk about the issue, the person in the community can still share this information with their fellow citizens.

This is the power of open data: the power to find problems in complicated environments, and possibly even to prevent them from emerging.