Illustration by Rob Dobi

How did Canada’s statistical system end up so patchy and uneven? As with many wayward children, it’s a story of nature and nurture.

The first census of the colonial population in what would become Canada was conducted in 1666 by the first intendant of New France, Jean Talon. He went door to door, recording names, ages and occupations.

Even then, government data were teaching us valuable things about our society. The census showed that of the colony’s 3,215 inhabitants of European descent, nearly two-thirds were men, prompting Talon to arrange for young, single women to come from France. (Originals of the 1666 census, ironically, don’t exist in Canada; they were sent back to France and are now housed in a branch of the National Archives.)

A page from the first-ever census of settlers on Canadian soil, conducted by the intendant of New France in 1666. Library and Archives Canada

The Dominion Bureau of Statistics was founded in 1918 with about 100 employees and a narrow remit of running the census and gathering assorted numbers about our resource-heavy economy. Until then, according to the first report by the country’s first Dominion statistician, Canada’s body of statistics had “numerous gaps, often at crucial points” that were causing “embarrassment” to the young country.

But even as the bureau morphed into Statistics Canada, and Canada itself grew into a modern industrial economy, our data-gathering retained a kernel of that small-time, colonial mentality.

In the 1970s and 1980s, Statscan senior analyst Craig McKie was occasionally instructed to destroy information – including such valuable stuff as the last remaining copy of the 1973 data set of highly qualified manpower – to save money on storage costs. (He refused.) “That was the mentality of Statscan: You use data and then you destroy it,” he says.

Statscan’s cavalier approach to preserving data extended to the agency’s most basic functions. Some parts of past censuses and other reports are simply gone because files have gone missing, been accidentally shredded or dumped, or just physically deteriorated. These include bits of the 1961 census – not just a historical relic but a cache of valuable data for researchers looking at long-term trends. “There have been instances where data has been lost – parts of the historic censuses have been lost – and no one quite understands what happened,” says Wayne Smith, Canada’s chief statistician from 2010 to 2016.

A blank schedule from the 1961 census. Many parts of this census record are lost.

Despite these feeble roots, Statscan learned to do many things well. The quality of its census, its ability to link data sets – such as those involving immigrants with hospital records – and its protection of confidentiality are all considered world-class. In the early 1990s, The Economist magazine twice named the agency No. 1 in a global ranking of statistical bodies.

But even during Statscan’s heyday, there was a worm in the rose: Money, or the lack of it. In 1984, as part of a cost-cutting measure, the government of Brian Mulroney announced it would axe the 1986 census. An outcry ensued, and the census was reinstated, but Statscan agreed to use its own money for the project – about $100-million. To fill that budgetary hole, the agency enhanced its cost-recovery program, dramatically increasing prices for data. The practice of charging researchers continues to this day.

“Statscan was expected to basically hunt for food,” says Stephen Gordon, an economics professor at Laval University. “They lost a habit, if they ever had it, of thinking, ‘This is public data for public information for public discourse.’ ”

Stewart McInnes, federal minister responsible for Statistics Canada under the Mulroney government, poses with his daughter at his Halifax home in 1986 to promote the upcoming census. The Canadian Press

By the late 1980s, the data lockdown had gotten so bad that many researchers simply started using U.S. figures. “Our students were analyzing and citing American data, dividing by 10, and hoping they came close” to the situation in Canada, says Wendy Watkins, a Carleton University sociologist who subsequently worked at Statscan as an analyst. “When I got to Statscan, I was absolutely blown away by the amount of data that we [as outside researchers] couldn’t have. It was too expensive.”

To try to fix that, she and a Statscan colleague, Ernie Boyko, the agency’s director of census operations at the time, co-founded the Data Liberation Initiative, aimed at letting postsecondary researchers access troves of data that had been locked away in the agency. It launched in 1996.

But the data wasn’t all liberated, by any means. In fact, a few years later, Statscan, working with Canadian universities and the Social Sciences and Humanities Research Council, launched a system for disclosing data that has become a glaring symbol of everything that’s wrong with Canada’s information regime: Research data centres, which require anyone seeking detailed Statscan numbers to clear a series of bureaucratic hurdles, and which Ms. Watkins calls “little data jails.”

The gaps uncovered so far

The Globe and Mail has uncovered myriad data deficits, culled from dozens of interviews, research reports, government documents, international searches and feedback from our own newsroom. Here’s a list of what we found, which we’ll be adding to as the investigation continues.