Over half of all police killings in 2015 were wrongly classified as not having been the result of interactions with officers, a new Harvard study based on Guardian data has found.

The finding is just the latest to show government databases seriously undercounting the number of people killed by police.

“Right now the data quality is bad and unacceptable,” said lead researcher Justin Feldman. “To effectively address the problem of law enforcement-related deaths, the public needs better data about who is being killed, where, and under what circumstances.”

Feldman used data from the Guardian’s 2015 investigation into police killings, The Counted, and compared it with data from the National Vital Statistics System (NVSS). That dataset, which is kept by the Centers for Disease Control and Prevention (CDC), was found to have misclassified 55.2% of all police killings, with the errors occurring disproportionately in low-income jurisdictions.

“As with any public health outcome or exposure, the only way to understand the magnitude of the problem, and whether it is getting better or worse, requires that data be uniformly, validly, and reliably obtained throughout the US,” said Nancy Krieger, professor of social epidemiology at Harvard’s Chan School of Public Health and senior author of the study. “Our results show our country is falling short of accurately monitoring deaths due to law enforcement and work is needed to remedy this problem.”

NVSS data has been collected since the late 1800s, and today the system is responsible for, among other things, aggregating all annual US deaths. In 1949, it added a category to capture “legal intervention” as a cause of death along with classifications like cancer, heart disease and accidents. Typically these determinations are made by local medical examiners and coroners, reported on death certificates, and submitted to the CDC.

To assess how accurately that classification was being used, the team took the 1,146 police-related deaths recorded by The Counted in 2015, removed 60 cases that did not fit the criteria of the CDC’s “legal intervention” category and requested death certificate data for the remaining 1,086 individuals. They found that a majority, 599 deaths, were classified as resulting from something other than legal intervention – principally “assault”.
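The arithmetic behind the headline figure can be checked from the numbers reported above. The short sketch below is only an illustration, not the researchers' own code, and its variable names are assumptions chosen for readability.

```python
# Figures as reported in the article; variable names are illustrative.
counted_total = 1146   # police-related deaths recorded by The Counted in 2015
excluded = 60          # cases outside the CDC "legal intervention" criteria
misclassified = 599    # deaths coded as something other than legal intervention

# Deaths for which death certificate data was requested.
eligible = counted_total - excluded

# Share of eligible deaths that were misclassified, as a percentage.
misclassification_rate = 100 * misclassified / eligible

print(f"Eligible deaths: {eligible}")
print(f"Misclassified: {misclassification_rate:.1f}%")
```

Run as written, this reproduces the 1,086 eligible deaths and the 55.2% misclassification rate cited in the study.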

Researchers found the accuracy varied wildly by state, with just 17.6% misclassification in Washington, but a startling 100% in Oklahoma.

“More than 30 people were killed by police there [in Oklahoma] in 2015 and none of them were counted on death certificates,” Feldman said.

According to the report, there were 36 cases of “legal intervention” captured in the NVSS which were not included in The Counted.

“We hope that this paper is a call to action to improve public health reporting, whether that’s following a method like the Guardian did by integrating media sources better, or by changing the policy around requiring clinicians [medical examiners and coroners] to report these deaths,” Feldman said.

Feldman also noted that this problem was specific to law enforcement. “Evidence suggests that the accuracy of mortality classification for homicide – an outcome similar to law-enforcement-related mortality … is very high,” the report reads. One 2014 study cited puts the figure at 99%.

In 2015 the Guardian launched The Counted, an interactive, crowdsourced database attempting to track police killings throughout the US. The project was intended to help remedy the lack of reliable data on police killings, a lack that became especially visible after the 2014 unrest in Ferguson put policing in the national spotlight.

Other federal databases, including the Bureau of Justice Statistics’ (BJS) arrest-related death count and the FBI’s supplementary homicide reports, were similarly criticised for severely undercounting police-related deaths. Both programs have been dramatically reworked since The Counted and similar media and open-source databases forced officials such as the former FBI director James Comey to admit that newspapers had more accurate data than the government on police violence.