We still don’t know how many people the police kill in the U.S. annually. But we’re getting closer.

In the past two months, the president and his attorney general have said they need better data on the number of people killed by police — a number that nobody knows and that no government agency can agree on. On Tuesday, the Bureau of Justice Statistics released a report that at first glance looks like it fills the gap. It found that more than a quarter of killings by police are not included in either of two federal databases. The true number of annual police killings, according to the report, is likely around 930 — about twice that of each of the two other U.S. government counts. It’s even higher — about 1,240 — if you assume that local law enforcement agencies that don’t report any killings have killed people at the same rate as agencies that do.

It’d be tidy if that were the final word — but it isn’t. On Wednesday, researchers who specialize in estimating unreported violent deaths issued a critique of the report. Based on their team’s experience in Syria, Guatemala and other conflict zones, they argued that the true number of killings is probably even higher than the new, higher range of estimates by BJS.

This is a math puzzle with real consequences. Solving it would get researchers closer to understanding how many lives have been lost — and how many victims we’re not yet counting.

Patrick Ball, co-author of the critique and executive director of the Human Rights Data Analysis Group, said it’s not the BJS’s fault that it underestimated the number. That it estimated it at all was an important step, he said. Government agencies don’t all audit their own data or undertake the difficult task of matching records that often have incomplete or missing information. So they should get some credit for that.

But to make their estimates, the report’s authors had to make some simplifying assumptions, ones that Ball said likely led to an undercount. “For sure, the estimates are too low,” Ball said in a video chat Wednesday.

Lance Couzens, a co-author of the report and a research statistician at RTI, acknowledged this and other limitations, as did the report. Couzens said in an email that the number of people the report estimates were killed by police “is probably artificially deflated.” The results, he wrote, “should not be interpreted as providing an accurate estimate.”

Here’s what the BJS tried to do and why it might not have worked well enough. It merged two databases of killings by police — its own Arrest-Related Deaths database and the Supplementary Homicide Reports maintained by the Federal Bureau of Investigation. It estimated the number of cases that were repeated in both databases. This step was important for two reasons. First, it wanted to make sure it didn’t count any cases twice. Second, the number of duplicate cases can be used to estimate their mirror image: the cases that were in neither database — the gaps in the data.

The trouble was that the report’s authors had to make some assumptions to estimate the gaps. A big assumption was that the two data sets were independent: that a victim’s being in one database doesn’t make him or her any more or less likely to be in the other. (We’ll get into more detail on why that matters later.) But the researchers had no way to know whether that was true. Ball said it probably wasn’t. That’s been his experience when merging databases of killings. It’s a problem that can be minimized with three or more databases. But the BJS report’s authors used only two. That, Ball said, led them to arrive at too small a number. And there’s no good way to know how far off they were.

To understand what might have gone wrong, let’s take a close look at the method the researchers used to estimate the gaps in the police killings data. The method is called capture-recapture, and it is used to count groups where traditional methods fail or are too difficult.

Here’s an example of how that would work to count wildlife, a scenario in which capture-recapture was originally used: Researchers capture a random sample of animals from a closed population — say, fish in a lake. They tag them in some way that doesn’t affect their behavior or health. Then they release the animals into the population. At some later date, the researchers collect a different random sample of animals from the same population and calculate what percentage of them are tagged. Researchers can then use two numbers — the number of animals they initially tagged and the percentage of recaptured animals that were tagged — to estimate the overall population.
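The arithmetic behind that last step can be sketched in a few lines of Python. This is a minimal illustration of the simplest two-sample version of the method (often called the Lincoln-Petersen estimator), not the procedure of any particular study. The function name and the fish counts are made up for the example.

```python
def estimate_population(tagged_first_round, tagged_share_in_second_round):
    """Estimate a total population from a tag-and-recapture experiment.

    If the second sample is random, the share of tagged animals in it
    should roughly match the share of tagged animals in the whole lake:
        tagged_first_round / population ~= tagged_share_in_second_round
    """
    if not 0 < tagged_share_in_second_round <= 1:
        raise ValueError("tagged share must be above 0 and at most 1")
    return tagged_first_round / tagged_share_in_second_round


# Hypothetical numbers: 100 fish tagged, 25% of the second catch is tagged.
print(estimate_population(100, 0.25))  # prints 400.0
```

The estimate is only as good as the assumptions in the paragraph above: a closed population, harmless tags and a truly random second sample.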

The human world is messier. Researchers generally don’t tag people, and even if they did, people don’t distribute themselves randomly and don’t stay in a confined area. So researchers use messier techniques, and more data, to apply the principles of the capture-recapture method to humanity.

One way to do that is to compare more than one database. Suppose that the name, age and hometown of each person killed by police in the U.S. last year were written on separate pieces of paper and then each piece of paper was folded up and placed in the middle of a small ball. Then all the balls, representing all the people killed by police, were placed in a large barrel. Suppose we didn’t have the time and resources to count all the balls. So we try another way. First, we send an FBI analyst to the barrel. She collects 10 of the balls at random, removes the pieces of paper and makes a list with each victim’s information. Then she puts the balls back in the barrel and stirs them in. Now a BJS analyst approaches. He removes 10 balls randomly and takes down the information of all those victims before returning the balls.

Now the two analysts compare lists. If they have lots of overlap, there probably weren’t very many balls in the barrel to begin with. If each analyst’s list of victims is identical, there probably were only 10 victims. If five victims are on both analysts’ lists, there probably were about 20 victims. If only one victim appears on both lists, there are about 100 victims. And if there is no overlap — if each analyst’s list is entirely different from the other’s — then we can’t say much about the number of victims, although because of the way the math works out, there are probably at least 200.
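Those figures follow from one line of arithmetic: multiply the sizes of the two lists and divide by the overlap. The sketch below, with a hypothetical function name, reproduces the first three cases and also reports how many victims would be on neither list. It assumes both lists were drawn at random, and it returns no estimate when the overlap is zero, because the simple formula would then divide by zero.

```python
def two_list_estimate(list_a_size, list_b_size, overlap):
    """Return (estimated_total, estimated_missing) for two random lists.

    estimated_total   = list_a_size * list_b_size / overlap
    estimated_missing = estimated_total - victims seen on at least one list
    Returns (None, None) when there is no overlap, since the simple
    formula cannot produce a number in that case.
    """
    if overlap == 0:
        return None, None
    seen_at_least_once = list_a_size + list_b_size - overlap
    estimated_total = list_a_size * list_b_size / overlap
    return estimated_total, estimated_total - seen_at_least_once


# The barrel example: each analyst lists 10 victims.
for overlap in (10, 5, 1, 0):
    total, missing = two_list_estimate(10, 10, overlap)
    print(overlap, total, missing)
```

With an overlap of one, for instance, the two analysts have seen 19 distinct victims between them, so the estimate of 100 implies 81 victims on neither list.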

That’s about as far as the BJS report took this method. But, Ball said, it’s not far enough. That’s because, to stick with our strained metaphor, it’s unlikely that each analyst is truly picking balls randomly. Not every ball has the same chance of being picked. The biggest balls, the ones most likely to be picked by analysts, might represent victims killed in states with the best reporting systems or victims whose deaths made national headlines. These balls are more likely to be on both analysts’ lists. The lists, then, will have more overlap than they would if they’d been created randomly. And that greater overlap would lead the analysts to think the overall number of balls in the barrel is smaller than the true number, the one they would have arrived at if they’d counted all the balls, even the small ones buried underneath and between the big ones.
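A rough calculation shows how large that effect can be. The sketch below is an illustration under invented assumptions, not a model of the real databases: a barrel of 1,000 balls, of which 200 are "big" (each analyst has an 80 percent chance of picking any one of them) and 800 are "small" (a 20 percent chance). It plugs the expected list sizes and expected overlap into the two-list formula.

```python
def expected_two_list_estimate(groups):
    """groups: list of (number_of_balls, chance_each_analyst_picks_one).

    Each analyst picks independently of the other, but both share the
    same preference for big balls. Returns the estimate the overlap
    formula gives when every count takes its expected value.
    """
    expected_list_size = sum(n * p for n, p in groups)
    expected_overlap = sum(n * p * p for n, p in groups)
    return expected_list_size ** 2 / expected_overlap


# Hypothetical barrel of 1,000 balls.
uneven = [(200, 0.8), (800, 0.2)]  # 200 big balls, 800 small ones
even = [(1000, 0.32)]              # same expected list size, equal chances

print(round(expected_two_list_estimate(uneven)))  # prints 640
print(round(expected_two_list_estimate(even)))    # prints 1000
```

In this made-up barrel, the shared preference for big balls is enough to cut the estimate by more than a third, even though neither analyst ever looks at the other’s list.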

Even in balls and barrels, researching humanity is messy. But there’s a way to make it tidier. Send a third analyst — this one, perhaps, from the Centers for Disease Control and Prevention. She picks out 10 balls, too. She’ll also be more likely to pick the large balls. That’s OK — we can use that information to get a better read on how many small balls we’re missing. And if there’s time, send a fourth and a fifth analyst, too, to collect their own balls and create their own victim databases. No finite number of analysts and databases can get us to a perfect count, but each one will get us closer. Imperfect is OK: Outside of sporting arenas and banks, few aspects of human life and death are counted perfectly. But having more than two analysts can make our ball count a lot less imperfect.

In theory, the BJS and FBI databases could be cross-referenced with a third collection, such as the independently operated KilledByPolice.net database. It compiles killings reported in the mainstream media and counts roughly 1,000 killings annually. Other government databases also include police killings, such as one from the CDC. These could be our third and fourth analysts, pulling balls out of the barrel to get us closer to the truth.

This process of adding more and more counts is called multiple systems estimation, and it’s one that Ball uses regularly — usually insisting on three or more separate counts. His research group used five databases to estimate deaths in Syria last summer.
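The raw material for that kind of estimate is a table showing how many records turn up on each combination of lists; statistical models are then fit to the table to estimate the one cell nobody can observe, the records on no list at all. The sketch below covers only the tabulation step, not the model fitting. The record IDs are invented, and the function name is hypothetical.

```python
from collections import Counter


def capture_histories(lists_by_source):
    """Count records by the exact combination of lists they appear on.

    lists_by_source maps a source name to a set of record IDs.
    Returns a Counter keyed by a tuple of source names, sorted
    alphabetically, e.g. ("BJS", "FBI") for records on those two only.
    """
    sources = sorted(lists_by_source)
    all_ids = set().union(*lists_by_source.values())
    histories = Counter()
    for record_id in all_ids:
        seen_on = tuple(s for s in sources if record_id in lists_by_source[s])
        histories[seen_on] += 1
    return histories


# Hypothetical record IDs; real data would need matched victim records.
example = {
    "BJS": {1, 2, 3, 4, 5},
    "FBI": {3, 4, 5, 6},
    "CDC": {5, 6, 7},
}
for combo, count in sorted(capture_histories(example).items()):
    print(combo, count)
```

With two lists the table has only three observable cells; with three lists it has seven, which is what gives researchers room to check whether the lists really are independent.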

But comparing separate counts requires counting duplicates, which requires having as much detail as possible on the individual death reports from BJS and FBI, including victims’ names and identifying information. That’s because other databases from outside the Justice Department use different sources and formats. Yet neither agency publishes that level of detail on police victims.
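To see why the details matter, consider a bare-bones way to spot duplicates: build a key from each record’s name, age and hometown, tidy up capitalization and spacing, and look for keys that appear on both lists. The sketch below does only that. The field names are assumptions, the records are placeholders rather than real people, and real matching would also have to cope with misspellings and missing fields.

```python
def match_key(record):
    """Build a crude matching key from name, age and hometown."""
    name = " ".join(record["name"].lower().split())
    town = " ".join(record["hometown"].lower().split())
    return (name, record["age"], town)


def find_duplicates(list_a, list_b):
    """Return the keys of records that appear on both lists."""
    keys_a = {match_key(r) for r in list_a}
    keys_b = {match_key(r) for r in list_b}
    return keys_a & keys_b


# Placeholder records, not real people.
source_a = [
    {"name": "Person  One", "age": 30, "hometown": "Springfield"},
    {"name": "Person Two", "age": 41, "hometown": "Riverton"},
]
source_b = [
    {"name": "person one", "age": 30, "hometown": "springfield "},
    {"name": "Person Three", "age": 25, "hometown": "Lakeside"},
]
print(len(find_duplicates(source_a, source_b)))  # prints 1
```

Strip out the names and hometowns, and there is nothing left to build the key from.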

Privacy concerns are one reason for that. Michael Planty, chief of victimization stats for the BJS, said the agency is grappling with how to release victims’ information without violating their families’ privacy, along with other legal and logistical issues. “These are the issues we’re discussing with our data stewards and general counsel,” Planty said. “It may be that we can and will release identities, but this is something we’re working through and haven’t done in the past.” FBI spokesman Stephen G. Fischer, meanwhile, said that it doesn’t collect “personally identifiable information.”

Ball said he thinks the real reason agencies don’t release the data is to protect police officers and departments. “The only privacy protected here, I think, is going to be that of the police officers,” he said.

Planty said protecting police privacy is “not necessarily the primary” concern for his agency. Some localities, he pointed out, have laws that prevent disclosure of the identity of an officer involved in a killing. The bigger concern, he said, is that police departments might be less likely to submit reports of killings by their officers “if there is a chance of disclosure.”

Even before the report came out, the BJS had suspended its police killings count and begun work to improve it. Now the agency knows the “best case scenario,” in the words of the report, is that it was missing half of the deaths.

The FBI, meanwhile, has disclosed no plans to change its methods for counting police killings, even in light of the recent report. A spokesman declined to comment on that when asked by the Guardian and by me this week.

“We caveat this data — we have been for decades, cautioning individuals and organizations from drawing conclusions from it, because we recognize it is incomplete data, it is disparate data that leaves too many holes and gaps,” Stephen L. Morris, assistant director of the FBI’s criminal justice information services division, told the Guardian last month. But the newspaper pointed out that the FBI webpage containing counts of “justifiable homicide” by law enforcement didn’t include any mention of “holes and gaps” or other caveats. (It does have general caveats about its statistics elsewhere on its website.)

I asked Fischer if he had any response to the point about offering proper caveats about the data. He said, “No further comment at this time.”

While the BJS report’s new estimate of annual killings by police, 928, has already made headlines, the real significance of the report is in telling us what we, including the president and the attorney general, still don’t know. We don’t know how many people are killed by law enforcement agencies that don’t report to the databases. We don’t know how many victims aren’t in any of the databases. We don’t know the number of people who die in police custody for other reasons, such as accidents.

And those are just the missing numbers. We also don’t know names. Who are the victims in the databases? Knowing would help us understand who is being missed and why. Without better information on these numbers and names, it’s hard to know how best to go about reducing the number of people killed by police in the future.