A database system that Indiana will now use to automatically purge voter registrations that have duplicates in other states would remove a legitimate voter in more than 99 percent of cases, according to a paper published last week by researchers from Stanford University, the University of Pennsylvania, Harvard, Yale, and Microsoft Research. Using the probability of matching birth dates for people with common first, middle, and last names and an audit of poll books from the 2012 US presidential election, the researchers concluded that the system would de-register "about 300 registrations used to cast a seemingly legitimate vote for every one registration used to cast a double vote."
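
A rough sketch shows why a shared name and birth date is weak evidence on its own. It is not the researchers' model: it assumes, purely for illustration, that birth dates are independent and spread uniformly over a fixed number of possible dates, and the group size and 60-year span are hypothetical inputs. The second function only converts the 300-to-1 ratio quoted above into a share.

```python
from math import comb


def expected_coincidental_pairs(same_name_count: int, possible_birth_dates: int) -> float:
    """Expected number of pairs of distinct people who share a birth date.

    Assumption (not from the paper): birth dates are independent and
    uniformly distributed over `possible_birth_dates` days.
    """
    # Every pair of people matches with probability 1 / possible_birth_dates.
    return comb(same_name_count, 2) / possible_birth_dates


def legitimate_share(legitimate_per_double_vote: float) -> float:
    """Share of purged registrations that were legitimate, given an N-to-1 ratio."""
    return legitimate_per_double_vote / (legitimate_per_double_vote + 1)


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 voters with the same full name,
    # born across a 60-year span (leap days ignored).
    pairs = expected_coincidental_pairs(1000, 60 * 365)
    print(f"Expected coincidental name-and-birth-date pairs: {pairs:.1f}")

    # The 300-to-1 ratio is the figure quoted from the paper.
    print(f"Share of purged registrations that were legitimate: {legitimate_share(300):.2%}")
```

Under these assumptions, a group of that size would be expected to contain roughly two dozen pairs of different people with identical names and birth dates, without any double registration at all.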

The Interstate Voter Registration Crosscheck Program is a system administered by the office of Kansas Secretary of State Kris Kobach—the vice-chair of President Donald Trump's Presidential Advisory Commission on Election Integrity. Crosscheck uses voter roll data from 27 states—pulled every January by election officials and uploaded to an FTP site—to check for duplicate records across states, based on full name and date of birth, as well as the last four digits of social security numbers where that data is collected by voter registration (which is not consistent from state to state).
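
The matching described above can be sketched roughly as follows. This is an illustration, not Crosscheck's actual code: the `Registration` record, its field names, and the `flag_possible_duplicates` helper are all hypothetical, and treating a missing SSN4 as "unknown" rather than as a mismatch is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Registration:
    # Hypothetical record layout; real state voter files differ.
    state: str
    first: str
    middle: str
    last: str
    dob: str  # ISO date, e.g. "1970-01-01"
    ssn4: Optional[str] = None  # not collected in every state


def match_key(reg: Registration) -> Tuple[str, str, str, str]:
    """Full name (case-insensitive) plus date of birth."""
    return (reg.first.lower(), reg.middle.lower(), reg.last.lower(), reg.dob)


def flag_possible_duplicates(
    regs: List[Registration],
) -> List[Tuple[Registration, Registration, Optional[bool]]]:
    """Return cross-state pairs that share a name and birth date.

    The third element is True/False when both records carry an SSN4,
    and None when at least one does not (assumed handling).
    """
    buckets = defaultdict(list)
    for reg in regs:
        buckets[match_key(reg)].append(reg)

    flagged = []
    for group in buckets.values():
        for a, b in combinations(group, 2):
            if a.state == b.state:
                continue  # only cross-state pairs are of interest here
            both_known = a.ssn4 is not None and b.ssn4 is not None
            ssn4_agrees = (a.ssn4 == b.ssn4) if both_known else None
            flagged.append((a, b, ssn4_agrees))
    return flagged


if __name__ == "__main__":
    sample = [
        Registration("IN", "John", "A", "Smith", "1970-01-01", "1234"),
        Registration("KS", "John", "A", "Smith", "1970-01-01", "9999"),
        Registration("IA", "John", "A", "Smith", "1970-01-01"),
    ]
    for a, b, agrees in flag_possible_duplicates(sample):
        print(a.state, b.state, agrees)
```

In the sample, all three pairs are flagged on name and birth date, but only the Indiana and Kansas pair can be ruled out by SSN4; the pairs involving the Iowa record stay ambiguous because that field is missing.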

Indiana has used Crosscheck as an advisory system for a number of years but not to automatically purge voters. A law passed in July now allows county election officials in Indiana to de-register voters when a duplicate registration is detected. The problem is that the state-to-state variation in the data Crosscheck matches on leaves room for massive error, as Sharad Goel and Houshmand Shirani-Mehr of Stanford University, Marc Meredith of the University of Pennsylvania, Michael Morse of Harvard University and Yale Law School, and David Rothschild of Microsoft Research found.

"Using data provided to Iowa in 2012," the researchers wrote, "we identified 1,483 pairings with complete SSN4 information in which both registration records were used to vote in 2012. In more than 99.5 percent of these pairings, the flagged registrations had different SSN4s, supporting our intuition that our model estimates an upper bound on the number of double votes cast in 2012." The researchers estimated that a maximum of 0.02 percent of votes cast were duplicate votes.

Even that upper bound is much higher than the actual rate of fraudulent voting incidents as measured by other researchers. Justin Levitt, a professor at Loyola Law School in Los Angeles, counted 31 incidents between 2000 and 2014 in which voter fraud was alleged. A long-term study on voter fraud by NYU Law School's Brennan Center for Justice found the rate of fraud was between 0.0003 percent and 0.0025 percent.

Even the Crosscheck program's own documentation states that duplicate votes detected may not be actual double votes because of voter check-in errors. Poll books are maintained by polling place volunteers—often with a minimal amount of training—and they or the voters themselves may make data entry errors. A 2014 "participation guide" for Crosscheck warns:

"Experience in the Crosscheck program indicates that a significant number of apparent double votes are false positives and not double votes. Many are the result of errors—voters sign the wrong line in the poll book, election clerks scan the wrong line with a barcode scanner, or there is confusion over father/son voters (Sr. and Jr.)."

Indiana Secretary of State Connie Lawson—also a member of the commission—acknowledged issues with Crosscheck's accuracy even as she testified before the US Congress' House Administration Committee on October 25 about Indiana's use of the program. Mother Jones' Pema Levy reported Lawson's testimony that Crosscheck was highly error-prone and that her office had to layer additional software on top of Crosscheck because it generates too many false positives.

"As it regards the Crosscheck program, we developed a software program where we have a confidence level," Lawson said. "We worked with a statistician who told us that the way we were doing our work, that I would have a better chance of winning the lottery than the counties removing the wrong person." In addition to the data used in Crosscheck, Indiana uses driver's license numbers where they are available.

But in cases where someone may have moved in from out of state in a previous year—from a state where driver's license data or SSN4 data is not used for voter registration—it's difficult to calculate how using that data will help reduce errors. Last Friday, Common Cause and the American Civil Liberties Union filed a lawsuit to stop implementation of the Indiana law, the Daily Beast reports.